After improving our tools for processing Danish, we are releasing a new version of the Danish corpus from the web. The texts in this 2-billion-word corpus were downloaded in December 2017.

Compared with the previous version, this one uses an improved algorithm for removing text in similar foreign languages, such as Norwegian (Bokmål or Nynorsk), Swedish, English or German.
