Let us introduce our newest addition to the corpora list: German Web 2023. This corpus includes 16.6 billion words collected from the web, providing a comprehensive representation of the contemporary German language.
sketchengine.eu/detenten-germa
The Belarusian Web 2020 corpus, with 50+ million words and genre & topic classification, is now available in Sketch Engine! https://t.co/3EUtU1zFsq #corpuslinguistics #TextAnalysis pic.twitter.com/ZpviHqtyQ0
— Sketch Engine (@SketchEngine) February 27, 2025
Our TenTen corpora list now includes the Norwegian Web 2023 (#bokmål)! Bokmål is the most used written form of Norwegian today. The corpus is ready to use at https://t.co/GthEQfQtCB #corpuslinguistics #digitalhumanities pic.twitter.com/F9zVOPyX8j
— Sketch Engine (@SketchEngine) February 19, 2025
We don’t want to omit the other Norwegian written standard – Nynorsk! Therefore, the Norwegian Web 2023 (Nynorsk) corpus with 150+ million words is also available in Sketch Engine. 👉 https://t.co/GthEQfQtCB #corpuslinguistics #digitalhumanities pic.twitter.com/hPWhKGVsQN
— Sketch Engine (@SketchEngine) February 24, 2025