ruSKELL: Russian Corpus for SKELL
Russian Corpus for SKELL is a Russian corpus built specifically for the Russian SKELL interface (ruSKELL), available at skell.sketchengine.eu. The corpus does not contain whole documents, only individual sentences sorted by their text quality score. This score was computed by the GDEX system.
Almost all of the texts in this corpus (99.8 %) come from the Russian top-level domain .ru. The most frequent web domains are kontrolnaja.ru, news.yandex.ru, alterauto.ru, pressarchive.ru and com.sibpress.ru, which together cover just 0.09 % of all corpus documents.
These sources provide a good example of how Russian is used in everyday, standard, formal and professional contexts. The corpus contains almost 1 billion words in more than 68 million sentences.