ruSkELL: Russian Corpus for SkELL

Russian Corpus for SkELL is a Russian corpus built specifically for the Russian SkELL interface (ruSkELL). The corpus does not contain whole documents, only individual sentences, which are sorted by a text-quality score computed by the GDEX system.
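The idea of a sentence-level corpus ordered by quality can be illustrated with a minimal sketch. The scoring function below is a toy placeholder and not the GDEX algorithm; the names `ScoredSentence`, `toy_score` and `rank_sentences` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class ScoredSentence:
    text: str
    score: float  # hypothetical quality score; higher is better


def toy_score(sentence: str) -> float:
    """Placeholder heuristic, NOT the GDEX algorithm.

    Prefers sentences of 5 to 20 words that end in sentence-final punctuation.
    """
    words = sentence.split()
    length_factor = 1.0 if 5 <= len(words) <= 20 else 0.5
    ending_factor = 1.0 if sentence.rstrip().endswith((".", "!", "?")) else 0.5
    return length_factor * ending_factor


def rank_sentences(sentences: list[str]) -> list[ScoredSentence]:
    """Score each sentence and return them sorted from best to worst."""
    scored = [ScoredSentence(s, toy_score(s)) for s in sentences]
    return sorted(scored, key=lambda item: item.score, reverse=True)


sample = [
    "Привет",                                      # too short, no final punctuation
    "Я читаю интересную книгу о русском языке.",   # complete sentence
    "Он пришёл домой поздно вечером и",            # truncated, no final punctuation
]
ranked = rank_sentences(sample)
for item in ranked:
    print(f"{item.score:.2f}  {item.text}")
```

In this sketch the complete sentence is ranked first and the one-word fragment last, which mirrors the general principle of showing learners the most suitable example sentences before the less suitable ones.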

Nearly all of the texts in this corpus (99.8 %) come from the Russian top-level domain .ru. The most frequent web domains are kontrolnaja.ru, news.yandex.ru, alterauto.ru, pressarchive.ru and com.sibpress.ru, which cover just 0.09 % of all corpus documents.

These sources provide a good example of how Russian is used in everyday, standard, formal and professional contexts, with almost 1 billion words in more than 68 million sentences.

What is SkELL?

SkELL (Sketch Engine for Language Learning) is a corpus-based interface which helps learners of Russian.

Availability

The corpus is accessible to all users with a subscription plan and to site licence members, but not to trial users.

VERSION   DESCRIPTION
1.0       initial version
1.1       improved word sketch grammar
1.3       improved word sketch grammar and renamed relations to Russian

Valentina, A., Vitalevna, B. O., Малолетняя, А. П., Olga, K., & Vit, B. (2016). RuSkELL: Online Language Learning Tool for Russian Language. In Proceedings of the XVII EURALEX International Congress. Lexicography and Linguistic Diversity (6–10 September 2016) (pp. 292-300). Ivane Javakhishvili Tbilisi State University.

Other Russian corpora

Explore ruTenTen, our largest Russian corpus, with 14+ billion words.
