Unsupported language | Sketch Engine

Sketch Engine can handle texts in any language including languages which are not supported explicitly. The number of available features depends on the script the language uses. Scripts divide into whitespace scripts and non-whitespace ones.

A whitespace script separates words with a space, a paragraph break or a similar character that appears as white space on the screen or in print. Typical examples are languages written in Latin, Cyrillic or Arabic scripts. Many scripts of India also belong to this category.

A non-whitespace script does not use whitespace to separate words. Typical examples are the scripts used for Chinese, Japanese and Thai. Texts in these scripts that have been transliterated into a whitespace script can make use of the same functionality as whitespace scripts.
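The difference between the two kinds of script can be illustrated with a minimal Python sketch. It is not Sketch Engine's universal tokenizer; the function names `whitespace_tokens` and `character_units` are hypothetical and only show why word-level features are available for one kind of script and only character-level search for the other.

```python
def whitespace_tokens(text: str) -> list[str]:
    """Split a text on runs of whitespace (spaces, tabs, line breaks).

    Illustrative only: punctuation stays attached to the neighbouring word,
    which a real tokenizer would normally separate.
    """
    return text.split()


def character_units(text: str) -> list[str]:
    """Treat every non-whitespace character as a separate searchable unit.

    Illustrative fallback for scripts without word separators, where no
    word boundaries can be found without a language-specific tool.
    """
    return [ch for ch in text if not ch.isspace()]


# A whitespace script yields word-like units.
words = whitespace_tokens("Dobrý den, světe")

# A non-whitespace script yields one unit for the whole string when split on
# whitespace, so only its individual characters can be addressed.
undivided = whitespace_tokens("我喜欢语料库")
characters = character_units("我喜欢语料库")
```

Splitting the Chinese string on whitespace returns the whole sentence as a single unit, which is why word lists, collocations and similar word-based features need externally tokenized input for such scripts.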

Available features

	whitespace script	non-whitespace script
automatic tokenization	YES, with a universal tokenizer	NO *)
automatic lemmatization	NO *)	NO *)
automatic POS tagging	NO *)	NO *)
concordance search	YES at word level or character level, regex allowed; NO *) for lemma search or POS search	YES, but only at character level: a concordance can be generated for a string of characters (regex allowed); no other searches are available *)
collocations	YES, can be calculated from a concordance or via word sketches	NO *)
word lists	YES	NO *)
Word Sketch	YES, a universal word sketch grammar will be used; users can write their own word sketch grammar to suit their needs	NO *)
thesaurus	YES	NO *)
Create corpus from the web	YES **)	YES **)

*) If the user uploads a vertical text which is already tokenized, lemmatized and POS-tagged using external tools, all features will be supported.
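As a rough illustration of what such pre-processed input could look like, the sketch below writes tokens in a vertical layout: one token per line, with its attributes separated by tabs and structures such as documents and sentences marked by XML-like tags. The function name `to_vertical`, the structure names `doc` and `s`, and the column order (word, tag, lemma) are assumptions made for this example; the actual columns must match the configuration of the corpus.

```python
def to_vertical(sentences, doc_id="doc1"):
    """Render annotated sentences as vertical text.

    `sentences` is a list of sentences; each sentence is a list of
    (word, tag, lemma) tuples produced by an external tool.
    The column order is an assumption for this sketch.
    """
    lines = [f'<doc id="{doc_id}">']
    for sentence in sentences:
        lines.append("<s>")
        for word, tag, lemma in sentence:
            # One token per line, attributes separated by a tab.
            lines.append(f"{word}\t{tag}\t{lemma}")
        lines.append("</s>")
    lines.append("</doc>")
    return "\n".join(lines) + "\n"


# Hypothetical annotation of a two-token sentence.
example = to_vertical([[("cats", "NNS", "cat"), ("sleep", "VBP", "sleep")]])
```

The annotations themselves (tokens, lemmas and tags) have to come from an external tool for the language in question; the sketch only covers the final formatting step.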

**) More information about creating a corpus in an unsupported language is available on the corpus building page.

Is Sketch Engine suitable for my language?

The best way to find out is to set up a free trial account, upload your texts and try out how searching your corpus works. Please get in touch if you have any questions related to unsupported languages. Our support team will do their best to accommodate your requirements.
