Japanese part-of-speech tagset

A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels that indicate the part of speech, and sometimes other grammatical categories (case, tense, etc.), of each token in a text corpus.
When you create a user corpus, the recommended tagset is preselected. Choosing a different tagset is recommended only for advanced users. The tagset of a preloaded corpus cannot normally be changed.
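
As a rough illustration of what "one label per token" means, the sketch below stores a short tagged Japanese sentence as (token, tag) pairs and queries it. The tag names (NOUN, PARTICLE, VERB, AUX) and the helper functions are hypothetical simplifications chosen for readability; they are not the actual MeCab or ChaSen tags.

```python
from collections import Counter

# Hypothetical, simplified labels for illustration only;
# these are NOT the actual MeCab or ChaSen tags.
tagged_sentence = [
    ("猫", "NOUN"),
    ("が", "PARTICLE"),
    ("魚", "NOUN"),
    ("を", "PARTICLE"),
    ("食べ", "VERB"),
    ("た", "AUX"),
]


def tag_counts(tokens):
    """Count how often each POS tag occurs in a list of (token, tag) pairs."""
    return Counter(tag for _, tag in tokens)


def tokens_with_tag(tokens, tag):
    """Return the tokens that carry the given POS tag, in sentence order."""
    return [tok for tok, t in tokens if t == tag]
```

Calling tokens_with_tag(tagged_sentence, "PARTICLE") on this toy data would pick out the two particles. A real tagset is usually more fine-grained than these four labels.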

Because Word Sketches, the thesaurus, term extraction and trends all rely on POS tagging, their respective settings (e.g. the Word Sketch grammar or the term grammar) must be based on the same tagset as the one used in the corpus.
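
The reason for this requirement can be sketched in a few lines: a grammar written for one tagset refers to tags that simply do not occur in a corpus tagged with another, so its rules match nothing. The function and all tag names below are hypothetical and serve only to show the idea; this is not how Sketch Engine itself validates grammars.

```python
def unknown_tags(grammar_tags, corpus_tagset):
    """Return the tags referenced by a grammar that are absent from the corpus tagset.

    Both arguments are plain collections of tag strings (hypothetical names).
    An empty result means every tag the grammar uses exists in the corpus.
    """
    return sorted(set(grammar_tags) - set(corpus_tagset))


# Hypothetical tag names, for illustration only.
corpus_tagset = {"NOUN", "VERB", "PARTICLE", "AUX"}
matching_grammar = ["NOUN", "VERB"]
mismatched_grammar = ["NOUN", "N.common", "V.main"]
```

With this toy data, the matching grammar yields no unknown tags, while the mismatched one reports the two tags that the corpus never uses.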

Japanese corpora in Sketch Engine can have these tagsets:

(to check which tagset a corpus uses, go to the Corpus Statistics and details page)

Japanese MeCab part-of-speech tagset
Japanese ChaSen part-of-speech tagset – English translation
