A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
This page describes the Russian tagset from version 4 of the multilingual MULTEXT-East specifications, which is used in Russian corpora tagged by RFTagger.
These specifications follow the (draft) Version 4 of the multilingual MULTEXT-East specifications, which can be found at http://nl.ijs.si/ME.
The basic idea is that for each major category (Noun, Verb, Adjective, etc.) the specifications define a fixed set of attributes (Case, Number, Gender, Animacy, etc.), each with its own set of values (e.g. masculine, feminine, neuter). Each category-dependent attribute is assigned a position, and each of its values a one-letter code, so a complete morphosyntactic description of a word can be encoded as a MorphoSyntactic Description (MSD). For instance, the attribute-value specification Category = Noun, Type = common, Gender = masculine, Number = singular, Case = accusative, Animate = no corresponds to the MSD Ncmsan. If an attribute does not apply to a given combination of features or to a particular lexical item, its position is filled with a hyphen, e.g. Afpns-s, where Case is undefined because the adjective (qualificative, positive, neuter, singular) is in the short form.
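The positional encoding described above can be sketched in a few lines of Python. This is only an illustration, not part of the MULTEXT-East tooling: the names `SPEC` and `encode_msd` are hypothetical, the code tables are copied from the Noun and Adjective tables on this page, and the noun's Case2 position is left out on the assumption that it is omitted when unused, as in the example Ncmsan.

```python
# Hypothetical helper: attribute tables copied from the Noun and Adjective
# tables on this page. Noun Case2 (position 6) is omitted here because the
# example MSD "Ncmsan" stops at Animate.
CASES = {"nominative": "n", "genitive": "g", "dative": "d",
         "accusative": "a", "locative": "l", "instrumental": "i"}

SPEC = {
    "Noun": ("N", [
        ("Type", {"common": "c", "proper": "p"}),
        ("Gender", {"masculine": "m", "feminine": "f",
                    "neuter": "n", "common": "c"}),
        ("Number", {"singular": "s", "plural": "p"}),
        ("Case", dict(CASES, vocative="v")),
        ("Animate", {"no": "n", "yes": "y"}),
    ]),
    "Adjective": ("A", [
        ("Type", {"qualificative": "f", "possessive": "s"}),
        ("Degree", {"positive": "p", "comparative": "c", "superlative": "s"}),
        ("Gender", {"masculine": "m", "feminine": "f", "neuter": "n"}),
        ("Number", {"singular": "s", "plural": "p"}),
        ("Case", CASES),
        ("Definiteness", {"short-art": "s", "full-art": "f"}),
    ]),
}


def encode_msd(category, features):
    """Build an MSD string: category code first, then one code per position."""
    category_code, attributes = SPEC[category]
    chars = [category_code]
    for name, codes in attributes:
        value = features.get(name)
        # An attribute that is not given is treated as not applicable: "-".
        chars.append("-" if value is None else codes[value])
    return "".join(chars)


noun = encode_msd("Noun", {"Type": "common", "Gender": "masculine",
                           "Number": "singular", "Case": "accusative",
                           "Animate": "no"})
adjective = encode_msd("Adjective", {"Type": "qualificative",
                                     "Degree": "positive", "Gender": "neuter",
                                     "Number": "singular",
                                     "Definiteness": "short-art"})
print(noun, adjective)
```

The two calls reproduce the examples from the text: the noun has a value at every position, while the short-form adjective has no Case value, so that position becomes a hyphen.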
Therefore, the tag Vmip3p-m-e- is to be interpreted, character by character, as follows:
| Code | Attribute | Value |
| V | Category | Verb |
| m | Type | main |
| i | VForm | indicative |
| p | Tense | present |
| 3 | Person | third |
| p | Number | plural |
| - | Gender | not applicable |
| m | Voice | media |
| - | Definiteness | not applicable |
| e | Aspect | perfective |
| - | Case | not applicable |
An example of a tag in the CQL concordance search box: [tag="Vmip3p-m-e-"] finds examples like (сигареты) курятся, (книги) читаются or (фильмы) смотрятся. (Note: please make sure that you use straight double quotation marks.)
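The character-by-character reading of Vmip3p-m-e- can also be written as a small decoder. The following Python sketch is an illustration only: `VERB_ATTRIBUTES` and `decode_verb_tag` are hypothetical names, and the code tables are copied from the Verb table further down this page.

```python
# Hypothetical decoder for verb tags. Each entry is one position after the
# leading "V", in the order given by the Verb table on this page.
VERB_ATTRIBUTES = [
    ("Type", {"m": "main", "a": "auxiliary"}),
    ("VForm", {"i": "indicative", "m": "imperative", "c": "conditional",
               "n": "infinitive", "p": "participle", "g": "gerund"}),
    ("Tense", {"p": "present", "f": "future", "s": "past"}),
    ("Person", {"1": "first", "2": "second", "3": "third"}),
    ("Number", {"s": "singular", "p": "plural"}),
    ("Gender", {"m": "masculine", "f": "feminine", "n": "neuter"}),
    ("Voice", {"a": "active", "p": "passive", "m": "media"}),
    ("Definiteness", {"s": "shortart", "f": "fullart"}),
    ("Aspect", {"p": "progressive", "e": "perfective", "b": "biaspectual"}),
    ("Case", {"n": "nominative", "g": "genitive", "d": "dative",
              "a": "accusative", "l": "locative", "i": "instrumental"}),
]


def decode_verb_tag(tag):
    """Return the attributes of a verb tag, skipping positions marked '-'."""
    if not tag.startswith("V"):
        raise ValueError("not a verb tag: " + tag)
    decoded = {"Category": "Verb"}
    # zip stops at the shorter sequence, so shorter tags are accepted too.
    for (name, codes), char in zip(VERB_ATTRIBUTES, tag[1:]):
        if char != "-":
            decoded[name] = codes[char]
    return decoded


result = decode_verb_tag("Vmip3p-m-e-")
for attribute, value in result.items():
    print(attribute, "=", value)
```

Positions holding a hyphen (Gender, Definiteness and Case in this tag) are simply left out of the result.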
Basic overview of Russian tagset
| noun | N.* |
| verb | V.* |
| adjective | A.* |
| pronoun | P.* |
| adverb | R.* |
| adposition | S.* |
| conjunction | C.* |
| numeral | M.* |
| particle | Q.* |
| interjection | I.* |
| abbreviation | Y.* |
| residual | X.* |
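The patterns in this overview are regular expressions over the tag string, so they can be tried out locally before being used in a query. The sketch below is an illustration under two assumptions: the helper names `category_of` and `cql_for` are hypothetical, and Python's `re.fullmatch` is assumed to treat these simple patterns the same way the corpus query engine does.

```python
import re

# Category patterns copied from the overview table on this page.
CATEGORY_PATTERNS = {
    "noun": "N.*", "verb": "V.*", "adjective": "A.*", "pronoun": "P.*",
    "adverb": "R.*", "adposition": "S.*", "conjunction": "C.*",
    "numeral": "M.*", "particle": "Q.*", "interjection": "I.*",
    "abbreviation": "Y.*", "residual": "X.*",
}


def category_of(tag):
    """Return the category whose pattern matches the whole tag, else None."""
    for name, pattern in CATEGORY_PATTERNS.items():
        if re.fullmatch(pattern, tag):
            return name
    return None


def cql_for(category):
    """Build a CQL token query using straight double quotation marks."""
    return '[tag="{}"]'.format(CATEGORY_PATTERNS[category])


print(category_of("Ncmsan"))
print(category_of("Vmip3p-m-e-"))
print(cql_for("adverb"))
```

Because every tag starts with its category letter, a pattern such as `V.*` selects all verb tags regardless of the remaining positions.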
Content
Noun
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Noun | N |
| 1 | Type | common | c |
| | | proper | p |
| 2 | Gender | masculine | m |
| | | feminine | f |
| | | neuter | n |
| | | common | c |
| 3 | Number | singular | s |
| | | plural | p |
| 4 | Case | nominative | n |
| | | genitive | g |
| | | dative | d |
| | | accusative | a |
| | | vocative | v |
| | | locative | l |
| | | instrumental | i |
| 5 | Animate | no | n |
| | | yes | y |
| 6 | Case2 | partitive | p |
| | | locative | l |
source: Russian Noun
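Since each noun attribute has a fixed position, a search for a partial description can use `.` for every position that should stay open. The Python sketch below shows one way to build such a pattern; `noun_pattern` is a hypothetical helper, the code tables are copied from the Noun table above, and the resulting string is assumed to be usable as a regular expression in a query such as [tag="N..pa.*"].

```python
# Hypothetical helper: noun positions 1-6 and their codes, as in the table.
NOUN_POSITIONS = ["Type", "Gender", "Number", "Case", "Animate", "Case2"]
NOUN_CODES = {
    "Type": {"common": "c", "proper": "p"},
    "Gender": {"masculine": "m", "feminine": "f", "neuter": "n",
               "common": "c"},
    "Number": {"singular": "s", "plural": "p"},
    "Case": {"nominative": "n", "genitive": "g", "dative": "d",
             "accusative": "a", "vocative": "v", "locative": "l",
             "instrumental": "i"},
    "Animate": {"no": "n", "yes": "y"},
    "Case2": {"partitive": "p", "locative": "l"},
}


def noun_pattern(**constraints):
    """Build a tag regex: fixed codes where given, '.' elsewhere, then '.*'."""
    unknown = set(constraints) - set(NOUN_POSITIONS)
    if unknown:
        raise ValueError("unknown noun attributes: " + ", ".join(sorted(unknown)))
    # Only spell out positions up to the last constrained one.
    last = max((NOUN_POSITIONS.index(name) for name in constraints), default=-1)
    chars = ["N"]
    for name in NOUN_POSITIONS[: last + 1]:
        if name in constraints:
            chars.append(NOUN_CODES[name][constraints[name]])
        else:
            chars.append(".")
    return "".join(chars) + ".*"


accusative_plural = noun_pattern(Number="plural", Case="accusative")
print(accusative_plural)
print(noun_pattern(Type="proper"))
```

The trailing `.*` keeps the pattern open for any later positions, so it matches tags whether or not they carry Animate or Case2 codes.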
Verb
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Verb | V |
| 1 | Type | main | m |
| | | auxiliary | a |
| 2 | VForm | indicative | i |
| | | imperative | m |
| | | conditional | c |
| | | infinitive | n |
| | | participle | p |
| | | gerund | g |
| 3 | Tense | present | p |
| | | future | f |
| | | past | s |
| 4 | Person | first | 1 |
| | | second | 2 |
| | | third | 3 |
| 5 | Number | singular | s |
| | | plural | p |
| 6 | Gender | masculine | m |
| | | feminine | f |
| | | neuter | n |
| 7 | Voice | active | a |
| | | passive | p |
| | | media | m |
| 8 | Definiteness | shortart | s |
| | | fullart | f |
| 9 | Aspect | progressive | p |
| | | perfective | e |
| | | biaspectual | b |
| 10 | Case | nominative | n |
| | | genitive | g |
| | | dative | d |
| | | accusative | a |
| | | locative | l |
| | | instrumental | i |
source: Russian Verb
Adjective
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Adjective | A |
| 1 | Type | qualificative | f |
| | | possessive | s |
| 2 | Degree | positive | p |
| | | comparative | c |
| | | superlative | s |
| 3 | Gender | masculine | m |
| | | feminine | f |
| | | neuter | n |
| 4 | Number | singular | s |
| | | plural | p |
| 5 | Case | nominative | n |
| | | genitive | g |
| | | dative | d |
| | | accusative | a |
| | | locative | l |
| | | instrumental | i |
| 6 | Definiteness | short-art | s |
| | | full-art | f |
source: Russian Adjective
Pronoun
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Pronoun | P |
| 1 | Type | personal | p |
| | | demonstrative | d |
| | | indefinite | i |
| | | possessive | s |
| | | interrogative | q |
| | | relative | r |
| | | reflexive | x |
| | | negative | z |
| | | nonspecific | n |
| 2 | Person | first | 1 |
| | | second | 2 |
| | | third | 3 |
| 3 | Gender | masculine | m |
| | | feminine | f |
| | | neuter | n |
| 4 | Number | singular | s |
| | | plural | p |
| 5 | Case | nominative | n |
| | | genitive | g |
| | | dative | d |
| | | accusative | a |
| | | vocative | v |
| | | locative | l |
| | | instrumental | i |
| 6 | Syntactic_Type | nominal | n |
| | | adjectival | a |
| | | adverbial | r |
| 7 | Animate | no | n |
| | | yes | y |
source: Russian Pronoun
Adverb
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Adverb | R |
| 1 | Degree | positive | p |
| | | comparative | c |
| | | superlative | s |
source: Russian Adverb
Adposition
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Adposition | S |
| 1 | Type | preposition | p |
| 2 | Formation | simple | s |
| | | compound | c |
| 3 | Case | genitive | g |
| | | dative | d |
| | | accusative | a |
| | | locative | l |
| | | instrumental | i |
source: Russian Adposition
Conjunction
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Conjunction | C |
| 1 | Type | coordinating | c |
| | | subordinating | s |
| 2 | Formation | simple | s |
| | | compound | c |
| 3 | Coord_Type | sentence | p |
| | | words | w |
| 4 | Sub_Type | negative | z |
| | | positive | p |
source: Russian Conjunction
Numeral
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Numeral | M |
| 1 | Type | cardinal | c |
| | | ordinal | o |
| | | multiple | m |
| | | collect | l |
| 2 | Gender | masculine | m |
| | | feminine | f |
| | | neuter | n |
| 3 | Number | singular | s |
| | | plural | p |
| 4 | Case | nominative | n |
| | | genitive | g |
| | | dative | d |
| | | accusative | a |
| | | locative | l |
| | | instrumental | i |
| 5 | Form | digit | d |
| | | roman | r |
| | | letter | l |
| 6 | Animate | no | n |
| | | yes | y |
source: Russian Numeral
Particle
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Particle | Q |
| 1 | Formation | simple | s |
| | | compound | c |
source: Russian Particle
Interjection
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Interjection | I |
| 1 | Formation | simple | s |
| | | compound | c |
source: Russian Interjection
Abbreviation
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Abbreviation | Y |
| 1 | Syntactic_Type | nominal | n |
| | | adverbial | r |
| 2 | Gender | masculine | m |
| | | feminine | f |
| | | neuter | n |
| 3 | Number | singular | s |
| | | plural | p |
| | | paucal | c |
| 4 | Case | nominative | n |
| | | genitive | g |
| | | dative | d |
| | | accusative | a |
| | | locative | l |
| | | instrumental | i |
source: Russian Abbreviation
Residual
| P | Attribute (en) | Value (en) | Code (en) |
| 0 | CATEGORY | Residual | X |
source: Russian Residual
Appendix A Index of Categories
Appendix B Index of Attributes
Appendix C Index of Values
Appendix D Lexical MSDs
(This page was taken from MULTEXT-East Home Page)