A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

This Tatar part-of-speech tagset is used in the Tatar Mixed corpus, which was annotated by Apertium’s morphological tagger.

(The tags of the previous version of this tagset are enclosed in angle brackets; they are used in the Tatar News corpus.)

An example of a tag in the CQL concordance search box: [tag2="n:sg:nom"] finds all nouns (‘n’) in the singular (‘sg’) and the nominative case (‘nom’), e.g. республика, кеше. (Please make sure that you use straight double quotation marks.) The old notation of Tatar part-of-speech tags enclosed each tag in the angle brackets “<” and “>”. These characters have been removed: a tag now consists of several sub-tags joined by a colon “:”. (The former notation of the tag above is [tag2=""]).
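As a minimal sketch of how such query strings could be assembled programmatically, the Python helpers below join sub-tags with colons and wrap them in a CQL token expression. The function names `cql_for_tags` and `old_to_new` are hypothetical. The old-style input `<n><sg><nom>` is an assumption based on the usual Apertium convention of one sub-tag per pair of angle brackets; it is not confirmed by this page.

```python
import re


def cql_for_tags(*subtags, attribute="tag2"):
    """Join sub-tags with ':' and wrap them in a CQL token expression."""
    if not subtags:
        raise ValueError("at least one sub-tag is required")
    for subtag in subtags:
        # Reject characters that would break the tag or the quoted value.
        if not subtag or any(ch in subtag for ch in ':"<>'):
            raise ValueError(f"invalid sub-tag: {subtag!r}")
    value = ":".join(subtags)
    # The CQL search box expects straight double quotation marks.
    return f'[{attribute}="{value}"]'


def old_to_new(old_tag):
    """Convert an ASSUMED old-style tag like '<n><sg><nom>' to 'n:sg:nom'."""
    subtags = re.findall(r"<([^<>]+)>", old_tag)
    if not subtags:
        raise ValueError(f"no angle-bracket sub-tags found in {old_tag!r}")
    return ":".join(subtags)


print(cql_for_tags("n", "sg", "nom"))  # [tag2="n:sg:nom"]
print(old_to_new("<n><sg><nom>"))      # n:sg:nom
```

Validating the sub-tags before joining keeps a stray colon or quotation mark from silently producing a query that matches nothing.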

Part of speech categories = First-level tags

Tag Part of speech Example
abbr Abbreviation Аббревиатура
adj Adjective Прилагательное
adv Adverb Наречие
cm Comma ,
cnjadv Adverbial conjunction Наречие-союз
cnjcoo Coordinating conjunction Сочинительный союз
cnjsub Subordinating conjunction Подчинительный союз
cop Copula Копула
det Determiner Детерминатив
ideo Ideophone Звукоподражательное слово
ij Interjection Междометие
n Noun Существительное
np Proper noun Имя собственное
num Numeral Числительное
post Postposition Послелог
postadv Postadverb Посленаречие
prn Pronoun Местоимение
sent Sentence marker . ? !
v Verb Глагол
vaux Auxiliary verb Вспомогательный глагол
apos Apostrophe
guio Hyphen
lpar Left parenthetical marker (
lquot Left quote marker “, «
mod_ass Assertive modal particle бит
mod_ind Indefinite modal particle (expresses doubt) дыр
qst Modal question particle микән
rpar Right parenthetical marker )
rquot Right quote marker ”, »
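The first-level tags above lend themselves to a simple lookup table. The sketch below assumes that the part-of-speech category is always the first sub-tag of a colon-separated tag, as in `n:sg:nom`; the names `FIRST_LEVEL` and `part_of_speech` are hypothetical.

```python
# First-level tags and their descriptions, copied from the table above.
FIRST_LEVEL = {
    "abbr": "Abbreviation",
    "adj": "Adjective",
    "adv": "Adverb",
    "cm": "Comma",
    "cnjadv": "Adverbial conjunction",
    "cnjcoo": "Coordinating conjunction",
    "cnjsub": "Subordinating conjunction",
    "cop": "Copula",
    "det": "Determiner",
    "ideo": "Ideophone",
    "ij": "Interjection",
    "n": "Noun",
    "np": "Proper noun",
    "num": "Numeral",
    "post": "Postposition",
    "postadv": "Postadverb",
    "prn": "Pronoun",
    "sent": "Sentence marker",
    "v": "Verb",
    "vaux": "Auxiliary verb",
    "apos": "Apostrophe",
    "guio": "Hyphen",
    "lpar": "Left parenthetical marker",
    "lquot": "Left quote marker",
    "mod_ass": "Assertive modal particle",
    "mod_ind": "Indefinite modal particle",
    "qst": "Modal question particle",
    "rpar": "Right parenthetical marker",
    "rquot": "Right quote marker",
}


def part_of_speech(tag):
    """Return the description of the first sub-tag, or 'unknown'."""
    # ASSUMPTION: the part of speech is the first colon-separated sub-tag.
    first = tag.split(":", 1)[0]
    return FIRST_LEVEL.get(first, "unknown")


print(part_of_speech("n:sg:nom"))  # Noun
```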

Proper noun types

Tag Description
top Toponym
ant Anthroponym
cog Cognomen
pat Patronym
org Organization
al Other

Gender

m Masculine
f Feminine
mf Masculine/feminine; basically cognomens without -ов/-ова, -ин/-ина endings

“Syntactic” tags. Attributive use of non-adjectives etc.

attr Attributive
subst Substantive
advl Adverbial

Number

sg Singular
pl Plural
sp Singular/Plural

Possessives

px1sg First person singular
px2sg Second person singular
px3sp Third person singular/plural
px1pl First person plural
px2pl Second person plural
px3pl Third person plural (for reflexive)
px General possessive

Cases

nom Nominative
gen Genitive
dat Dative
acc Accusative
abl Ablative
loc Locative
ref Reflexive
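Because a full tag chains several sub-tags, a query for one case usually has to ignore whatever stands before or after it. The sketch below builds such a query under the assumption that CQL attribute values are regular expressions that must match the whole value, which is how Sketch Engine normally treats them; Python's `re.fullmatch` is used here only to imitate that behaviour. The function names and the sample tags are hypothetical.

```python
import re


def subtag_pattern(subtag):
    """Regex matching any colon-separated tag that contains `subtag`."""
    # An optional prefix ending in ':' and an optional suffix starting
    # with ':' ensure that only complete sub-tags are matched.
    return f"(.*:)?{re.escape(subtag)}(:.*)?"


def cql_with_subtag(subtag, attribute="tag2"):
    """CQL token expression for tokens whose tag contains `subtag`."""
    return f'[{attribute}="{subtag_pattern(subtag)}"]'


print(cql_with_subtag("gen"))  # [tag2="(.*:)?gen(:.*)?"]

# Imitate whole-value matching on a few hypothetical tags.
for tag in ["n:sg:gen", "n:pl:px1sg:gen", "n:sg:nom"]:
    print(tag, bool(re.fullmatch(subtag_pattern("gen"), tag)))
```

Anchoring on the colons keeps a short sub-tag such as `ger` from matching longer ones such as `ger_past`.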

Some additional ~cases

sim Similative
# DAй (-дай/-дәй, -тай/-тәй)
abe Abessive=Privative
# SIZ (-сыз/-сез) (not used after possessives and cases)
reas Not used right now, just in case
for
# LIKTAN

Levels of comparison of adj.

comp Comparative

Pronoun types

pers Personal
recip Reciprocal

Pronoun&Determiner types

dem Demonstrative
ind Indefinite
itg Interrogative
qnt Quantifier
neg Negative
(NOTE: also used to denote negation in verbs, i.e. for м{A})
ref Reflexive
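Since `neg` serves both as a pronoun/determiner type and as the marker of verbal negation, a script that reads tags has to look at the part of speech to tell the two apart. The sketch below shows one possible way to do that; the function name `interpret_neg`, the returned labels, and the sample tags are all hypothetical, and the part of speech is assumed to be the first sub-tag.

```python
def interpret_neg(tag):
    """Classify the meaning of the 'neg' sub-tag, or return None if absent."""
    parts = tag.split(":")
    pos, rest = parts[0], parts[1:]
    if "neg" not in rest:
        return None
    if pos in {"v", "vaux"}:
        # Verbal negation, i.e. the м{A} suffix.
        return "verbal negation"
    if pos in {"prn", "det"}:
        return "negative pronoun/determiner"
    return "unclassified"


print(interpret_neg("v:tv:neg:past:p3:sg"))  # verbal negation
print(interpret_neg("prn:neg:nom"))          # negative pronoun/determiner
```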

Numeral types

ord Ordinal
coll Collective
dist Distributive

Verbal features

Mood

imp Imperative
opt Optative/jussive
evid Evidential, a.k.a. “indirect” / non-eyewitness / hearsay

Derivation

caus Causative
pass Passive
coop Cooperative

Tenses / finite forms

pres -{E}
aor
past -{G}{A}н
ifi -{D}{I}
fut -{I}р
fut2 -{A}ч{A}к
fut_plan -м{A}кч{I}

Non-Finite verb forms

Participles

prc_perf Perfect participle
-{I}п
# “_Йоклап_ яткан мәче авызына тычкан үзе _килеп_ керми.”;
prc_impf Imperfect participle
-{E}
# ул _уйный_ алмады; мин _яза_ башладым;
prc_vol Volition participle
{E}с{I}
# _эчәсем_ килә;
prc_cond Conditional participle
-с{A}
# “…моны _алсаң_ була…” (“ала аласың” мәгънәсендә);
prc_fplan Future plan participle
-м{A}кч{I}
# “Бакчага бармакчы идем.”;

Verbal adverbs

gna_perf -{I}п
# “…ул вакытта инде кояш _баеп_, йолдызлар күренә башлаган
# иде…” (Ф.Хөсни);
gna_cond -с{A}
# “…кайда икәнен _белсә_, миңа моның турында сөйләр иде…”;
gna_until -{G}{A}нч{I}
(the name covers only the temporal meaning; the form has more)
# “Авылның басу капкасына _җиткәнче_ эңгер-меңгердә карлы юлдан
# озак кайта ул.”;
gna_after -{G}{A}ч
(the name covers only the temporal meaning; the form has more)
# “Берәү, патша йортын күреп _кайткач_, үз өенә ут төрткән, ди.”;

Verbal adjectives

gpr_past -{G}{A}н
# килгән кеше; укылмаган китап;
gpr_impf -{A} торган
TODO: this is equivalent of Kazakh ; compound forms should be handled in transfer, so check once more whether there is a real reason not to handle it there (it seemed so)
gpr_pot {U}ч{I}
# сөйләүче кеше; үз урынын белмәүче;
gpr_ppot -{I}рл{I}к/-{A}рл{I}к
gpr_fut -{I}р/-{A}р
# барыр җир; сөйләр сүз;
gpr_fut2 -{A}ч{A}к
# әйтеләчәк фикер; эшләнәчәк эш;
gpr_fut3 {E}с{I}
(NOTE: ambiguous with the volition participle (see above))
# “_Үләсе_ күбәләк ут күзенә керер”;

Gerunds (verbal nouns)

ger -{U}
ger_past -{G}{A}н
ger_perf -{G}{A}нл{I}к
(stresses the fact that something happened)
# “Барысының да мәсьәләне үз башында _йөрткәнлеге_, теге яктан
# да, бу яктан да үлчәп _караганлыгы_ сизелеп тора.” (Ф. Хөсни);
ger_ppot -{I}рл{I}к/-{A}рл{I}к
(~the ability to do the denoted action)
ger_abs -{U}ч{I}л{I}к
FIXME CHECK: This form shouldn’t be that productive in Tatar; consider adding such words as nouns if they appear in the corpus. Kazakh is translated with <ger>.
ger_fut -{I}р/-{A}р
# “Ярым кем _булырын_ белмим, мин әле ялгыз йөрим.” (Ш.Галиев);
ger_fut2 -{A}ч{A}к
ger_fut3 {E}с{I}
(NOTE: ambiguous with the volition participle (see above))
# “_Күрәселәре_ алда әле”;
ger1 -м{A}к
inf -{A/I}рг{A}

Transitivity

tv Transitive
iv Intransitive

Person

p1 First person
p2 Second person
p3 Third person
frm Formality
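The verbal sub-tags listed so far can be grouped by category, which allows a tag to be decoded without relying on the order of its sub-tags. The following is a minimal sketch under that design; the names `CATEGORIES` and `decode` are hypothetical, only some of the categories from this page are included, and the sample tag `v:tv:past:p3:sg` is made up for illustration rather than taken from the corpus.

```python
# Sub-tags grouped by grammatical category, copied from the lists above.
CATEGORIES = {
    "transitivity": {"tv", "iv"},
    "mood": {"imp", "opt", "evid"},
    "derivation": {"caus", "pass", "coop"},
    "tense": {"pres", "aor", "past", "ifi", "fut", "fut2", "fut_plan"},
    "person": {"p1", "p2", "p3"},
    "number": {"sg", "pl", "sp"},
}


def decode(tag):
    """Map each recognised sub-tag of `tag` to its category."""
    parts = tag.split(":")
    # ASSUMPTION: the first sub-tag is the part of speech.
    features = {"pos": parts[0]}
    for subtag in parts[1:]:
        for category, members in CATEGORIES.items():
            if subtag in members:
                features[category] = subtag
    return features


print(decode("v:tv:past:p3:sg"))
# {'pos': 'v', 'transitivity': 'tv', 'tense': 'past', 'person': 'p3', 'number': 'sg'}
```

Sub-tags that belong to none of the listed categories are simply skipped, so the table can be extended one category at a time.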

Modal particles

qst Modal question particle
# м{I}
emph Emphasizing modal particle
# -ч{I}, -с{A}н{A}
mod_ass Assertive modal particle
mod_ind Indefinite modal particle (expresses doubt)

Punctuation marks

sent  Sentence marker
guio  Hyphen
cm  Comma
apos  Apostrophe
rquot  Quote marker (right hand side)
lquot  Quote marker (left hand side)
rpar  Parenthetical marker (right hand side)
lpar  Parenthetical marker (left hand side)
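Punctuation tokens are often excluded when counting words or searching for word sequences. The sketch below checks a tag against the punctuation tags listed above and builds a CQL expression that skips them. It assumes that the first sub-tag of a punctuation token is one of these tags, that a punctuation token carries only that bare tag, and that the corpus accepts the `!=` operator with a regular expression that must match the whole value; the function names are hypothetical.

```python
# Punctuation tags, copied from the list above.
PUNCTUATION_TAGS = ("sent", "guio", "cm", "apos", "rquot", "lquot", "rpar", "lpar")


def is_punctuation(tag):
    """True if the first sub-tag of `tag` is a punctuation tag."""
    return tag.split(":", 1)[0] in PUNCTUATION_TAGS


def cql_non_punctuation(attribute="tag2"):
    """CQL token expression matching any token that is not punctuation."""
    alternation = "|".join(PUNCTUATION_TAGS)
    return f'[{attribute}!="({alternation})"]'


print(is_punctuation("cm"))        # True
print(is_punctuation("n:sg:nom"))  # False
print(cql_non_punctuation())
# [tag2!="(sent|guio|cm|apos|rquot|lquot|rpar|lpar)"]
```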

Source: http://corpus.tatar/index_en.php?openinframe=manuals/tags_uniq.pdf