A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

Lithuanian part-of-speech tagset

The Lithuanian part-of-speech tagset is available in Lithuanian corpora annotated by the POS tagger tool.

An example of a tag in the CQL concordance search box: [tag="N.*"] finds all nouns, e.g. Lietuvos, metų. (Note: make sure that you use straight double quotation marks.)
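As a rough illustration of the query pattern above, the following Python sketch builds the same kind of CQL expression for any part-of-speech code and replaces curly quotation marks with straight ones before the query is pasted into the search box. The function names `pos_query` and `straighten_quotes` are hypothetical helpers invented for this example; they are not part of Sketch Engine.

```python
# Curly double quotation marks that editors and word processors often
# substitute for straight ones (including the Lithuanian low-9 opening quote).
CURLY_DOUBLE_QUOTES = "\u201c\u201d\u201e\u201f"


def straighten_quotes(query: str) -> str:
    """Replace curly double quotation marks with straight ones."""
    for mark in CURLY_DOUBLE_QUOTES:
        query = query.replace(mark, '"')
    return query


def pos_query(pos_code: str) -> str:
    """Build a CQL expression matching every tag that starts with pos_code.

    Hypothetical helper: it only reproduces the [tag="N.*"] pattern
    shown in the text for an arbitrary code.
    """
    return f'[tag="{pos_code}.*"]'


print(pos_query("N"))                               # query for all nouns
print(straighten_quotes("[tag=\u201eV.*\u201c]"))   # curly quotes repaired
```

The resulting string can then be copied into the CQL field as-is.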

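| No. | Feature group | Category | Tag codes |
|---|---|---|---|
| 1 | Part of speech | Noun | N |
| | | Adjective | A |
| | | Numeral | M |
| | | Pronoun | P |
| | | Verb | V |
| | | Adverb | R |
| | | Interjection | I |
| | | Onomatopoeia | O |
| | | Particle | Q |
| | | Preposition | S |
| | | Conjunction | C |
| | | Acronym | Z |
| | | Abbreviation | Y |
| | | Roman numerals | U |
| | | Residual | X |
| | | Stable phrases | H |
| | | Punctuation marks, symbols | T |
| | | HTML tag | t |
| 2 | Noun types | proper | p |
| | | common | c |
| 3 | Verb | main | m |
| | | infinitive | n |
| | | participle | p |
| | | adverbial participle | a |
| | | half participle | h |
| | | adverbial participle 2 | b |
| | | indicative mood | i |
| | | imperative mood | m |
| | | subjunctive mood | s |
| 4 | Numerals | cardinal | c |
| | | ordinal | o |
| | | multiple | m |
| | | collective | l |
| 5 | Definiteness | pronominal | p |
| | | non-pronominal | n |
| 6 | Reflexiveness | reflexive | r |
| | | non-reflexive | n |
| 7 | Type | active | a |
| | | passive | p |
| | | necessity | n |
| 8 | Tense | present tense | p |
| | | past tense | a |
| | | past frequentative tense | q |
| | | future tense | f |
| | | simple past | s |
| 9 | Degree | positive | p |
| | | comparative | c |
| | | superlative | s |
| 10 | Gender | feminine | f |
| | | masculine | m |
| | | neuter | n |
| | | common | c |
| 11 | Number | singular | s |
| | | plural | p |
| | | dual | d |
| 12 | Case | nominative | n |
| | | genitive | g |
| | | dative | d |
| | | accusative | a |
| | | instrumental | i |
| | | locative | l |
| | | vocative | v |
| | | illative | x |
| 13 | Person | 1st | 1 |
| | | 2nd | 2 |
| | | 3rd | 3 |
| 14 | Positiveness | positive | p |
| | | negative | n |
| 15 | Phrases | stable phrases with undefined POS | H |
| 16 | Unknown | foreign | f |
| | | typos | t |
| | | segmentation error | p |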
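
For readers who process tagged output programmatically, the part-of-speech codes from feature group 1 above can be kept in a simple lookup, sketched below in Python. The sketch assumes that the part-of-speech code is the first character of a tag, which is consistent with the `N.*` query pattern but is not spelled out in the table; the layout of any further characters is not assumed. The names `POS_CODES` and `part_of_speech` are hypothetical.

```python
# Part-of-speech codes from feature group 1 of the tagset table.
# Codes are case-sensitive: "T" (punctuation) and "t" (HTML tag) differ.
POS_CODES = {
    "N": "Noun",
    "A": "Adjective",
    "M": "Numeral",
    "P": "Pronoun",
    "V": "Verb",
    "R": "Adverb",
    "I": "Interjection",
    "O": "Onomatopoeia",
    "Q": "Particle",
    "S": "Preposition",
    "C": "Conjunction",
    "Z": "Acronym",
    "Y": "Abbreviation",
    "U": "Roman numerals",
    "X": "Residual",
    "H": "Stable phrases",
    "T": "Punctuation marks, symbols",
    "t": "HTML tag",
}


def part_of_speech(tag: str) -> str:
    """Return the part-of-speech name for a tag.

    Assumption (not stated in the table): the first character of the
    tag is the part-of-speech code.
    """
    if not tag:
        raise ValueError("empty tag")
    return POS_CODES.get(tag[0], "unrecognised code")


for sample in ("N", "T", "t"):
    print(sample, "->", part_of_speech(sample))
```

Because the lookup is case-sensitive, it keeps the uppercase and lowercase codes apart, which a case-insensitive comparison would conflate.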

Lithuanian text corpora in Sketch Engine

Sketch Engine offers dozens of Lithuanian language corpora.
