A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The Polish NKJP part-of-speech tagset is available in Polish corpora annotated with the grammatical categories of the National Corpus of Polish (NKJP).

An example of a tag in the CQL concordance search box: [tag="subst:sg:gen:f"] finds all feminine genitive singular nouns, e.g. pracy, strony. (Note: make sure that you use straight double quotation marks.)
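
The tag in the example above is the part-of-speech name followed by attribute values, joined with colons. A minimal Python sketch of assembling such a query string follows; the helper name `cql_tag_query` is hypothetical and not part of any corpus tool's API, and it only formats text.

```python
def cql_tag_query(pos, *values):
    """Build a CQL token query that matches one exact tag.

    `pos` is the part-of-speech name (e.g. "subst"); `values` are the
    attribute values in tag order (e.g. "sg", "gen", "f").
    Hypothetical helper: it formats a string and contacts no corpus.
    """
    tag = ":".join((pos,) + values)
    # The note above asks for straight double quotation marks,
    # so the query is built with plain ASCII quotes.
    return '[tag="{}"]'.format(tag)


print(cql_tag_query("subst", "sg", "gen", "f"))  # prints [tag="subst:sg:gen:f"]
```

Building the string in code avoids the curly quotation marks that word processors tend to substitute when a query is copied and pasted.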


Elementary part-of-speech tagset legend

Part of speech  Tag pattern
adjective  adj.*
adverb  adv.*
conjunction  conj.*|comp.*
noun  subst.*
preposition  prep.*
pronoun  ppron.*|siebie.*
verb  fin.*|bedzie.*|aglt.*|praet.*|impt.*|imps.*|inf.*|pcon.*|pant.*|ger.*|pact.*|ppas.*
punctuation  interp
foreign expression  xxx
interjection  interj

Adjectival subcategories

ad-adjectival adjective, e.g. polsko in polsko-niemiecki (Polish-German)  adja
post-prepositional adjective, e.g. polsku in po polsku (in Polish)  adjp
non-inflecting predicative adjective used after the copula, e.g. zdrów (healthy)  adjc
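
The legend patterns are regular expressions over tag strings, so they can be used to map a full tag to its elementary part of speech. The sketch below assumes that each pattern must match the whole tag; the names `ELEMENTARY_POS` and `elementary_pos` are illustrative only.

```python
import re

# Elementary part-of-speech patterns copied from the legend above.
ELEMENTARY_POS = {
    "adjective": r"adj.*",
    "adverb": r"adv.*",
    "conjunction": r"conj.*|comp.*",
    "noun": r"subst.*",
    "preposition": r"prep.*",
    "pronoun": r"ppron.*|siebie.*",
    "verb": (r"fin.*|bedzie.*|aglt.*|praet.*|impt.*|imps.*|inf.*"
             r"|pcon.*|pant.*|ger.*|pact.*|ppas.*"),
    "punctuation": r"interp",
    "foreign expression": r"xxx",
    "interjection": r"interj",
}


def elementary_pos(tag):
    """Return the elementary part of speech for a full tag, or None.

    Assumption: a pattern applies only if it matches the entire tag.
    """
    for name, pattern in ELEMENTARY_POS.items():
        if re.fullmatch(pattern, tag):
            return name
    return None


print(elementary_pos("subst:sg:gen:f"))  # prints noun
print(elementary_pos("adja"))            # prints adjective
```

Because `adj.*` is a prefix pattern, it also covers the adjectival subcategories adja, adjp and adjc. Tags whose part of speech is not in the legend (for example qub) return None in this sketch.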

Attributes and their values

Attribute  Values
number  sg pl
case  nom gen dat acc inst loc voc
gender  m1 m2 m3 f n
person  pri sec ter
degree  pos com sup
aspect  imperf perf
negation  aff neg
accommodability  congr rec
accentability  akc nakc
post-prepositionality  npraep praep
agglutination  agl nagl
vocalicity  nwok wok
fullstoppedness  pun npun
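
No value in the table above appears under more than one attribute, so each value in a tag can be traced back to its attribute without knowing the part of speech. A minimal Python sketch of that reverse lookup follows; `ATTRIBUTE_VALUES`, `VALUE_TO_ATTRIBUTE` and `describe_tag` are illustrative names, not part of any tool.

```python
# Attribute values copied from the table above.
ATTRIBUTE_VALUES = {
    "number": ["sg", "pl"],
    "case": ["nom", "gen", "dat", "acc", "inst", "loc", "voc"],
    "gender": ["m1", "m2", "m3", "f", "n"],
    "person": ["pri", "sec", "ter"],
    "degree": ["pos", "com", "sup"],
    "aspect": ["imperf", "perf"],
    "negation": ["aff", "neg"],
    "accommodability": ["congr", "rec"],
    "accentability": ["akc", "nakc"],
    "post-prepositionality": ["npraep", "praep"],
    "agglutination": ["agl", "nagl"],
    "vocalicity": ["nwok", "wok"],
    "fullstoppedness": ["pun", "npun"],
}

# Each value belongs to exactly one attribute, so the reverse map is unambiguous.
VALUE_TO_ATTRIBUTE = {
    value: attribute
    for attribute, values in ATTRIBUTE_VALUES.items()
    for value in values
}


def describe_tag(tag):
    """Split a tag on colons and label each value with its attribute.

    Raises KeyError for a value that is not in the table.
    """
    pos, *values = tag.split(":")
    described = {"pos": pos}
    for value in values:
        described[VALUE_TO_ATTRIBUTE[value]] = value
    return described


print(describe_tag("subst:sg:gen:f"))
# prints {'pos': 'subst', 'number': 'sg', 'case': 'gen', 'gender': 'f'}
```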

Parts of speech and their grammatical categories

POS  Grammatical categories
adv  [degree]
imps  aspect
inf  aspect
pant  aspect
pcon  aspect
qub  [vocalicity]
prep  case [vocalicity]
siebie  case
subst  number case gender
depr  number case gender
ger  number case gender aspect negation
ppron12  number case gender person [accentability]
ppron3  number case gender person [accentability] [post-prepositionality]
num  number case gender accommodability
numcol  number case gender accommodability
adj  number case gender degree
pact  number case gender aspect negation
ppas  number case gender aspect negation
winien  number gender aspect
praet  number gender aspect [agglutination]
bedzie  number person aspect
fin  number person aspect
impt  number person aspect
aglt  number person aspect vocalicity
brev  fullstoppedness
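
The two tables can be combined to check that a tag carries the categories expected for its part of speech. The following Python sketch assumes that square brackets mark optional categories and that values appear in the order listed in the table. `parse_tag` is a hypothetical helper, and parts of speech absent from the table (for example interp) are rejected here.

```python
ATTRIBUTE_VALUES = {
    "number": {"sg", "pl"},
    "case": {"nom", "gen", "dat", "acc", "inst", "loc", "voc"},
    "gender": {"m1", "m2", "m3", "f", "n"},
    "person": {"pri", "sec", "ter"},
    "degree": {"pos", "com", "sup"},
    "aspect": {"imperf", "perf"},
    "negation": {"aff", "neg"},
    "accommodability": {"congr", "rec"},
    "accentability": {"akc", "nakc"},
    "post-prepositionality": {"npraep", "praep"},
    "agglutination": {"agl", "nagl"},
    "vocalicity": {"nwok", "wok"},
    "fullstoppedness": {"pun", "npun"},
}

# Rows copied from the table above.
# Assumption: a bracketed name is an optional category.
POS_CATEGORIES = {
    "adv": "[degree]",
    "imps": "aspect", "inf": "aspect", "pant": "aspect", "pcon": "aspect",
    "qub": "[vocalicity]",
    "prep": "case [vocalicity]",
    "siebie": "case",
    "subst": "number case gender",
    "depr": "number case gender",
    "ger": "number case gender aspect negation",
    "ppron12": "number case gender person [accentability]",
    "ppron3": ("number case gender person [accentability] "
               "[post-prepositionality]"),
    "num": "number case gender accommodability",
    "numcol": "number case gender accommodability",
    "adj": "number case gender degree",
    "pact": "number case gender aspect negation",
    "ppas": "number case gender aspect negation",
    "winien": "number gender aspect",
    "praet": "number gender aspect [agglutination]",
    "bedzie": "number person aspect",
    "fin": "number person aspect",
    "impt": "number person aspect",
    "aglt": "number person aspect vocalicity",
    "brev": "fullstoppedness",
}


def parse_tag(tag):
    """Parse a tag into a dict, raising ValueError if it does not fit the table."""
    pos, *remaining = tag.split(":")
    if pos not in POS_CATEGORIES:
        raise ValueError("part of speech not in table: " + pos)
    parsed = {"pos": pos}
    for spec in POS_CATEGORIES[pos].split():
        optional = spec.startswith("[")
        category = spec.strip("[]")
        if remaining and remaining[0] in ATTRIBUTE_VALUES[category]:
            # The next value fits this category, so consume it.
            parsed[category] = remaining.pop(0)
        elif not optional:
            raise ValueError("tag {!r} lacks a value for {}".format(tag, category))
    if remaining:
        raise ValueError("unexpected values in {!r}: {}".format(tag, remaining))
    return parsed


print(parse_tag("praet:sg:m1:imperf"))
# prints {'pos': 'praet', 'number': 'sg', 'gender': 'm1', 'aspect': 'imperf'}
```

Walking the category list in order lets an optional category be skipped when the next value does not belong to it, which is how a bare adv tag and adv:pos can both be accepted.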

Source: http://nlp.ipipan.waw.pl/~adamp/Papers/2009-mondilex/article.pdf


PRZEPIÓRKOWSKI, Adam. A comparison of two morphosyntactic tagsets of Polish. In: Representing Semantics in Digital Lexicography: Proceedings of MONDILEX Fourth Open Workshop. Warsaw, 2009. pp. 138–144.