A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The Dutch TreeTagger part-of-speech tagset is used in Dutch corpora annotated with TreeTagger, a tool developed by Helmut Schmid in the TC project at the Institute for Computational Linguistics of the University of Stuttgart.

This is the default Dutch TreeTagger tagset with one additional tag, det__excl, for exclamative determiners.

An example of a tag in the CQL concordance search box: [tag="num.*"] finds all numerals, e.g. 100, eerste (note: make sure that you use straight double quotation marks).


$. punctuation (e.g. : !)
adj adjective (e.g. jaarlijks ‘yearly’)
adj*kop adjective plus hyphen (part of word) (e.g. gezins- en familieleden)
adjabbr adjective abbreviation (e.g. max. ‘max.’)
adv adverb (e.g. daar ‘there’)
advabbr adverb abbreviation (e.g. bv. ‘e.g.’, enz. ‘etc.’)
conjcoord coordinate conjunction (e.g. en ‘and’)
conjsubo subordinate conjunction (e.g. als ‘as’)
det__art determiner article (e.g. het ‘the’)
det__demo demonstrative determiner (e.g. die ‘those’)
det__excl exclamative determiner (e.g. zulk, zo’n ‘such’)
det__indef indefinite determiner (e.g. iedere ‘each’, wat ‘a bit’, enige ‘some’)
det__poss possessive determiner (e.g. hun ‘their’)
det__quest interrogative determiner (e.g. welke ‘which’)
det__rel relative determiner (e.g. wiens ‘whose’)
int interjection (e.g. hoera ‘hurrah’)
noun*kop noun plus hyphen (part of word) (e.g. kinder- en jeugdboeken ‘children’s and youth books’)
nounabbr noun abbreviation (e.g. BTW ‘VAT’)
nounpl common noun plural (e.g. mensen ‘humans’)
nounprop proper noun (e.g. Soedan ‘Sudan’)
nounsg common noun singular (e.g. mens ‘human’)
num__card numeral cardinal (e.g. 128, 12-12-2009)
num__ord numeral ordinal (e.g. derde ‘third’)
partte particle (e.g. te ‘to’)
prep preposition (e.g. aan ‘at’)
prepabbr preposition abbreviation (e.g. d.d. ‘dd.’)
pronadv adverbial pronoun (e.g. er, daarmee ‘with that’)
prondemo demonstrative pronoun (e.g. zelf ‘self’)
pronindef indefinite pronoun (e.g. sommigen ‘some’)
pronpers personal pronoun (e.g. hij ‘he’)
pronposs possessive pronoun (e.g. z’n ‘his’, m’n ‘my’)
pronquest interrogative pronoun (wh-pronoun) (e.g. wie ‘who’, wat ‘what’)
pronrefl reflexive pronoun (e.g. zich ‘X-self’, elkaar ‘each other’)
pronrel relative pronoun (e.g. wat ‘what’)
punc punctuation (e.g. ,)
verbinf verb infinitive (e.g. doen ‘do’)
verbpapa verb past participle (e.g. geschilderd ‘painted’)
verbpastpl verb past tense plural (e.g. konden ‘could’)
verbpastsg verb past tense singular (e.g. dook ‘dived’)
verbpresp verb present participle (e.g. lachend ‘laughing’)
verbprespl verb present tense plural (e.g. zitten ‘sit’)
verbpressg verb present tense singular (e.g. zit ‘sit’)
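
To illustrate how a pattern such as num.* selects tags from the list above, here is a minimal Python sketch. It assumes that the regular expression in a CQL attribute value must match the whole tag, which would explain why the example query is written as "num.*" rather than "num". The TAGS list is only a hand-picked subset of the tagset, and tags_matching is a hypothetical helper, not part of Sketch Engine or TreeTagger.

```python
import re

# A hand-picked subset of the tags listed above (not the full tagset).
TAGS = [
    "adj", "adv", "det__art", "det__demo", "nounpl", "nounsg",
    "num__card", "num__ord", "pronpers", "verbinf", "verbpastsg",
]


def tags_matching(pattern, tags=TAGS):
    """Return the tags that the pattern matches in full.

    Assumption: the pattern has to cover the whole tag, so "num"
    alone selects nothing while "num.*" selects every numeral tag.
    """
    regex = re.compile(pattern)
    return [tag for tag in tags if regex.fullmatch(tag)]


numerals = tags_matching("num.*")       # all numeral tags in the subset
determiners = tags_matching("det__.*")  # all determiner tags in the subset
bare_num = tags_matching("num")         # no tag is exactly "num"

print(numerals)
print(determiners)
print(bare_num)
```

Under the same assumption, a query for any verb form would use a pattern such as verb.* and a query for any determiner would use det__.* in the tag attribute.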

Source: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/dutch-tagset.txt

Dutch text corpora in Sketch Engine

Sketch Engine offers dozens of Dutch language corpora.