A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

Latin corpora in Sketch Engine annotated by the TreeTagger tool (developed by Helmut Schmid in the TC project at the Institute for Computational Linguistics of the University of Stuttgart) use this Lamap part-of-speech tagset.

An example of a tag in the CQL concordance search box: [tag="N"] finds all nouns, e.g. nihil, cause (note: please make sure that you use straight double quotation marks).
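Queries copied from a web page or word processor often arrive with curly (typographic) quotes instead of the straight quotes requested above. The following is a minimal Python sketch that normalizes a pasted query before it goes into the search box. The helper name `straighten_quotes` is hypothetical and is not part of Sketch Engine.

```python
# Hypothetical helper (not part of Sketch Engine): replace typographic
# double quotes with the straight double quotes requested for CQL queries.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"', "\u201e": '"'})


def straighten_quotes(query: str) -> str:
    """Return the query with curly double quotes replaced by straight ones."""
    return query.translate(CURLY_TO_STRAIGHT)


# A query pasted with curly quotes around the tag value.
print(straighten_quotes("[tag=\u201cN\u201d]"))  # prints [tag="N"]
```

A query that already uses straight quotes passes through unchanged, so the helper is safe to apply to every pasted query.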


| Tag | Part of speech | Example |
|---|---|---|
| N | noun | salus |
| V | verb | sapere |
| PUN | punctuation | , |
| ADJ | adjective | Gregorii |
| C | conjunction | et |
| ADV | adverb | forsitan |
| PRE | preposition | ab |
| PRO | pronoun | te |
| P | participle | dictum |
| NUM | numeral | 3, duos |
| E | exclamation | Ma |
| NONLAT | non-Latin word | |
| SUPINE | supine (verb) | petitum |
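
The tags in the table can be combined into longer CQL queries, where each bracketed expression matches one token and adjacent expressions match adjacent tokens. Below is a minimal Python sketch that builds such queries and rejects tags that are not in the table. The names `LATIN_TAGS` and `cql_for_tags` are illustrative assumptions, not part of Sketch Engine or TreeTagger.

```python
# Tags copied from the table above; the names LATIN_TAGS and cql_for_tags
# are illustrative only and do not belong to any Sketch Engine API.
LATIN_TAGS = {
    "N": "noun",
    "V": "verb",
    "PUN": "punctuation",
    "ADJ": "adjective",
    "C": "conjunction",
    "ADV": "adverb",
    "PRE": "preposition",
    "PRO": "pronoun",
    "P": "participle",
    "NUM": "numeral",
    "E": "exclamation",
    "NONLAT": "non-Latin word",
    "SUPINE": "supine (verb)",
}


def cql_for_tags(*tags: str) -> str:
    """Build a CQL query matching one token per tag, in the given order."""
    unknown = [t for t in tags if t not in LATIN_TAGS]
    if unknown:
        raise ValueError(f"unknown tag(s): {', '.join(unknown)}")
    # Straight double quotes are used, as the search box requires.
    return " ".join(f'[tag="{t}"]' for t in tags)


print(cql_for_tags("N"))         # prints [tag="N"]
print(cql_for_tags("ADJ", "N"))  # prints [tag="ADJ"] [tag="N"]
```

Validating against the table catches typos such as `NOUN` before the query is submitted, since a tag that does not exist in the tagset would simply return no results.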

Source: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/Lamap-Tagset.pdf (the tagset used in Sketch Engine is a simplified version)