A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

This Basque part-of-speech tagset is used in annotated Basque corpora.

An example of a tag query in the CQL concordance search box: [tag="IZE"] finds all nouns, e.g. behar, herri. (Note: make sure that you use straight double quotation marks.)
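Because curly quotation marks pasted from a word processor are an easy mistake to make, the query string can be built or cleaned programmatically before it is pasted into the search box. The following is a minimal sketch only; the helper names `tag_query` and `straighten_quotes` are hypothetical and are not part of Sketch Engine.

```python
# Hypothetical helpers for preparing CQL tag queries; not a Sketch Engine API.

# Map common curly double quotation marks to the straight one CQL expects.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"', "\u201e": '"'})


def straighten_quotes(query: str) -> str:
    """Replace curly double quotation marks in a query with straight ones."""
    return query.translate(CURLY_TO_STRAIGHT)


def tag_query(tag: str) -> str:
    """Build a single-token CQL query that matches the given POS tag."""
    return f'[tag="{tag}"]'


print(tag_query("IZE"))                          # [tag="IZE"]
print(straighten_quotes("[tag=\u201cIZE\u201d]"))  # [tag="IZE"]
```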


ADB adverb
ADI verb
ADJ adjective
ADL auxiliary verb
ADT synthetic verb
AUR prefix word
BST other
DET determiner
ERL relational suffix (subordination)
HAOS multiword component
IOR pronoun
ITJ interjection
IZE noun
LAB abbreviation
LOT conjunction
PRT particle
SIG acronym
SNB symbol
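The tag list above can be kept in a small lookup table, for instance to check a tag before searching or to combine several tags in one query. This is a sketch under stated assumptions: the names `BASQUE_TAGS`, `describe` and `any_of` are hypothetical, and the combined query relies on CQL treating attribute values as regular expressions, so that `|` separates alternatives.

```python
# Tag descriptions copied from the list above.
BASQUE_TAGS = {
    "ADB": "adverb",
    "ADI": "verb",
    "ADJ": "adjective",
    "ADL": "auxiliary verb",
    "ADT": "synthetic verb",
    "AUR": "prefix word",
    "BST": "other",
    "DET": "determiner",
    "ERL": "relational suffix (subordination)",
    "HAOS": "multiword component",
    "IOR": "pronoun",
    "ITJ": "interjection",
    "IZE": "noun",
    "LAB": "abbreviation",
    "LOT": "conjunction",
    "PRT": "particle",
    "SIG": "acronym",
    "SNB": "symbol",
}


def describe(tag: str) -> str:
    """Return the description of a tag; raises KeyError for unknown tags."""
    return BASQUE_TAGS[tag]


def any_of(*tags: str) -> str:
    """Build a single-token CQL query matching any of the given tags."""
    for tag in tags:
        if tag not in BASQUE_TAGS:
            raise ValueError(f"unknown tag: {tag}")
    # Assumption: the attribute value is a regular expression, so "|" means "or".
    return '[tag="' + "|".join(tags) + '"]'


print(describe("IZE"))              # noun
print(any_of("ADI", "ADL", "ADT"))  # [tag="ADI|ADL|ADT"]
```

Here `any_of("ADI", "ADL", "ADT")` produces one query covering the three verb tags in the list (verb, auxiliary verb, synthetic verb).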

Basque corpus in Sketch Engine

Sketch Engine offers a 100-million-word Basque corpus.