A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The German RFTagger part-of-speech tagset is used in German corpora annotated with RFTagger, a tool developed by Helmut Schmid and Florian Laws. See Helmut Schmid and Florian Laws: Estimation of Conditional Probabilities with Decision Trees and an Application to Fine-Grained POS Tagging, COLING 2008, Manchester, Great Britain.

German RFTagger tagset

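German POS tags are quite complex. Each tag starts with the part of speech, followed by the values of its grammatical categories, separated by dots:

können  VFIN.Mod.3.Pl.Pres.Ind
weitere  ADJA.Comp.Nom.Pl.Fem
Informationen  N.Reg.Nom.Pl.Fem
abgerufen  VPP.Full.Psp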

To see tags, open the concordance and click View options.

Examples

This is how tags should be used in the CQL concordance search on the ADVANCED tab:

[tag="ADJ.*"] finds all adjectives, e.g. gut, möglich (make sure you use straight quotation marks)
[tag="N.Reg.Nom.Pl.Fem"] finds all regular (common) nouns in the nominative plural feminine
[lemma="Inf.*" & tag="N.*Pl.*"] finds all plural nouns starting with Inf- (use .* in place of each category for which you do not want to specify a value)
[lemma="Tür" & tag=".*Pl.*"] finds the plural forms of the lemma Tür
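
The patterns above are regular expressions. The following Python sketch illustrates why .* is needed around Pl. It assumes that a CQL pattern must match the whole attribute value and uses Python's re module as a stand-in for the CQL regular expression engine, which may differ in details. The function name matching_tokens is made up for this illustration.

```python
import re

# Tokens and tags from the example sentence earlier on this page.
tags = {
    "können": "VFIN.Mod.3.Pl.Pres.Ind",
    "weitere": "ADJA.Comp.Nom.Pl.Fem",
    "Informationen": "N.Reg.Nom.Pl.Fem",
    "abgerufen": "VPP.Full.Psp",
}

def matching_tokens(pattern: str) -> list[str]:
    """Return the tokens whose tag matches the pattern as a whole.

    Assumption: CQL compares the pattern against the full attribute value,
    which is emulated here with re.fullmatch.
    """
    regex = re.compile(pattern)
    return [token for token, tag in tags.items() if regex.fullmatch(tag)]

print(matching_tokens("ADJ.*"))    # ['weitere']
print(matching_tokens("N.*Pl.*"))  # ['Informationen']
print(matching_tokens(".*Pl.*"))   # ['können', 'weitere', 'Informationen']
print(matching_tokens("Pl"))       # [] because no tag consists of "Pl" alone
```

Note that the dots in a pattern such as N.Reg.Nom.Pl.Fem are regular expression wildcards that also happen to match the literal dots in the tag.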

For convenience, each grammatical category also has its own attribute, which is easier to remember than the full tag.

[tag="N.*" & Number="Pl"] finds all nouns in plural
[lemma="Tür" & Number="Pl"] finds the plural forms of the lemma Tür
[lemma="abrufen" & Tense="Past"] finds the verb abrufen in the past tense
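
When queries are assembled in a script rather than typed by hand, a small helper avoids the curly-quote problem mentioned above. This is only a sketch: cql_token is a hypothetical helper, not part of Sketch Engine, and it assumes that the attribute values contain no double quotes.

```python
def cql_token(**conditions: str) -> str:
    """Build a one-token CQL expression with straight double quotes.

    Hypothetical helper for illustration; values are inserted as-is,
    so they must not contain double quotes themselves.
    """
    parts = [f'{attr}="{value}"' for attr, value in conditions.items()]
    return "[" + " & ".join(parts) + "]"

print(cql_token(tag="N.*", Number="Pl"))    # [tag="N.*" & Number="Pl"]
print(cql_token(lemma="Tür", Number="Pl"))  # [lemma="Tür" & Number="Pl"]
```

The resulting string can be pasted into the CQL field on the ADVANCED tab.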

Tag Description
ADJA.* attributive adjective
  Degree Comp Pos Sup
  Case Nom Gen Dat Acc * (unspecified)
  Number Sg Pl *
  Gender Fem Masc Neut *
ADJD.* adjective with predicative or adverbial usage
  Degree Comp Pos Sup
ADV.* adverb
APPO.* postposition
  Case Nom Gen Dat Acc *
APPR.* preposition
  Case Nom Gen Dat Acc *
APPRART.* preposition with incorporated article
  Case Nom Gen Dat Acc *
  Number Sg Pl *
  Gender Fem Masc Neut *
APZR circumposition (right part)
ART.* article
  Type Def (definite) Indef (indefinite)
  Case Nom Gen Dat Acc *
  Number Sg Pl *
  Gender Fem Masc Neut *
CARD cardinal number
CONJ.* conjunction
  Type Comp (comparative) Coord (coordinating) SubFin (subordinating with finite clause) SubInf (subordinating with infinitive)
FM foreign word
ITJ interjection
N.* noun
  Type Name (proper noun) Reg (regular noun, common noun)
  Case Nom Gen Dat Acc *
  Number Sg Pl *
  Gender Fem Masc Neut *
PART.* particle
  Type Ans (answer) Deg (degree) Neg (negation) Zu (“zu” particle) Verb (separated verb particle)
PRO.* pronoun
  Type Dem (demonstrative) Indef (indefinite) Inter (interrogative) Pers (personal) Refl (reflexive) Rel (relative) Poss (possessive)
  Usage Attr (attributive) Subst (substituting)
  Person 1 2 3 * - (not applicable)
  Case Nom Gen Dat Acc *
  Number Sg Pl *
  Gender Fem Masc Neut *
PROADV.* pronominal adverb
  Type Dem (regular: dabei damit) Inter (interrogative or relative: wie wozu)
SYM.* symbol
  Type Pun (punctuation) Quot (quotation) Paren (parenthesis) Other
  Subtype Left Right Colon Comma Sent (sentential punctuation) Hyph (hyphen) Slash Aster (asterisk) Cont (continuation periods) Auth (author) XY
TRUNC.* truncated word form
  POS Adj Noun Verb -
VFIN.* finite verb
  Type Aux (auxiliary) Mod (modal) Full (full verb)
  Person 1 2 3
  Number Pl Sg
  Tense Past Pres
  Mood Ind (indicative) Subj (subjunctive)
VIMP.* imperative verb
  Type Aux (auxiliary) Mod (modal) Full (full verb)
  Person 1 2 3
  Number Pl Sg
VINF.* infinitival verb
  Type Aux (auxiliary) Mod (modal) Full (full verb)
  Subtype zu (with “zu” infix) - (no infix)
VPP.* participle verb
  Type Aux (auxiliary) Mod (modal) Full (full verb)
  Subtype Prp (present participle) Psp (past participle)

Source: http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/
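
The tag table above can also be read in the other direction, to decode a tag into named categories. The Python sketch below assumes that the values in a tag appear in the same order as the categories listed in the table, which is consistent with the example tags on this page. It covers only four tag families, and the names CATEGORIES and decode are made up for this illustration.

```python
# Category order per tag family, copied from the table above (subset only).
CATEGORIES = {
    "ADJA": ["Degree", "Case", "Number", "Gender"],
    "N": ["Type", "Case", "Number", "Gender"],
    "VFIN": ["Type", "Person", "Number", "Tense", "Mood"],
    "VPP": ["Type", "Subtype"],
}

def decode(tag: str) -> dict[str, str]:
    """Split a tag at the dots and label each value with its category.

    Assumption: values follow the category order given in the table.
    """
    pos, *values = tag.split(".")
    names = CATEGORIES.get(pos)
    if names is None or len(names) != len(values):
        raise ValueError(f"cannot decode tag: {tag}")
    return {"pos": pos, **dict(zip(names, values))}

print(decode("VFIN.Mod.3.Pl.Pres.Ind"))
# {'pos': 'VFIN', 'Type': 'Mod', 'Person': '3', 'Number': 'Pl', 'Tense': 'Pres', 'Mood': 'Ind'}
print(decode("VPP.Full.Psp"))
# {'pos': 'VPP', 'Type': 'Full', 'Subtype': 'Psp'}
```

Tags from families missing in CATEGORIES raise a ValueError instead of being decoded incorrectly.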

German text corpora in Sketch Engine

Sketch Engine offers dozens of German language corpora.
