A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The tagset used by MADA (Morphological Analysis and Disambiguation for Arabic) is available in Arabic corpora annotated with tools developed at the Center for Computational Learning Systems, Columbia University.

An example of a tag query in the CQL concordance search box: [tag="noun.*"] finds all nouns, e.g. مع
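The effect of a pattern such as noun.* can be checked offline against the MADA 3.0 tag names listed in the table on this page. The following is a minimal sketch, not Sketch Engine's own implementation: it assumes that a CQL attribute pattern must match the whole tag value, and it uses Python's `re` module as a stand-in for the CQL regular expression engine. The helper name `tags_matching` is hypothetical.

```python
import re

# MADA 3.0 POS tags, copied from the table on this page.
MADA3_TAGS = [
    "noun", "noun_num", "noun_quant", "noun_prop",
    "adj", "adj_comp", "adj_num",
    "adv", "adv_interrog", "adv_rel",
    "pron", "pron_dem", "pron_exclam", "pron_interrog", "pron_rel",
    "verb", "verb_pseudo",
    "part", "part_det", "part_focus", "part_fut", "part_interrog",
    "part_neg", "part_restrict", "part_verb", "part_voc",
    "prep", "abbrev", "punc", "conj", "conj_sub", "interj",
    "digit", "latin",
]


def tags_matching(pattern, tags=MADA3_TAGS):
    """Return the tags that the pattern matches in full (hypothetical helper).

    Assumption: whole-value matching, emulated here with re.fullmatch.
    """
    compiled = re.compile(pattern)
    return [tag for tag in tags if compiled.fullmatch(tag)]


print(tags_matching("noun.*"))       # every noun subtype
print(tags_matching("noun"))         # only the plain noun tag
print(tags_matching(".*_interrog"))  # interrogatives across categories
```

Under that assumption, [tag="noun"] would return only plain nouns, while [tag="noun.*"] also covers number words, quantifiers and proper nouns.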

| POS Definition | MADA 3.0 | MADA 2.32 | PATB Equivalent |
|---|---|---|---|
| LABEL | pos | pos | |
| Nouns | noun | N | NN / NNS |
| Number Words | noun_num | N | NN / NNS |
| | noun_quant | N | NN / NNS |
| Proper Nouns | noun_prop | PN | NNP / NNPS |
| Adjectives | adj | AJ | JJ |
| | adj_comp | AJ | JJ |
| | adj_num | AJ | JJ |
| Adverbs | adv | AV | RB |
| | adv_interrog | Q | RP |
| | adv_rel | REL | WP |
| Pronouns | pron | PRO | PRP |
| | pron_dem | D | DT |
| | pron_exclam | PRO | PRP |
| | pron_interrog | Q | RP |
| | pron_rel | REL | WP |
| Verbs | verb | V | VBN / VBP / VBD |
| | verb_pseudo | V | VBN / VBP / VBD |
| Particles | part | P | IN |
| | part_det | D | DT |
| | part_focus | P | IN |
| | part_fut | P | IN |
| | part_interrog | P | IN |
| | part_neg | NEG | RP |
| | part_restrict | P | IN |
| | part_verb | P | IN |
| | part_voc | P | IN |
| Prepositions | prep | P | IN |
| Abbreviations | abbrev | AB | NN |
| Punctuation | punc | PX | PUNC |
| Conjunctions | conj | C | CC |
| | conj_sub | C | CC |
| Interjections | interj | IJ | UH |
| Digital Numbers | digit | NUM | CD |
| Foreign / Latin | latin | F | IN |
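When comparing corpora tagged with different MADA versions, the table above can be used as a lookup. The sketch below transcribes it into a Python dictionary keyed by the MADA 3.0 tag. The names `MADA3_MAP` and `mada3_tags_for` are hypothetical and not part of MADA or Sketch Engine. The reverse lookup illustrates that the mapping is many-to-one: several MADA 3.0 tags collapse into a single MADA 2.32 tag.

```python
# MADA 3.0 tag -> (MADA 2.32 tag, PATB equivalents), transcribed from the table.
VERB_PATB = ("VBN", "VBP", "VBD")
MADA3_MAP = {
    "noun": ("N", ("NN", "NNS")),
    "noun_num": ("N", ("NN", "NNS")),
    "noun_quant": ("N", ("NN", "NNS")),
    "noun_prop": ("PN", ("NNP", "NNPS")),
    "adj": ("AJ", ("JJ",)),
    "adj_comp": ("AJ", ("JJ",)),
    "adj_num": ("AJ", ("JJ",)),
    "adv": ("AV", ("RB",)),
    "adv_interrog": ("Q", ("RP",)),
    "adv_rel": ("REL", ("WP",)),
    "pron": ("PRO", ("PRP",)),
    "pron_dem": ("D", ("DT",)),
    "pron_exclam": ("PRO", ("PRP",)),
    "pron_interrog": ("Q", ("RP",)),
    "pron_rel": ("REL", ("WP",)),
    "verb": ("V", VERB_PATB),
    "verb_pseudo": ("V", VERB_PATB),
    "part": ("P", ("IN",)),
    "part_det": ("D", ("DT",)),
    "part_focus": ("P", ("IN",)),
    "part_fut": ("P", ("IN",)),
    "part_interrog": ("P", ("IN",)),
    "part_neg": ("NEG", ("RP",)),
    "part_restrict": ("P", ("IN",)),
    "part_verb": ("P", ("IN",)),
    "part_voc": ("P", ("IN",)),
    "prep": ("P", ("IN",)),
    "abbrev": ("AB", ("NN",)),
    "punc": ("PX", ("PUNC",)),
    "conj": ("C", ("CC",)),
    "conj_sub": ("C", ("CC",)),
    "interj": ("IJ", ("UH",)),
    "digit": ("NUM", ("CD",)),
    "latin": ("F", ("IN",)),
}


def mada3_tags_for(mada232_tag):
    """List the MADA 3.0 tags that correspond to one MADA 2.32 tag."""
    return [new for new, (old, _patb) in MADA3_MAP.items() if old == mada232_tag]


print(MADA3_MAP["noun_prop"])  # MADA 2.32 tag and PATB equivalents
print(mada3_tags_for("Q"))     # MADA 3.0 tags merged under Q in MADA 2.32
```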

Source: http://www1.cs.columbia.edu/~rambow/software-downloads/CCLS-10-01.pdf

Arabic corpora in Sketch Engine

Sketch Engine offers dozens of Arabic corpora.