A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
The SWATWOL part-of-speech tagset is used in Swahili (also known as Kiswahili) corpora.
Example of a tag query in the CQL concordance search box: [tag="N"] finds all nouns, e.g. watu, mwaka. (Note: use straight double quotation marks, not curly ones.)
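Curly quotation marks often slip into a query that has been copied from a word processor or a web page. The following is a minimal Python sketch, not part of the corpus tool itself, showing how such a query could be built or repaired before pasting it into the search box. The helper names `tag_query` and `normalize_quotes` are hypothetical and chosen only for illustration.

```python
# Hypothetical helpers for illustration; they are not part of any corpus tool API.

# Map typographic (curly) double quotes to the straight quote that CQL expects.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"', "\u201e": '"'})


def normalize_quotes(query: str) -> str:
    """Replace curly double quotes in a query with straight ones."""
    return query.translate(CURLY_TO_STRAIGHT)


def tag_query(tag: str) -> str:
    """Build a CQL expression that matches tokens carrying the given tag."""
    return f'[tag="{tag}"]'


print(tag_query("N"))                            # built from a tag name
print(normalize_quotes("[tag=\u201cN\u201d]"))   # repaired after copy and paste
```

Both calls print the same query as the example above, with straight quotation marks.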
Tagset
| Tag | Description | Example |
| A-INFL | inflecting adjective | viti vizuri |
| A-UINFL | uninflecting adjective | viti haba |
| ABBR | abbreviation | n.k. |
| AD-ADJ | qualifier of an adjective | kabisa, sana |
| ADJ | adjective | zuri, haba |
| ADJ-INFL | inflecting adjective | zingine |
| ADV | adverb | tena |
| AG-PART | agentive particle | ililiwa na panya |
| AG-PART_PRON | agentive pronoun | nayo |
| AR | Arabic origin | Kitaalam |
| CC | coordinating conjunction | baba na mama |
| CLB | clause boundary | . ; ? ! ili, kwamba |
| COLON | colon | : |
| COMMA | comma | , |
| CONCORD1 | grammatical concord | kupi |
| CONJ | conjunction | ili |
| DEF-V:li | verb with no inflection based on the root “li” | walio, aliye |
| DEF-V:na | verb with no inflection based on the root “na” | kuna |
| DEF-V:ni | verb with no inflection based on the root “ni” | ni |
| DEM:hV | demonstrative pronoun of the type hV(C)V | hiki, haya, hii |
| DOLLAR-SIGN | dollar sign | $ |
| DOUBLE-QUOTE | double quote | “ |
| EMPH | emphatic form | mimi ndimi |
| ENG | English origin | Apartheid |
| EQUAL-MARK | equals sign | = |
| EXCLAM | number | 2000 |
| EXCLAMATION | exclamation mark | ! |
| GEN-CON | genitive connector | kitabu cha mtoto |
| HYPHEN | hyphen | - |
| IDIOM | idiom | punde si punde |
| IMP | imperative | twende |
| INTERROG | interrogative | lini? mbona? |
| LEFT-PARENTHESIS | left parenthesis | ( |
| LEFT-SQUAREBRACKET | left square bracket | [ |
| N | noun | kitu |
| NA-POSS | possessive particle | alikuwa na mali |
| NEG | negative | sisomi, hasomi, hatasoma |
| NUM | numeral | ishirini |
| PL1-SP | noun class 1/2 1p plural, subject prefix | tunasoma |
| PL2-SP | noun class 1/2 2p plural, subject prefix | mnasoma |
| PREP | preposition | katika |
| PREP_PRON | preposition+pronoun | naye |
| PROCENT-MARK | percent sign | % |
| PRON | pronoun | mimi, hiki |
| PROPNAME | proper name | Ali, Mombasa |
| REL | relative marker in verb | ninayesoma |
| RHET | rhetorical | Unakwenda, sio? |
| RIGHT-PARENTHESIS | right parenthesis | ) |
| RIGHT-SQUAREBRACKET | right square bracket | ] |
| SELFSTANDING | self-standing subject particle | yu, zi |
| SG1-SP | noun class 1/2 1p singular, subject prefix | ninafikiri |
| SG2-SP | noun class 1/2 2p singular, subject prefix | unasoma |
| SG3-SP | noun class 1/2 3p singular, subject prefix | anasoma |
| SINGLE-QUOTE | single quote | ‘ |
| V | verb | kusoma |
| VCAP_ | word starting with capital | Nawe |
| VFIN | finite verb | anasoma |
| VIMP | imperative verb | rudi, angalia |
| VINF | infinitive verb | kuwa, kufanya |
| Vkwisha | verb with the marker “kwisha” | kwisha |
Source: http://www.aakkl.helsinki.fi/cameel/corpus/swatags.pdf
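When working with tagged output outside the search interface, it can be handy to look up a tag's description programmatically. The following Python sketch assumes the table rows have been saved as pipe-delimited text; only three rows are reproduced here, and the function name `parse_rows` is a hypothetical helper rather than anything provided by the tagset's authors.

```python
# A few rows copied from the table above; a real script would load all of them.
ROWS = """\
| N | noun | kitu |
| ADV | adverb | tena |
| VFIN | finite verb | anasoma |
"""


def parse_rows(text: str) -> dict[str, tuple[str, str]]:
    """Turn pipe-delimited rows into {tag: (description, example)}."""
    tagset = {}
    for line in text.splitlines():
        # Drop the outer pipes, then split the row into its cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) == 3:  # skip blank or malformed lines
            tag, description, example = cells
            tagset[tag] = (description, example)
    return tagset


TAGSET = parse_rows(ROWS)
print(TAGSET["VFIN"])
```

Splitting on the pipe character is enough here because no cell in the table contains a pipe itself.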




