A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
MGNN Tagalog part of speech tagset
The MGNN Tagalog part-of-speech tagset is used in Tagalog corpora annotated by the Stanford parser using the Filipino tagger model. The Stanford parser was created by the Natural Language Processing Group at Stanford University.
The following table lists the tags of the MGNN Tagalog part-of-speech tagset.
An example of a tag query in the CQL concordance search box: [tag="NN.*"]
finds all nouns, e.g. papel, Pilipino (note: make sure to use straight double quotation marks).
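The effect of a pattern such as NN.* can be previewed outside the corpus tool. The following is a minimal Python sketch, assuming that CQL compares the regular expression against the whole tag value, which `re.fullmatch` approximates. The helper name `tags_matching` and the short `TAGS` sample are illustrative only; the tags themselves are taken from the tables on this page.

```python
import re

# A small sample of tags from this page (simple and compound).
TAGS = ["NNC", "NNP", "NNPA", "NNCA", "NNC_CCP", "JJD_NNC", "PRS", "VBTS"]


def tags_matching(pattern, tags):
    """Return the tags whose entire value matches the regular expression."""
    return [tag for tag in tags if re.fullmatch(pattern, tag)]


# Tags selected by the pattern used in [tag="NN.*"]
print(tags_matching("NN.*", TAGS))
```

Under that assumption, the pattern also selects compound tags that begin with a noun tag (NNC_CCP), but not compound tags in which the noun tag comes later (JJD_NNC).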
POS Tag | Description | Example |
NN.* | Noun (Pangngalan) | |
NNC | Common Noun (count noun) | papel, tao, pag- + verb, kapalit |
NNP | Proper Noun | Pilipino, Lasaliano |
NNPA | Proper Noun Abbreviation | Dra., Bb., G., Kgg., etc. |
NNCA | Common Noun Abbreviation | in, km, m, cm, measurements, et al, etc. |
PR.* | Pronoun (Panghalip) | |
PRS | as Subject (Palagyo)/Personal Pronouns Singular | ako, ikaw, ka, siya, ko, mo, niya, kita, nya |
PRP | Personal Pronouns (Plural) | kami, tayo, kayo, sila, nila, naming, natin, ninyo |
PRSP | Possessive Subject (Paari) | akin, iyo, kanya, amin, atin, inyo, kanila |
PRO | Pointing to an Object/Demonstrative (Paturol/Pamatlig) | ito, iyan, iyon, iri/e, niyan, niyon/noon, nito, naroon, nariyan, yaon |
PRQ | Question/Interrogative (Pananong)/Singular | sino, saan, alin, ilan, gaano, kanino, magkano |
PRQP | Question/Interrogative Plural | sinu-sino, saan-saan, alin-alin, ilan-ilan, gaa-gaano, kani-kanino, etc. |
PRL | Location (Panlunan) | dito, doon, diyan, riyan, roon, rito, nandito, etc. |
PRC | Comparison (Panulad) | ganyan, ganito |
PRF | Found (Pahimaton) | ayto, heto, hayan, ayon, yun, hayun |
PRI | Indefinite | kuwan, iba, kapwa, isa, lahat, marami, kaunti, sinuman, alinman, anuman, kalahatan, kabuuan |
DT.* | Determiner (Pantukoy) | |
POS | possessive ending | friend’s |
PP | personal pronoun | I, he, it |
PP$ | possessive pronoun | my, his |
RB | adverb | however, usually, naturally, here, good |
LM | Lexical Marker | ay |
CC.* | Conjunctions (Pang-ugnay) | |
CCT | | o, saka, ni-, maging, pero, subalit, ngunit, bagkus, kundi, imbes, kahit, halip, maliban, sa, sa pamamagitan ng, bilang, bagamat, datapwat, samantala, habang, para |
CCR | | kaya (tuloy), kaya (nga ba), kaya (ngayon), kasi, dahil sa, dahil kasi, kung dangan kasi, papaano kasi, sapagkat, kasi, dahilan sa, palibhasa |
CCB | | at saka, at gayon din, at…rin, kasama, upang, ng, nanggayundin, palibhasa, sa sandaling, basta’t |
CCA | | at, pati |
CCP | Ligatures (Pang-angkop) | na, -ng, -g |
CCU | Preposition (Pang-ukol) | laban sa, dagdag pa |
VB.* | Verb (Pandiwa) | |
VBW | Neutral/Infinitive | mag-, ma-, mang-, sana, sabi, ka- + verb, mapag- +verb, makipag- + verb, maging |
VBS | Auxiliary, Modal/Pseudo-verbs | kailangan, pwede, dapat, maari, gusto, ayaw, ibig, nais |
VBH | Existential | mayroon, meron, may |
VBN | Non-existential | wala, ala |
VBTS | Time Past (Perfective) | nahulog, kumain, pinaalis, nag-, naging |
VBTR | Time Present (Imperfective) | nahuhulog, kumakain, pinapaalis, nagiging |
VBTF | Time Future (Contemplative) | mahuhulog, kakain, papaalisin, magiging |
VBTP | Recent past | kahuhulog, kakakain, kapapaalis |
VBAF | Actor Focus | -um-, mag-, ma-, mang- |
VBOF | Object/Goal Focus | -in, -an, i- |
VBOB | Benefactive Focus | i-, ipag- |
VBOL | Locative Focus | -an, -in, pag…an |
VBOI | Instrumental Focus | ipang- |
VBRF | Referential/Measurement Focus | pinag- |
JJ.* | Adjective (Pang-uri) | |
JJD | Describing (Panlarawan) | maganda, mabait, buo, masyado, bawat |
JJC | Used for Comparison (same level) (Pahambing Magkatulad) | sing-, kasing, kapwa, pareho, magsing, magkasing, gangga, ga, tulad ng, gaya ng, kaysa sa |
JJCC | Comparison Comparative (more) (Palamang) | mas, medyo, higit, lalo, lalong |
JJCS | Comparison Superlative (most) (Pasukdol) | pinaka-, ubod, sakdal, ulo, labis, hari |
JJCN | Comparison Negation (not quite) (Di-Magkatulad) | di-gasinong, di-gaano |
JJN | Describing Number (Pamilang) | tatlong, labinlima |
RB.* | Adverb (Pang-Abay) | |
RBD | Describing “How” (Pamaraan) | mabilis na tumakbo, masayang umuwi, pa + verb, sabay, naka- + verb |
RBN | Number (Panggaano/Panukat) | nang limang libra, + apat na guhit |
RBK | Conditional (Kondisyunal) | kung, sakali, pagka, kapag, pag |
RBP | Causative (Pananhi) | dahil sa, dahil dito, kaya |
RBB | Benefactive (Benepaktibo) | para sa, para kay |
RBR | Referential (Pangkaukulan) | tungkol sa, ukol, hinggil, patungkol, ayon sa, ukol sa, hinggil sa, alinsunod sa, sabi ni, wika ni, tanong ni |
RBQ | Question (Pananong) | bakit, paano, baga, kaya, gaano |
RBT | Agree (Panang-ayon) | talaga, oo, tunay, mangyari, opo, oho, siyanga pala, sadya, maaaring, totoo |
RBF | Disagree (Pananggi) | hindi nga, hinding-hindi, walang, huwag, ewan, aywan, ayaw, malay, wag, ayoko |
RBW | Frequency (Pamanahon) | tuwing, muli, ngayon, laging, pagkatapos, noon, mamaya, parati, bihira, bago, uli, sandali, minsan, samantala, habang, kapag, buhat, mula ng, umpisa, hanggang, kahapon, kanina, bukas, araw-araw, galing |
RBM | Possibility (Pang-agam) | baka, tila, marahil, yata, siguro, wari, malamang, maaaring |
RBL | Place (Panlunan) | kina Thelma, nasa, sa + bahay, amin, ilalim, likod, itaas, harap, mula sa, kinaroroonan, tungo sa |
RBI | Enclitics (Paningit) | na, pa, rin, din, man, muna, kaya, naman, sana, yata, ba, nga, daw, raw, kasi, lang, lamang, pala, tuloy |
RBJ | Interjections (Sambitla) | hoy, aba, ay, aray, naku, ha |
RBS | Social Formula (Pormularyong Panlipunan) | Tao po!, Magandang umaga! Mano po., Salamat po., Pasensya na po., Sori po. |
CD.* | Cardinal Number (Bilang) | |
CDB | Digit, Rank, Count | 1, una, tatlo, II |
TS | Topicless (Walang Paksa) | Umuulan., Alas dos na., May tao., Ang tapang mo pala. |
FW | Foreign Words | English, Spanish, Latin |
PM.* | Punctuation (Pananda) | |
PMP | Period | “.” |
PME | Exclamation Point | “!” |
PMQ | Question Mark | “?” |
PMC | Comma | “,” |
PMSC | Semi-colon | “;” |
PMS | Symbols | “@, /, +, *, ( , ), “,’, ~, &, %, $, #, =, -, :” |
Compound tags (component tags joined by underscores)
CCB_CCP | JJCS_VBRF_CCP | PRI_CCT | RBL_JJD | VBS_CCP |
CCR_CCA | JJCS_VBTR | PRI_LM | RBL_JJD_CCP | VBTF_CCP |
CCR_CCB | JJCS_VBTR_VBOF | PRL_CCP | RBL_NNC | VBTF_JJD |
CCR_CCP | JJCS_VBTR_VBRF | PRL_LM | RBL_NNP | VBTF_VBAF |
CCR_LM | JJCS_VBTS | PRO_CCB | RBL_NNPA | VBTF_VBOB |
CCT_CCA | JJCS_VBW | PRO_CCP | RBL_NNP_NNP | VBTF_VBOF |
CCT_CCP | JJC_CCB | PRO_LM | RBL_PRL | VBTF_VBOF_CCP |
CCT_LM | JJC_CCP | PRP_CCB | RBM_CCP | VBTR_CCP |
CCU_DTP | JJC_JJD | PRP_CCP | RBM_LM | VBTR_VBAF |
CDB_CCA | JJC_PRL | PRP_LM | RBN_CCP | VBTR_VBAF_CCP |
CDB_CCP | JJD_CCA | PRQ_CCP | RBP_CCP | VBTR_VBOB |
CDB_LM | JJD_CCB | PRQ_LM | RBQ_CCB | VBTR_VBOF |
CDB_NNC | JJD_CCP | PRSP_CCP | RBQ_CCP | VBTR_VBOF_CCP |
CDB_NNC_CCP | JJD_CCT | PRS_CCB | RBQ_LM | VBTR_VBRF |
JJCC_CCP | JJD_NNC | PRS_CCP | RBR_DTP | VBTR_VBRF_CCP |
JJCC_JJD | JJD_NNP | PRS_LM | RBS_CCP | VBTS_CCA |
JJCN_CCP | JJN_CCA | RBD_CCB | RBT_CCB | VBTS_CCP |
JJCN_LM | JJN_CCB | RBD_CCP | RBT_CCP | VBTS_JJD |
JJCS_CCB | JJN_CCP | RBD_LM | RBT_CCT | VBTS_LM |
JJCS_CCP | JJN_NNC | RBF_CCP | RBT_LM | VBTS_VBAF |
JJCS_JJC | JJN_NNC_CCP | RBF_JJD | RBW_CCA | VBTS_VBOB |
JJCS_JJC_CCP | NNC_CCA | RBF_JJD_CCP | RBW_CCB | VBTS_VBOF |
JJCS_JJD | NNC_CCB | RBF_LM | RBW_CCP | VBTS_VBOF_CCP |
JJCS_JJD_CCB | NNC_CCP | RBF_RBW | RBW_DTP | VBTS_VBOL |
JJCS_JJD_CCP | NNC_LM | RBF_VBTR | RBW_LM | VBTS_VBRF |
JJCS_JJD_NNC | NNC_PMC | RBF_VBW_CCP | RBW_RBI | VBW_CCB |
JJCS_JJN | NNP_CCA | RBI_CCA | VBAF_CCP | VBW_CCP |
JJCS_JJN_CCP | NNP_CCP | RBI_CCP | VBH_CCB | VBW_CDB |
JJCS_RBF | PRC_CCB | RBI_LM | VBH_CCP | VBW_LM |
JJCS_VBAF | PRC_CCP | RBJ_CCP | VBN_CCP | |
JJCS_VBAF_CCP | PRI_CCB | RBK_LM | VBOB_CCP | |
JJCS_VBN_CCP | PRI_CCP | RBL_CCP | VBOF_CCP | |
JJCS_VBOF | PRI_CCP_NNP | RBL_CCP_NNP | VBRF_CCP |
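The compound tags listed above are built from the simple tags, joined by underscores. The following is a minimal Python sketch of how such tags could be taken apart, and how a regular expression could be built that matches every tag containing a given component. The function names `split_compound` and `component_pattern` are hypothetical helpers, and whether the resulting pattern is accepted unchanged inside a CQL `tag` attribute is an assumption that should be checked in the corpus tool.

```python
import re


def split_compound(tag):
    """Split a compound tag such as 'JJCS_VBTR_VBOF' into its component tags."""
    return tag.split("_")


def component_pattern(component):
    """Build a regex matching any tag that contains `component` as a whole part.

    The optional groups allow other components before or after it,
    always separated by an underscore.
    """
    return r"(.*_)?" + re.escape(component) + r"(_.*)?"


print(split_compound("JJCS_VBTR_VBOF"))

pattern = component_pattern("NNC")
for tag in ["NNC", "JJD_NNC", "CDB_NNC_CCP", "NNCA", "NNP_CCP"]:
    # The whole tag value must match, mirroring the assumption about CQL.
    print(tag, bool(re.fullmatch(pattern, tag)))
```

Requiring the underscore separators keeps NNC from matching inside the unrelated tag NNCA, which a plain `.*NNC.*` pattern would not do.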
Main differences to default MGNN tagset
The list of PoS tags with their descriptions and examples was taken from https://drive.google.com/file/d/0B7lapk7DR3X4cHF5M3gxM1pCdGs/view
Bibliography
The tagset is the one used in Nocon, N. and Borra, A. (2016), “SMTPOST: Using Statistical Machine Translation Approach in Filipino Part-of-Speech Tagging”, De La Salle University, Manila, Philippines (https://www.aclweb.org/anthology/Y/Y16/Y16-3010.pdf).