A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

MGNN Tagalog part of speech tagset

Tagalog corpora in Sketch Engine are annotated with the MGNN Tagalog part-of-speech tagset. This PoS tagset is used by the Stanford parser with the Filipino tagger model. The Stanford parser was created by the Natural Language Processing Group at Stanford University.

An example of a tag in the CQL concordance search box: [tag="NN.*"] finds all nouns, e.g. papel, Pilipino (note: make sure that you use straight double quotation marks).
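The way such a pattern selects tags can be sketched in Python. This is only an illustration, not part of Sketch Engine: the helper name `matching_tags` and the short `TAGS` list (copied from the table below) are assumptions made for the example. It relies on the fact that a CQL attribute value is a regular expression compared against the whole tag, which `re.fullmatch` mirrors.

```python
import re

# A small subset of tags copied from the tagset table in this document.
TAGS = ["NNC", "NNP", "NNPA", "NNCA", "PRS", "VBTS", "JJD", "RBN", "NNC_CCP", "RBL_NNC"]


def matching_tags(pattern, tags):
    """Return the tags that a CQL value pattern such as NN.* would select.

    The pattern has to match the whole tag, not just a part of it,
    so re.fullmatch is used rather than re.search.
    """
    regex = re.compile(pattern)
    return [tag for tag in tags if regex.fullmatch(tag)]


noun_tags = matching_tags("NN.*", TAGS)
print(noun_tags)
```

Because the whole tag has to match, NN.* selects NNC, NNP, NNPA, NNCA and also a compound tag that begins with a noun tag (NNC_CCP), but not RBN or RBL_NNC, where the letters NN do not come first.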

POS Tag Description Example
NN.* Noun (Pangngalan)
NNC Common Noun (count noun) papel, tao, pag- + verb, kapalit
NNP Proper Noun Pilipino, Lasaliano
NNPA Proper Noun Abbreviation Dra., Bb., G., Kgg., etc.
NNCA Common Noun Abbreviation in, km, m, cm, measurements, et al, etc.
PR.* Pronoun (Panghalip)
PRS as Subject (Palagyo)/Personal Pronouns Singular ako, ikaw, ka, siya, ko, mo, niya, kita, nya
PRP Personal Pronouns (Plural) kami, tayo, kayo, sila, nila, naming, natin, ninyo
PRSP Possessive Subject (Paari) akin, iyo, kanya, amin, atin, inyo, kanila
PRO Pointing to an Object Demonstrative/(Paturol/Pamatlig) ito, iyan, iyon, iri/e, niyan, niyon/noon, nito, naroon, nariyan, yaon
PRQ Question/Interrogative (Pananong)/Singular sino, saan, alin, ilan, gaano, kanino, magkano
PRQP Question/Interrogative Plural sinu-sino, saan-saan, alin-alin, ilan-ilan, gaa-gaano, kani-kanino, etc.
PRL Location (Panlunan) dito, doon, diyan, riyan, roon, rito, nandito, etc.
PRC Comparison (Panulad) ganyan, ganito
PRF Found (Pahimaton) ayto, heto, hayan, ayon, yun, hayun
PRI Indefinite kuwan, iba, kapwa, isa, lahat, marami, kaunti, sinuman, alinman, anuman, kalahatan, kabuuan
DT.* Determiner (Pantukoy)
POS possessive ending friend’s
PP personal pronoun I, he, it
PP$ possessive pronoun my, his
RB adverb however, usually, naturally, here, good
LM Lexical Marker ay
CC.* Conjunctions (Pang-ugnay)
CCT o, saka, ni-, maging, pero, subalit, ngunit, bagkus, kundi, imbes, kahit, halip, maliban, sa, sa pamamagitan ng, bilang, bagamat, datapwat, samantala, habang, para
CCR kaya (tuloy), kaya (nga ba), kaya (ngayon), kasi, dahil sa, dahil kasi, kung dangan kasi, papaano kasi, sapagkat, kasi, dahilan sa, palibhasa
CCB at saka, at gayon din, at…rin, kasama, upang, ng, nanggayundin, palibhasa, sa sandaling, basta’t
CCA at, pati
CCP Ligatures (Pang-angkop) na, -ng, -g
CCU Preposition (Pang-ukol) laban sa, dagdag pa
VB.* Verb (Pandiwa)
VBW Neutral/Infinitive mag-, ma-, mang-, sana, sabi, ka- + verb, mapag- +verb, makipag- + verb, maging
VBS Auxiliary, Modal/Pseudo-verbs kailangan, pwede, dapat, maari, gusto, ayaw, ibig, nais
VBH Existential mayroon, meron, may
VBN Non-existential wala, ala
VBTS Time Past (Perfective) nahulog, kumain, pinaalis, nag-, naging
VBTR Time Present (Imperfective) nahuhulog, kumakain, pinapaalis, nagiging
VBTF Time Future (Contemplative) mahuhulog, kakain, papaalisin, magiging
VBTP Recent past kahuhulog, kakakain, kapapaalis
VBAF Actor Focus -um-, mag-, ma-, mang-
VBOF Object/Goal Focus -in, -an, i-
VBOB Benefactive Focus i-, ipag-
VBOL Locative Focus -an, -in, pag…an
VBOI Instrumental Focus ipang-
VBRF Referential/Measurement Focus pinag-
JJ.* Adjective (Pang-uri)
JJD Describing (Panlarawan) maganda, mabait, buo, masyado, bawat
JJC Used for Comparison (same level) (Pahambing Magkatulad) sing-, kasing, kapwa, pareho, magsing, magkasing, gangga, ga, tulad ng, gaya ng, kaysa sa
JJCC Comparison Comparative (more) (Palamang) mas, medyo, higit, lalo, lalong
JJCS Comparison Superlative (most) (Pasukdol) pinaka-, ubod, sakdal, ulo, labis, hari
JJCN Comparison Negation (not quite) (Di-Magkatulad) di-gasinong, di-gaano
JJN Describing Number (Pamilang) tatlong, labinlima
RB.* Adverb (Pang-Abay)
RBD Describing “How” (Pamaraan) mabilis na tumakbo, masayang umuwi, pa + verb, sabay, naka- + verb
RBN Number (Panggaano/Panukat) nang limang libra, + apat na guhit
RBK Conditional (Kondisyunal) kung, sakali, pagka, kapag, pag
RBP Causative (Pananhi) dahil sa, dahil dito, kaya
RBB Benefactive (Benepaktibo) para sa, para kay
RBR Referential (Pangkaukulan) tungkol sa, ukol, hinggil, patungkol, ayon sa, ukol sa, hinggil sa, alinsunod sa, sabi ni, wika ni, tanong ni
RBQ Question (Pananong) bakit, paano, baga, kaya, gaano
RBT Agree (Panang-ayon) talaga, oo, tunay, mangyari, opo, oho, siyanga pala, sadya, maaaring, totoo
RBF Disagree (Pananggi) hindi nga, hinding-hindi, walang, huwag, ewan, aywan, ayaw, malay, wag, ayoko
RBW Frequency (Pamanahon) tuwing, muli, ngayon, laging, pagkatapos, noon, mamaya, parati, bihira, bago, uli, sandali, minsan, samantala, habang, kapag, buhat, mula ng, umpisa, hanggang, kahapon, kanina, bukas, araw-araw, galing
RBM Possibility (Pang-agam) baka, tila, marahil, yata, siguro, wari, malamang, maaaring
RBL Place (Panlunan) kina Thelma, nasa, sa + bahay, amin, ilalim, likod, itaas, harap, mula sa, kinaroroonan, tungo sa
RBI Enclitics (Paningit) na, pa, rin, din, man, muna, kaya, naman, sana, yata, ba, nga, daw, raw, kasi, lang, lamang, pala, tuloy
RBJ Interjections (Sambitla) hoy, aba, ay, aray, naku, ha
RBS Social Formula (Pormularyong Panlipunan) Tao po!, Magandang umaga! Mano po., Salamat po., Pasensya na po., Sori po.
CD.* Cardinal Number (Bilang)
CDB Digit, Rank, Count 1, una, tatlo, II
TS Topicless (Walang Paksa) Umuulan., Alas dos na., May tao., Ang tapang mo pala.
FW Foreign Words English, Spanish, Latin
PM.* Punctuation (Pananda)
PMP Period “.”
PME Exclamation Point “!”
PMQ Question Mark “?”
PMC Comma “,”
PMSC Semi-colon “;”
PMS Symbols “@, /, +, *, ( , ), “,’, ~, &, %, $, #, =, -, :”
Compound Tags __…_
CCB_CCP JJCS_VBRF_CCP PRI_CCT RBL_JJD VBS_CCP
CCR_CCA JJCS_VBTR PRI_LM RBL_JJD_CCP VBTF_CCP
CCR_CCB JJCS_VBTR_VBOF PRL_CCP RBL_NNC VBTF_JJD
CCR_CCP JJCS_VBTR_VBRF PRL_LM RBL_NNP VBTF_VBAF
CCR_LM JJCS_VBTS PRO_CCB RBL_NNPA VBTF_VBOB
CCT_CCA JJCS_VBW PRO_CCP RBL_NNP_NNP VBTF_VBOF
CCT_CCP JJC_CCB PRO_LM RBL_PRL VBTF_VBOF_CCP
CCT_LM JJC_CCP PRP_CCB RBM_CCP VBTR_CCP
CCU_DTP JJC_JJD PRP_CCP RBM_LM VBTR_VBAF
CDB_CCA JJC_PRL PRP_LM RBN_CCP VBTR_VBAF_CCP
CDB_CCP JJD_CCA PRQ_CCP RBP_CCP VBTR_VBOB
CDB_LM JJD_CCB PRQ_LM RBQ_CCB VBTR_VBOF
CDB_NNC JJD_CCP PRSP_CCP RBQ_CCP VBTR_VBOF_CCP
CDB_NNC_CCP JJD_CCT PRS_CCB RBQ_LM VBTR_VBRF
JJCC_CCP JJD_NNC PRS_CCP RBR_DTP VBTR_VBRF_CCP
JJCC_JJD JJD_NNP PRS_LM RBS_CCP VBTS_CCA
JJCN_CCP JJN_CCA RBD_CCB RBT_CCB VBTS_CCP
JJCN_LM JJN_CCB RBD_CCP RBT_CCP VBTS_JJD
JJCS_CCB JJN_CCP RBD_LM RBT_CCT VBTS_LM
JJCS_CCP JJN_NNC RBF_CCP RBT_LM VBTS_VBAF
JJCS_JJC JJN_NNC_CCP RBF_JJD RBW_CCA VBTS_VBOB
JJCS_JJC_CCP NNC_CCA RBF_JJD_CCP RBW_CCB VBTS_VBOF
JJCS_JJD NNC_CCB RBF_LM RBW_CCP VBTS_VBOF_CCP
JJCS_JJD_CCB NNC_CCP RBF_RBW RBW_DTP VBTS_VBOL
JJCS_JJD_CCP NNC_LM RBF_VBTR RBW_LM VBTS_VBRF
JJCS_JJD_NNC NNC_PMC RBF_VBW_CCP RBW_RBI VBW_CCB
JJCS_JJN NNP_CCA RBI_CCA VBAF_CCP VBW_CCP
JJCS_JJN_CCP NNP_CCP RBI_CCP VBH_CCB VBW_CDB
JJCS_RBF PRC_CCB RBI_LM VBH_CCP VBW_LM
JJCS_VBAF PRC_CCP RBJ_CCP VBN_CCP
JJCS_VBAF_CCP PRI_CCB RBK_LM VBOB_CCP
JJCS_VBN_CCP PRI_CCP RBL_CCP VBOF_CCP
JJCS_VBOF PRI_CCP_NNP RBL_CCP_NNP VBRF_CCP
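
The compound tags listed above appear to be simple tags joined by underscores (an assumption based on the list itself, e.g. JJCS_VBTR_VBOF). Under that assumption, the following Python sketch splits a compound tag into its parts and builds a regular expression that matches any tag containing a given simple tag as a whole component. The function names are hypothetical and only serve the illustration.

```python
import re


def split_compound(tag):
    """Split a compound tag such as JJCS_VBTR_VBOF into its simple tags."""
    return tag.split("_")


def component_pattern(component):
    """Build a whole-tag regular expression for tags containing `component`.

    The optional groups allow other components before and after it, while
    the underscores keep NNP from matching inside NNPA.
    """
    return "(.*_)?" + re.escape(component) + "(_.*)?"


parts = split_compound("JJCS_VBTR_VBOF")
print(parts)

pattern = component_pattern("CCP")
print(pattern)
for tag in ["CCP", "JJCS_VBRF_CCP", "PRI_CCP_NNP", "NNC_CCA"]:
    print(tag, bool(re.fullmatch(pattern, tag)))
```

A pattern built this way could be pasted into a query of the form [tag="..."] to find a simple tag both on its own and inside compound tags, provided the corpus follows the underscore convention assumed here.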

Main differences to default MGNN tagset


The list of PoS tags with their descriptions and examples was taken from https://drive.google.com/file/d/0B7lapk7DR3X4cHF5M3gxM1pCdGs/view

Bibliography

Tagset used in Nocon, N. and Borra, A. (2016), “SMTPOST: Using Statistical Machine Translation Approach in Filipino Part-of-Speech Tagging”, De La Salle University, Manila, Philippines (https://www.aclweb.org/anthology/Y/Y16/Y16-3010.pdf).

Tagalog tlTenTen corpus

Sketch Engine provides access to a multi-million Tagalog corpus of texts from the web.
