term grammar

A term grammar is a set of rules, written in CQL, that defines the lexical structures (typically noun phrases) to be included in term extraction. The rules refer to POS tags. Using a term grammar keeps the term extraction result clean, so it requires very little post-editing.

For illustration only: the term grammar for English defines terms as sequences of nouns and adjectives (noun+noun+noun, adjective+noun, adjective+adjective+noun, etc.). The term grammar for Spanish contains rules such as noun+adjective, noun+de+noun, adjective+noun+de+noun, etc. The actual rules are much more complex and allow for articles and optional tokens. They also check adjective-noun agreement in number, gender or case.
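
The idea of matching POS tag sequences can be sketched in Python. This is a simplified illustration, not the actual term grammar: real term grammars are written in CQL, and the tag names (ADJ, NOUN), the single rule "zero or more adjectives followed by one or more nouns" and the function name extract_candidates are all assumptions made for this example.

```python
import re

# Hypothetical one-letter codes for a simplified tagset (assumed for illustration).
TAG_CODES = {"ADJ": "A", "NOUN": "N"}

# Simplified English-style rule: zero or more adjectives, then one or more nouns.
PATTERN = re.compile(r"A*N+")


def extract_candidates(tagged_tokens):
    """Return word sequences whose POS tags match the simplified rule.

    tagged_tokens is a list of (word, tag) pairs.
    """
    # Encode each token as one character; "x" marks tags outside the rule.
    codes = "".join(TAG_CODES.get(tag, "x") for _, tag in tagged_tokens)
    candidates = []
    for match in PATTERN.finditer(codes):
        # One character per token, so match offsets are token indices.
        words = [word for word, _ in tagged_tokens[match.start():match.end()]]
        candidates.append(" ".join(words))
    return candidates


sentence = [
    ("the", "DET"), ("fast", "ADJ"), ("neural", "ADJ"), ("network", "NOUN"),
    ("uses", "VERB"), ("term", "NOUN"), ("extraction", "NOUN"),
]
print(extract_candidates(sentence))
```

The article and the verb are skipped because their tags are not part of the rule, which is how a grammar of this kind keeps non-term material out of the extraction result. Agreement checks and optional tokens, which the actual rules include, are left out of this sketch.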

see also

term

keyword

Best term extraction (blog)