term grammar

A term grammar is a set of rules written in CQL (Corpus Query Language) that defines the lexical structures, typically noun phrases, to be included in term extraction. The lexical structures are defined using part-of-speech (POS) tags. Using a term grammar produces a clean term extraction result that requires very little post-editing.

For illustration only: the term grammar for English defines terms as sequences of nouns and adjectives (noun+noun+noun, adjective+noun, adjective+adjective+noun, etc.). The term grammar for Spanish contains rules such as noun+adjective, noun+de+noun, adjective+noun+de+noun, etc. The actual rules are much more complex and allow for articles and optional tokens. They also check adjective-noun agreement in number, gender or case, as well as other relevant grammatical categories.
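
The idea of matching POS-tag sequences can be sketched in a few lines of Python. This is only a simplified illustration, not an actual term grammar and not CQL: the tag names (`ADJ`, `NOUN`), the `extract_candidates` function and the single pattern "any adjectives followed by one or more nouns" are all assumptions made for the example, and agreement checks are left out.

```python
import re

# Hypothetical, simplified tagset: each POS tag is mapped to one letter so that
# a tagged sentence becomes a string a regular expression can scan.
# "A" = adjective, "N" = noun, "O" = any other tag.
TAG_CODES = {"ADJ": "A", "NOUN": "N"}

# Simplified stand-in for an English-style rule:
# zero or more adjectives followed by one or more nouns.
PATTERN = re.compile(r"A*N+")


def extract_candidates(tagged_tokens):
    """Return candidate terms from a list of (word, pos_tag) pairs."""
    codes = "".join(TAG_CODES.get(tag, "O") for _, tag in tagged_tokens)
    candidates = []
    for match in PATTERN.finditer(codes):
        # One letter per token, so match offsets are token offsets.
        words = [word for word, _ in tagged_tokens[match.start():match.end()]]
        candidates.append(" ".join(words))
    return candidates


sentence = [
    ("the", "DET"), ("neural", "ADJ"), ("machine", "NOUN"),
    ("translation", "NOUN"), ("system", "NOUN"), ("uses", "VERB"),
    ("large", "ADJ"), ("parallel", "ADJ"), ("corpora", "NOUN"),
]
print(extract_candidates(sentence))
# ['neural machine translation system', 'large parallel corpora']
```

A Spanish-style rule such as noun+de+noun would need a pattern that also looks at the word form, which is one reason the real rules are considerably more complex than this sketch.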

see also

Best term extraction (blog)

word sketch grammar