term grammar

A term grammar is a set of rules, written in CQL, that define the lexical structures (typically noun phrases) to be included in term extraction. The rules identify these structures by their POS tags. Using a term grammar produces a clean term extraction result that requires very little post-editing.

For illustration only: the term grammar for English defines terms as sequences of nouns and adjectives (noun+noun+noun, adjective+noun, adjective+adjective+noun, etc.). The term grammar for Spanish contains rules such as noun+adjective, noun+de+noun, adjective+noun+de+noun, etc. The actual rules are much more complex: they can include prepositions, articles and optional words, and they can specify which words must or must not appear before or after a lexical structure for it to be considered a term. They also check adjective-noun agreement in number, gender or case, as well as other relevant grammatical categories.
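
The idea of matching POS-tag sequences can be sketched in a few lines of Python. This is only a toy model of the simplified rules above, not the CQL syntax of a real term grammar: the tag names, the one-letter codes, the two rules and the function `extract_candidates` are all assumptions made for the example, and agreement checks and context conditions are left out.

```python
import re

# Hypothetical one-letter codes for coarse POS tags (not a real tagset mapping).
POS_CODE = {"ADJ": "A", "NOUN": "N", "ADP": "P", "DET": "D", "VERB": "V"}

# Toy English rule: zero or more adjectives followed by one or more nouns.
ENGLISH_RULE = re.compile(r"A*N+")

# Toy Spanish rule: noun+adjective or noun+preposition+noun.
# Simplification: any adposition is accepted, not only "de".
SPANISH_RULE = re.compile(r"N(?:A|PN)")


def extract_candidates(tagged, rule):
    """Return the word sequences whose POS pattern matches the rule."""
    # Encode the sentence as one string of tag codes, one character per token,
    # so that match offsets map directly back to token positions.
    codes = "".join(POS_CODE.get(pos, "X") for _, pos in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in rule.finditer(codes)
    ]


english = [
    ("the", "DET"), ("statistical", "ADJ"), ("machine", "NOUN"),
    ("translation", "NOUN"), ("system", "NOUN"), ("uses", "VERB"),
    ("a", "DET"), ("large", "ADJ"), ("corpus", "NOUN"),
]
spanish = [
    ("la", "DET"), ("extracción", "NOUN"), ("de", "ADP"), ("términos", "NOUN"),
]

print(extract_candidates(english, ENGLISH_RULE))
# ['statistical machine translation system', 'large corpus']
print(extract_candidates(spanish, SPANISH_RULE))
# ['extracción de términos']
```

Encoding each tag as a single character lets an ordinary regular expression stand in for the rule language. A real term grammar expresses the same kind of pattern in CQL and adds the constraints described above.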

see also



Best term extraction (blog)

word sketch grammar