A term is a multi-word expression (a sequence of several tokens) that appears more frequently in one corpus (the focus corpus) than in another corpus (the reference corpus) and, at the same time, has the format of a term in the language. This format is defined in a term grammar, which is specific to each language. The term grammar typically focusses on identifying noun phrases.

The extracted terms are typical of the content of the corpus and can be used to identify the topic of the corpus.
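
The two conditions in the definition (matching a term grammar, and being more frequent in the focus corpus than in the reference corpus) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy corpora are already tagged by hand, the grammar "optional adjectives followed by nouns" stands in for a real language-specific term grammar, and the smoothed frequency ratio is just one possible way to compare the two corpora.

```python
from collections import Counter


def candidates(tagged):
    """Yield multi-word candidates matching a toy term grammar.

    Hypothetical grammar: zero or more adjectives followed by one or
    more nouns, at least two tokens in total.
    """
    n = len(tagged)
    for start in range(n):
        for end in range(start + 2, n + 1):
            span = tagged[start:end]
            tags = [tag for _, tag in span]
            # Skip the leading adjectives, then require nouns only.
            i = 0
            while i < len(tags) and tags[i] == "ADJ":
                i += 1
            if i < len(tags) and all(tag == "NOUN" for tag in tags[i:]):
                yield " ".join(word.lower() for word, _ in span)


def extract_terms(focus, reference, smoothing=1.0):
    """Rank focus-corpus candidates by a smoothed relative-frequency ratio."""
    focus_counts = Counter(candidates(focus))
    reference_counts = Counter(candidates(reference))
    scores = {}
    for term, count in focus_counts.items():
        # Frequencies per thousand tokens, so corpus size does not matter.
        focus_rel = count / len(focus) * 1000
        reference_rel = reference_counts[term] / len(reference) * 1000
        # Smoothing avoids division by zero for unseen candidates.
        scores[term] = (focus_rel + smoothing) / (reference_rel + smoothing)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hand-tagged toy corpora (illustrative data, not real corpus output).
focus = [
    ("the", "DET"), ("neural", "ADJ"), ("network", "NOUN"), ("learns", "VERB"),
    ("a", "DET"), ("neural", "ADJ"), ("network", "NOUN"), ("beats", "VERB"),
    ("the", "DET"), ("red", "ADJ"), ("car", "NOUN"),
]
reference = [
    ("the", "DET"), ("red", "ADJ"), ("car", "NOUN"), ("passed", "VERB"),
    ("a", "DET"), ("red", "ADJ"), ("car", "NOUN"), ("today", "ADV"),
]

terms = extract_terms(focus, reference)
for term, score in terms:
    print(f"{term}\t{score:.2f}")
```

In this sketch "neural network" ranks first because it matches the grammar and occurs only in the focus corpus, while "red car" also matches the grammar but scores below 1 because it is relatively more frequent in the reference corpus.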

See also: term extraction