reference corpus

A reference corpus is the corpus to which the focus corpus is compared. It is used in keyword extraction and term extraction to identify words and terms that are typical of the focus corpus.
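The comparison can be illustrated with a minimal sketch. It assumes a simple smoothed ratio of normalized frequencies (per million tokens) as the score; this entry does not specify the scoring method the tool uses, so the formula, the function names `keyness_scores` and `top_keywords`, and the `smoothing` parameter are all illustrative assumptions.

```python
from collections import Counter


def keyness_scores(focus_tokens, reference_tokens, smoothing=1.0):
    """Score each focus-corpus word against the reference corpus.

    Assumed formula (illustrative only):
        (focus frequency per million + smoothing)
        / (reference frequency per million + smoothing)
    A score above 1 means the word is relatively more frequent in the
    focus corpus than in the reference corpus.
    """
    if not focus_tokens or not reference_tokens:
        raise ValueError("both corpora must contain at least one token")

    focus_counts = Counter(focus_tokens)
    reference_counts = Counter(reference_tokens)
    focus_size = len(focus_tokens)
    reference_size = len(reference_tokens)

    scores = {}
    for word, count in focus_counts.items():
        # Normalize to frequency per million so corpora of different
        # sizes can be compared.
        focus_fpm = count / focus_size * 1_000_000
        # Counter returns 0 for words absent from the reference corpus.
        reference_fpm = reference_counts[word] / reference_size * 1_000_000
        scores[word] = (focus_fpm + smoothing) / (reference_fpm + smoothing)
    return scores


def top_keywords(focus_tokens, reference_tokens, n=5, smoothing=1.0):
    """Return the n highest-scoring words; ties are broken alphabetically."""
    scores = keyness_scores(focus_tokens, reference_tokens, smoothing)
    return sorted(scores, key=lambda word: (-scores[word], word))[:n]


if __name__ == "__main__":
    focus = "the corpus tool compares the focus corpus".split()
    reference = "the cat sat on the mat".split()
    print(top_keywords(focus, reference, n=3))
```

In this sketch, a word that is common in both corpora, such as "the", scores close to 1, while a word that is frequent in the focus corpus and absent from the reference corpus scores highest. Changing the reference corpus changes which words stand out.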

When using the Keywords & Terms tool, a reference corpus is preselected, but the user can choose a different corpus as the reference corpus. The reference corpus can be, but does not have to be, the same for keywords and for terms.

A reference corpus can also be used with n-grams to identify those that are typical of the focus corpus in comparison with the reference corpus.

In the word sketch, a reference corpus is used to identify key collocations when the AS A LIST option is selected.


see also



term extraction (definition)

term extraction (quick start guide)