Keyword and term extraction

Keywords and terms are words and phrases typical of your corpus: they appear in it more frequently than they would in general language. They can be used to define or understand the main topic of the corpus. Sketch Engine combines statistics with linguistic criteria to extract keywords and terms.

Keywords

Keywords are single words (tokens) which appear in the focus corpus more frequently than they would in general language. The general language is represented by the reference corpus.
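
The comparison between the focus corpus and the reference corpus can be illustrated with a minimal Python sketch. It is not Sketch Engine's implementation. It assumes that keyness is scored as a smoothed ratio of frequencies per million tokens, and the function names and the smoothing parameter are hypothetical choices made for this illustration.

```python
from collections import Counter


def keyness_scores(focus_tokens, reference_tokens, smoothing=1.0):
    """Score each focus-corpus word by a smoothed ratio of relative frequencies.

    Assumption for illustration only: score = (focus_fpm + smoothing) /
    (reference_fpm + smoothing), where fpm is frequency per million tokens.
    """
    if not focus_tokens or not reference_tokens:
        raise ValueError("both corpora must contain at least one token")

    focus_counts = Counter(focus_tokens)
    reference_counts = Counter(reference_tokens)

    scores = {}
    for word, count in focus_counts.items():
        # Normalize to frequency per million so corpora of different
        # sizes can be compared.
        focus_fpm = count / len(focus_tokens) * 1_000_000
        reference_fpm = reference_counts[word] / len(reference_tokens) * 1_000_000
        # The smoothing constant avoids division by zero for words
        # that never occur in the reference corpus.
        scores[word] = (focus_fpm + smoothing) / (reference_fpm + smoothing)
    return scores


def top_keywords(focus_tokens, reference_tokens, n=10, smoothing=1.0):
    """Return the n highest-scoring words, ties broken alphabetically."""
    scores = keyness_scores(focus_tokens, reference_tokens, smoothing)
    return sorted(scores, key=lambda word: (-scores[word], word))[:n]


if __name__ == "__main__":
    focus = "the neural network trains the neural model".split()
    reference = "the cat sat on the mat and the dog ran".split()
    print(top_keywords(focus, reference, n=3))
```

A word that is about equally frequent in both corpora scores close to 1, while a word that is frequent in the focus corpus and rare or absent in the reference corpus scores much higher.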

Terms

Terms are multi-word units (phrases) that fulfil two conditions:
(1) they appear in the focus corpus more frequently than they would in the general language (or in a reference corpus)
(2) they have a structure allowed for terms in the language (set in a term grammar)
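
The two conditions can be sketched in Python as a filter followed by a frequency comparison. This is only an illustration under stated assumptions: the one-letter tags (A for adjective, N for noun, O for other), the pattern standing in for a term grammar, the scoring formula, and all function names are hypothetical and do not reproduce Sketch Engine's term grammars or scoring.

```python
import re
from collections import Counter

# Hypothetical term grammar: one or more adjectives or nouns followed by
# a head noun, so every match is at least two words long.
TERM_PATTERN = re.compile(r"[AN]+N")


def candidate_terms(tagged_tokens, max_len=4):
    """Count multi-word spans whose tag sequence matches TERM_PATTERN.

    tagged_tokens is a list of (word, tag) pairs with one-letter tags.
    """
    tags = "".join(tag for _, tag in tagged_tokens)
    counts = Counter()
    for start in range(len(tagged_tokens)):
        last_end = min(start + max_len, len(tagged_tokens))
        for end in range(start + 2, last_end + 1):
            # Condition (2): the structure must be allowed by the grammar.
            if TERM_PATTERN.fullmatch(tags[start:end]):
                phrase = " ".join(word for word, _ in tagged_tokens[start:end])
                counts[phrase] += 1
    return counts


def extract_terms(focus_tagged, reference_tagged, threshold=1.0, smoothing=1.0):
    """Keep candidates that are relatively more frequent in the focus corpus."""
    if not focus_tagged or not reference_tagged:
        raise ValueError("both corpora must contain at least one token")

    focus_counts = candidate_terms(focus_tagged)
    reference_counts = candidate_terms(reference_tagged)

    terms = {}
    for phrase, count in focus_counts.items():
        focus_fpm = count / len(focus_tagged) * 1_000_000
        reference_fpm = reference_counts[phrase] / len(reference_tagged) * 1_000_000
        score = (focus_fpm + smoothing) / (reference_fpm + smoothing)
        # Condition (1): more frequent in the focus corpus than in the reference.
        if score > threshold:
            terms[phrase] = score
    return terms
```

For example, given the tagged tokens deep/A neural/A network/N, both "deep neural network" and "neural network" match the pattern, whereas "deep neural" does not because it does not end in a noun.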

How to use keywords and terms

Term extraction generally only makes sense with user corpora. If you do not have one, Sketch Engine can quickly create one for you.

Log in and select a corpus. On the dashboard, click KEYWORDS & TERMS or use the same button on the main menu. The procedure will start automatically. The extraction can take several minutes for very large corpora.

(1) go to Keywords

(2) local menu to display:

  • keyword in context (concordance)
  • keyword in the context of the reference corpus (concordance)

(3) local menu to display:

  • change the criteria of the extraction (for expert users only)
  • download the results
  • change how you view the results and what is displayed
  • view extraction criteria
  • add this result to your favourites for easy access next time

Try OneClick Terms, a simple term extraction interface powered by the sophisticated Sketch Engine technology.