Keyword and term extraction

Keywords and terms are words and phrases that are typical of your corpus because they appear in it more frequently than they would in general language. They can be used to define or understand the main topic of the corpus. Sketch Engine combines statistics with linguistic criteria to extract keywords and terms.

Keywords

Keywords are single words (tokens) which appear in the focus corpus more frequently than they would in general language. The general language is represented by the reference corpus.
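
The comparison between a focus corpus and a reference corpus can be illustrated with a small Python sketch. It is not the exact scoring used by Sketch Engine, which this page does not specify. It assumes a simple ratio of frequencies per million tokens, with a smoothing constant so that words absent from the reference corpus do not cause a division by zero. The function names, the smoothing value and the toy corpora are hypothetical.

```python
from collections import Counter


def per_million(count, corpus_size):
    """Normalise a raw count to a frequency per million tokens."""
    return count * 1_000_000 / corpus_size


def keyness(focus_tokens, reference_tokens, smoothing=1.0):
    """Score each focus-corpus word by how much more frequent it is
    in the focus corpus than in the reference corpus.

    Assumed formula (illustrative only):
        (focus_fpm + smoothing) / (reference_fpm + smoothing)
    """
    focus_counts = Counter(focus_tokens)
    ref_counts = Counter(reference_tokens)
    scores = {}
    for word, count in focus_counts.items():
        focus_fpm = per_million(count, len(focus_tokens))
        ref_fpm = per_million(ref_counts.get(word, 0), len(reference_tokens))
        scores[word] = (focus_fpm + smoothing) / (ref_fpm + smoothing)
    return scores


# Toy corpora, already tokenised (hypothetical data).
focus = "the enzyme binds the substrate and the enzyme releases the product".split()
reference = "the cat sat on the mat and the dog sat on the rug".split()

scores = keyness(focus, reference)
top = sorted(scores, key=scores.get, reverse=True)[:3]
print(top)
```

In this toy example "enzyme" scores far higher than "the": both are frequent in the focus corpus, but only "the" is also frequent in the reference corpus.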

Terms

Terms are multi-word units (phrases) that fulfil two conditions:
(1) they appear in the focus corpus more frequently than they would in the general language (or in a reference corpus)
(2) they have a structure allowed for terms in the language (set in a term grammar)
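
The two conditions can be sketched in Python as follows. This is a minimal illustration, not the term grammar or scoring that Sketch Engine actually uses. It assumes tokens are already part-of-speech tagged with ADJ and NOUN tags, and it uses a hypothetical one-rule grammar (any number of adjectives followed by one or more nouns, at least two words in total). Condition (1) is approximated with a smoothed ratio of frequencies per million tokens. All names and data are made up for the example.

```python
import re
from collections import Counter

# Hypothetical one-rule term grammar: adjectives, then at least one noun.
TAG_CODES = {"ADJ": "A", "NOUN": "N"}
TERM_PATTERN = re.compile(r"A*N+")


def candidate_terms(tagged_tokens, min_len=2):
    """Condition (2): count the longest phrases whose tag sequence
    matches the term grammar and that contain at least min_len words."""
    codes = "".join(TAG_CODES.get(pos, "x") for _, pos in tagged_tokens)
    counts = Counter()
    for match in TERM_PATTERN.finditer(codes):
        start, end = match.span()
        if end - start >= min_len:
            words = (word.lower() for word, _ in tagged_tokens[start:end])
            counts[" ".join(words)] += 1
    return counts


def term_scores(focus_tagged, reference_tagged, smoothing=1.0):
    """Condition (1): compare per-million frequencies of each candidate
    in the focus and reference corpora (assumed formula)."""
    focus_counts = candidate_terms(focus_tagged)
    ref_counts = candidate_terms(reference_tagged)
    scores = {}
    for term, count in focus_counts.items():
        focus_fpm = count * 1_000_000 / len(focus_tagged)
        ref_fpm = ref_counts.get(term, 0) * 1_000_000 / len(reference_tagged)
        scores[term] = (focus_fpm + smoothing) / (ref_fpm + smoothing)
    return scores


focus_tagged = [
    ("Neural", "ADJ"), ("networks", "NOUN"), ("learn", "VERB"),
    ("deep", "ADJ"), ("representations", "NOUN"), ("from", "ADP"),
    ("training", "NOUN"), ("data", "NOUN"), (".", "PUNCT"),
    ("Neural", "ADJ"), ("networks", "NOUN"), ("need", "VERB"),
    ("data", "NOUN"), (".", "PUNCT"),
]
reference_tagged = [
    ("The", "DET"), ("training", "NOUN"), ("data", "NOUN"),
    ("was", "AUX"), ("old", "ADJ"), (".", "PUNCT"),
]

counts = candidate_terms(focus_tagged)
scores = term_scores(focus_tagged, reference_tagged)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Single nouns such as "data" are skipped because terms are multi-word units, and "training data" scores low because it is also common in the toy reference corpus.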

How to use keywords and terms

Term extraction generally only makes sense with user corpora. If you do not have one, Sketch Engine can quickly create one for you.

Log in and select a corpus. On the dashboard, click KEYWORDS & TERMS or use the same button on the main menu (1). The procedure will start automatically. The extraction can take several minutes for very large corpora.

Use the context menu (2) to view the keyword or term in context in both focus and reference corpora.

Use the toolbar (3) to:

  • change the criteria of the extraction (for expert users only)
  • download the results
  • change how you view the results and what is displayed
  • view extraction criteria
  • add this result to your favourites for easy access next time

Try OneClick Terms, a simple term extraction interface powered by the sophisticated Sketch Engine technology.