Compare and contrast words visually

Word Sketch Difference compares and contrasts two words by analysing their collocations and displaying the collocates in categories based on grammatical relations.

Why use the Word Sketch Difference?

The Word Sketch Difference is a real timesaver: it removes the need to read thousands of concordance lines to draw conclusions. All the information is available on a single screen.

In the screenshot below, Sketch Engine assigned green to clever and red to intelligent. Green collocates are more closely related to clever, red collocates to intelligent. A stronger colour indicates a stronger collocation: for example, clever trick is more usual than intelligent trick, whereas intelligent robot is more natural than clever robot. White means that the collocate behaves similarly with both words. In general, the colour of a collocate is determined by the difference between its scores with the two compared words (here, clever and intelligent).
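The colouring rule can be pictured with a short Python sketch. It is an illustration only: the scores, the width of the neutral band and the function name colour_for are assumptions invented for this example, not Sketch Engine's actual scale or implementation.

```python
def colour_for(score_first, score_second, neutral_band=1.0):
    """Return a colour label from the difference between two collocation scores.

    score_first  -- score of the collocate with the first word (e.g. clever)
    score_second -- score of the collocate with the second word (e.g. intelligent)
    neutral_band -- hypothetical width of the "similar" zone around zero
    """
    diff = score_first - score_second
    if abs(diff) <= neutral_band:
        return "white"  # the collocate behaves similarly with both words
    return "green" if diff > 0 else "red"


# Made-up scores, for illustration only: (score with clever, score with intelligent)
scores = {"trick": (8.2, 3.1), "robot": (2.0, 7.5), "person": (6.0, 6.4)}
colours = {word: colour_for(first, second) for word, (first, second) in scores.items()}
print(colours)
```

With these invented numbers, trick comes out green, robot red and person white. A real display would also vary the strength of the colour with the size of the difference.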


How to generate a Word Sketch Difference?

To generate a Word Sketch Difference, open the Word Sketch Difference screen from the left menu (1):

Compare collocations of two words based on text corpus data

(2) type the first lemma

(3) select the part of speech

Sketch diff by

lemma (4) type the second lemma

subcorpus (5) use this option to compare the behaviour of the same word in two different subcorpora, e.g. written versus spoken language, or academic versus general language. Use the links to create new subcorpora. To use this option, type only one lemma in (2) and select the two subcorpora to compare.

word form (6) use this option to compare two word forms of the same lemma; first type the lemma in (2) and then the two word forms

(7) click Show diff to display the results

Word Sketch Difference in detail

The result screen looks like this:

Word sketch difference result screen

(1) the header shows the name of the corpus and the frequency of each lemma

(2) the colour bar shows how the scores translate into colour

(3) name of the grammatical category; if the name is not clear, click one of the frequency counts (4) or (5) to see examples

(4) frequency count for the combination with the first lemma, i.e. funny and/or clever

(5) frequency count for the combination with the second lemma, i.e. funny and/or intelligent

(6) [used for development purposes only]

(7) [used for development purposes only]

The left menu of the Word Sketch Difference result screen gives these options:

Change options

opens the advanced settings dialogue (described on this page) to change options

The advanced options are used to change the way the results are presented. The complete set of advanced options looks like this:

Compare collocations with advanced options

all in one block – the result screen will be displayed as in the screenshot above with red, white and green words for each grammatical relation in the same column

common/exclusive blocks – the result screen will first display the white words in categories, followed by the green words in categories, followed by the red words in categories

(2) sets the minimum frequency for a collocation to be included in the Word Sketch Difference; leave it set to auto to let Sketch Engine decide automatically

(3) sets the number of items when collocates are displayed in one block (a bigger number means longer lists with more items)

(4) sets the number of items when collocates are displayed in exclusive blocks (a bigger number means longer lists with more items)
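The effect of these options can be sketched in Python. This is a rough illustration under stated assumptions, not Sketch Engine's implementation: the Collocate record, the function names, the ranking by the higher score, the neutral band of 1.0 and all the numbers are hypothetical.

```python
from collections import namedtuple

# Hypothetical record for one collocate: frequency and score with each lemma
Collocate = namedtuple("Collocate", "word freq1 freq2 score1 score2")


def filter_and_trim(collocates, min_freq, max_items):
    """Drop collocates below the minimum frequency, then cap the list length."""
    frequent = [c for c in collocates if c.freq1 + c.freq2 >= min_freq]
    # Assumption: rank by the higher of the two scores
    frequent.sort(key=lambda c: max(c.score1, c.score2), reverse=True)
    return frequent[:max_items]


def split_into_blocks(collocates, neutral_band=1.0):
    """Group collocates as in the common/exclusive layout: white, green, red."""
    blocks = {"white": [], "green": [], "red": []}
    for c in collocates:
        diff = c.score1 - c.score2
        if abs(diff) <= neutral_band:
            blocks["white"].append(c.word)
        elif diff > 0:
            blocks["green"].append(c.word)
        else:
            blocks["red"].append(c.word)
    return blocks


# Made-up data for clever (first lemma) and intelligent (second lemma)
data = [
    Collocate("trick", 120, 4, 8.2, 3.1),
    Collocate("robot", 6, 90, 2.0, 7.5),
    Collocate("person", 300, 280, 6.0, 6.4),
    Collocate("ploy", 1, 0, 9.0, 0.0),
]
shown = filter_and_trim(data, min_freq=5, max_items=3)
blocks = split_into_blocks(shown)
print(blocks)
```

Here ploy is dropped because its total frequency is below the minimum. The all in one block layout would list the three remaining collocates together, while the common/exclusive layout lists the groups one after another.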

The statistics used in Sketch Engine to calculate word sketches are described in this document.

Detailed Sketch Engine manual

THOMAS, James Edward (2015). Discovering English with Sketch Engine (DESkE), chapter 9 Word Sketches, pp. 161–176.

Work on word sketch

Semantic Word Sketches (presentation). Diana McCarthy, Adam Kilgarriff, Miloš Jakubíček and Siva Reddy (2015). In Corpus Linguistics (CL2015).

Finding Multiwords of More Than Two Words. Adam Kilgarriff, Pavel Rychlý, Vojtěch Kovář and Vít Baisa (2012). In Proceedings of the 15th EURALEX International Congress, Norway, pp. 693–700.

A Quantitative Evaluation of Word Sketches. Adam Kilgarriff, Vojtěch Kovář, Simon Krek, Irena Srdanovic and Carole Tiberius (2010). In Proceedings of the 14th EURALEX International Congress. The Netherlands, pp. 372–379.

Towards disambiguation of word sketches. Vít Baisa (2010). In Text, Speech and Dialogue. Germany, Berlin: Springer-Verlag, pp. 37–42.

Word sketch for individual languages

The articles relating to individual languages can be found in the Bibliography section (e.g. Arabic, Chinese, Polish, Japanese, etc).