Synonyms for all words in a language
A thesaurus (plural thesauri, pronounced [-rai]) is a type of dictionary that lists synonyms or words from the same semantic category, e.g. animals or furniture. For example:
- beautiful – wonderful, stunning, amazing, good-looking…
- mango – grapefruit, pineapple, banana, papaya, melon…
Thesauri are useful to both native speakers and language learners who need a better-fitting word but cannot think of one. In addition, text search software can use thesaurus data to let the user automatically include similar words in a search.
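As a rough illustration of the search use case, the sketch below expands a query with similar words before it is passed to a search engine. The `THESAURUS` dictionary and the `expand_query` function are hypothetical names invented for this example, and the entries are illustrative only; they are not output from any real tool.

```python
# Hypothetical thesaurus data (illustrative entries only).
THESAURUS = {
    "beautiful": ["wonderful", "stunning", "amazing"],
}


def expand_query(query, thesaurus):
    """Return the query terms plus the similar words listed for each term."""
    expanded = []
    for term in query.lower().split():
        # Keep the original term first, then append its similar words.
        for word in [term, *thesaurus.get(term, [])]:
            if word not in expanded:  # avoid duplicates, preserve order
                expanded.append(word)
    return expanded


print(expand_query("beautiful garden", THESAURUS))
```

A search for "beautiful garden" would then also match documents that mention a stunning or amazing garden. Terms without a thesaurus entry are passed through unchanged.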
Automatic synonym detection draws on the theory of distributional semantics, which says, in a nutshell, that words appearing in the same contexts are similar in meaning. Sketch Engine's word sketch feature already identifies the typical word combinations, i.e. collocations, and these play a key role in computing the automatically generated thesaurus.
When the user requests a thesaurus for the word beautiful, Sketch Engine first generates a word sketch for that word.
Then, a word sketch is generated for every other adjective in the corpus, for example for amazing. The word sketches are then compared: adjectives that share the most collocations with beautiful are taken to be the most similar to it.
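The idea of comparing word sketches can be sketched in a few lines of code. This is a minimal sketch under stated assumptions, not Sketch Engine's implementation: the word sketches are invented toy data, each reduced to a set of (relation, collocate) pairs, and plain Jaccard overlap stands in for whatever similarity score Sketch Engine actually computes. The names `WORD_SKETCHES`, `jaccard` and `thesaurus` are hypothetical.

```python
# Toy word sketches: each word maps to the set of its (relation, collocate)
# pairs. The data is invented for illustration only.
WORD_SKETCHES = {
    "beautiful": {("modifies", "woman"), ("modifies", "garden"),
                  ("modifies", "view"), ("and/or", "elegant")},
    "amazing": {("modifies", "view"), ("modifies", "garden"),
                ("modifies", "story")},
    "stunning": {("modifies", "view"), ("modifies", "woman"),
                 ("modifies", "garden"), ("and/or", "elegant"),
                 ("modifies", "victory")},
    "fiscal": {("modifies", "policy"), ("modifies", "year")},
}


def jaccard(a, b):
    """Share of collocations two words have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def thesaurus(headword, sketches, top_n=10):
    """Rank all other words by how much their sketch overlaps the headword's."""
    target = sketches[headword]
    scored = [
        (other, jaccard(target, collocations))
        for other, collocations in sketches.items()
        if other != headword
    ]
    # Drop words with no shared collocations, then sort by descending score
    # (ties broken alphabetically).
    scored = [(word, score) for word, score in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


for word, score in thesaurus("beautiful", WORD_SKETCHES):
    print(f"{word}\t{score:.2f}")
```

With this toy data, stunning ranks above amazing because it shares more collocations with beautiful, while fiscal shares none and is left out. A larger corpus yields richer word sketches and therefore more evidence for each comparison.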
Thesaurus of beautiful based on the 20-billion-word English corpus enTenTen: both columns show good-quality results.
Thesaurus of beautiful based on the 96-million-word British National Corpus: less than one column of good-quality results.