• deduplication

    a process in which repeated identical texts are removed so that only the first occurrence of each duplicated text is kept. Deduplication can be carried out at various levels, e.g. at the document level. In that case, when two documents are identical, one whole document is removed.
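
As an illustration of document-level deduplication, the following minimal sketch keeps the first occurrence of each text and drops later exact copies. The function name `deduplicate` is hypothetical, and exact string equality is an assumption made for the example. It is not a description of how any particular corpus tool detects duplicates.

```python
def deduplicate(documents):
    """Return the documents with later exact duplicates removed.

    Assumption for this sketch: two documents count as duplicates
    only if their text is exactly identical.
    """
    seen = set()   # texts encountered so far
    kept = []      # first occurrences, in original order
    for doc in documents:
        if doc not in seen:
            seen.add(doc)
            kept.append(doc)
    return kept


if __name__ == "__main__":
    docs = ["first text", "second text", "first text", "third text"]
    print(deduplicate(docs))
```

The same idea applies at other levels (for example paragraphs or sentences) by passing a list of those units instead of whole documents.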
  • disambiguation

    a process of identifying which meaning of a word (its lemma and part of speech) applies when the word has more than one possible meaning. The result of this process is a single meaning assigned to each word.
  • distributional thesaurus [ feature ]

    an automatically produced thesaurus which identifies words that occur in contexts similar to those of the target word. It draws on the hypothesis of distributional semantics. The automatically produced thesaurus is available for each word in the corpus. The distributional thesaurus in Sketch Engine is available for every language and corpus that supports word sketches. Refer to the user manual to learn how to generate the thesaurus.
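
The idea of "words that occur in similar contexts" can be illustrated with a toy sketch. It is not Sketch Engine's method: it assumes that a context is simply a neighbouring token at a given offset and that similarity is the Jaccard overlap of two words' context sets. The names `context_sets` and `thesaurus` and the `window` parameter are hypothetical.

```python
from collections import defaultdict


def context_sets(sentences, window=1):
    """Map each word to the set of (offset, neighbour) pairs seen around it."""
    contexts = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    contexts[word].add((j - i, tokens[j]))
    return contexts


def thesaurus(target, sentences, window=1):
    """Rank other words by Jaccard overlap of their contexts with the target's."""
    contexts = context_sets(sentences, window)
    target_ctx = contexts.get(target, set())
    scores = []
    for word, ctx in contexts.items():
        if word == target:
            continue
        union = target_ctx | ctx
        score = len(target_ctx & ctx) / len(union) if union else 0.0
        if score > 0:
            scores.append((word, score))
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    corpus = [
        "the cat sleeps",
        "the dog sleeps",
        "the cat eats",
        "the dog eats",
        "a car stops",
    ]
    print(thesaurus("cat", corpus))
```

In this toy corpus "cat" and "dog" share all of their contexts, so "dog" is the only word listed for "cat", while "car" shares none and is left out.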