• web mining [ text-analysis ]

    Web mining is an application of data mining that extracts information from texts. It focuses on gaining information and metadata from the web. For this task, Sketch Engine uses the fully automated tool WebBootCaT, which creates corpora from the web and also stores metadata of the processed websites. Read about other text analysis tools.
  • word form [ attribute ]

    A word form is one of the forms that a word can take, e.g. the word go can take these word forms: go, went, gone, goes, going. Searching for the word form going will not find any other forms of the word. The search is case sensitive: apple and Apple are two different word forms.
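
    The exact-match behaviour described above can be illustrated with a minimal sketch. It is not how Sketch Engine implements search; the function name `find_word_form` and the sample tokens are assumptions made up for this illustration.

```python
def find_word_form(tokens, word_form):
    """Return positions of tokens that are exactly equal to word_form.

    Hypothetical helper: the comparison is a plain string equality test,
    so it is case sensitive and never matches other forms of the word.
    """
    return [i for i, token in enumerate(tokens) if token == word_form]


# Made-up sample text, already split into tokens.
tokens = ["Apple", "is", "going", "to", "go", "where", "apple", "went"]

going_hits = find_word_form(tokens, "going")        # only "going", not "go" or "went"
lower_apple_hits = find_word_form(tokens, "apple")  # only the lowercase token
upper_apple_hits = find_word_form(tokens, "Apple")  # only the capitalised token
```

    Finding all forms of go at once would require searching by a different attribute, such as the lemma, rather than by word form.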
  • word list

    A word list is a generic name for various types of lists, such as lists of words, lemmas, POS tags or other attributes, together with their frequencies (hit counts, document counts or others).
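
    As a rough sketch of the idea, the snippet below builds a small word list with two kinds of frequency: hit counts (total occurrences) and document counts (number of documents containing the item). The function name `build_word_list` and the sample documents are assumptions for illustration only, not Sketch Engine's implementation.

```python
from collections import Counter


def build_word_list(documents):
    """Return {item: (hit_count, document_count)} for tokenized documents.

    Hypothetical helper: `documents` is a list of token lists.
    """
    hits = Counter()  # total occurrences of each item
    docs = Counter()  # number of documents each item occurs in
    for tokens in documents:
        hits.update(tokens)
        docs.update(set(tokens))  # count each item once per document
    return {item: (hits[item], docs[item]) for item in hits}


# Made-up sample corpus of two short documents.
documents = [
    ["go", "went", "go"],
    ["going", "go"],
]

word_list = build_word_list(documents)

# Print the list sorted by hit count (descending), then alphabetically.
for item, (hit_count, doc_count) in sorted(
    word_list.items(), key=lambda kv: (-kv[1][0], kv[0])
):
    print(item, hit_count, doc_count)
```

    The same approach applies to lemmas or POS tags: only the attribute stored in each token changes, not the counting.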
  • word sketch

    A word sketch is a one-page, automatic, corpus-derived summary of a word’s grammatical and collocational behaviour.
  • Word Sketch grammar

    Word Sketch grammar (WSG) is a set of rules defining the grammatical relations (= columns/categories) in a Word Sketch. WSG is language dependent: the same WSG cannot be shared across languages. Different corpora in the same language can use the same WSG or different ones. Users can write their own WSG to match their specific needs. Corpora in unsupported languages can make use of a universal WSG, which provides only basic statistics of the words surrounding the keyword and ignores the grammar of the language. The universal WSG can also be modified by the user.