  • CAT tool

    CAT stands for computer-assisted translation. A CAT tool is software that helps translators keep terminology consistent across their translation jobs. It also aids the translation process by suggesting, or automatically inserting, translations of passages (segments) which the translator has already translated in the past. (more…)
  • cluster

    the process of creating groups of words in the thesaurus or word sketch. Words are grouped according to their shared collocational behavior. For more, see the Clustering Neighbours documentation.
  • collocate

    the part of a collocation that is not the node. A collocate is dependent on the node. For example, the collocate strong and the node wind make up the collocation strong wind.
    collocation
    collocate   node
    strong      wind
    icy         wind
    cold        wind
    The most typical collocates for every word in the language can be generated with the word sketch tool.
  • collocation

    A collocation is a sequence or combination of words that occur together more often than would be expected by chance (from Wikipedia: Collocation). A collocation, e.g. fatal error, typically consists of a node (error) and a collocate (fatal). (more…)
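    The phrase "more often than would be expected by chance" can be illustrated with a small sketch that compares the observed count of a word pair with the count expected if the two words were distributed independently. This is only an illustration on a toy token list; the function name and the sample sentence are hypothetical, and it is not the association score that Sketch Engine itself computes.

```python
from collections import Counter


def observed_vs_expected(tokens, node, collocate):
    """Return (observed, expected) counts for `collocate` directly preceding `node`.

    The expected count is a rough independence estimate:
    freq(collocate) * freq(node) / number_of_tokens.
    """
    n = len(tokens)
    word_freq = Counter(tokens)
    # Count every pair of adjacent tokens, in order.
    pair_freq = Counter(zip(tokens, tokens[1:]))
    observed = pair_freq[(collocate, node)]
    expected = word_freq[collocate] * word_freq[node] / n
    return observed, expected


# Toy data, invented for illustration only.
tokens = (
    "a fatal error occurred and the fatal error stopped the run "
    "but the other error was not fatal"
).split()

observed, expected = observed_vs_expected(tokens, node="error", collocate="fatal")
print(f"observed={observed}, expected={expected:.2f}")
```

    When the observed count is clearly higher than the expected one, the pair is a candidate collocation. A real corpus is needed for the comparison to be meaningful.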
  • comparable corpus [ corpus-types ]

    A comparable corpus is a corpus consisting of texts from the same domain in two or more languages. In contrast to a parallel corpus, the texts are not translations of each other, but they belong to the same domain and have the same metadata. An example of a comparable corpus is a corpus made from Wikipedia.
  • compile

    Corpus compilation refers to processing the corpus data (text) with the tools available for the language and converting the text into a corpus. Only a compiled corpus can be searched. See corpus compilation.
  • concordance [ feature ]

    a list of all examples of the search word or phrase found in a corpus, usually in the format of a KWIC concordance, with the search word highlighted in the centre of the screen and some context to the left and to the right. See also KWIC.
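    The layout described above, with the search word in the centre and some context on each side, can be sketched in a few lines. This is a minimal illustration on a whitespace-tokenized string, not how a real concordancer is implemented; the function names and the `width` parameter are hypothetical.

```python
def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for every hit of `keyword`."""
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


def format_kwic(hits):
    """Right-align the left context so that the keywords line up in one column."""
    pad = max((len(left) for left, _, _ in hits), default=0)
    return [f"{left:>{pad}}  {kw}  {right}" for left, kw, right in hits]


# Toy text, invented for illustration only.
tokens = "the strong wind blew and an icy wind followed the cold wind".split()
hits = kwic(tokens, "wind", width=2)
for line in format_kwic(hits):
    print(line)
```

    Right-aligning the left context is what keeps the search word in a single column, which makes patterns to its left and right easy to scan.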
  • concordancer [ feature ]

    A concordancer is a tool (a piece of software) which searches a text corpus and displays a concordance. (more…)
  • CoNLL format

    CoNLL format is a specific format of the vertical file that represents a syntactic parse tree. (more…)
  • cooccurrence [ text-analysis ]

    cooccurrence, or co-occurrence, is a term which expresses how often two terms from a corpus occur alongside each other in a certain order. It usually indicates words which together create a new meaning. Such combinations are called phrasemes or multi-word expressions, e.g. black sheep or get on. Sketch Engine helps to find such words using the word sketch tool or the collocation search. Read more about further tools for text analysis.
  • corpus

    A corpus is a large collection of authentic texts used for studying language or generating linguistic data. Modern corpora contain texts whose total length is billions or tens of billions of words. A corpus is usually annotated (= words are labelled with information about the part of speech and grammatical category). The terms corpus, text corpus and language corpus are interchangeable. Using a corpus for any type of linguistic or language-oriented work ensures that the outcomes reflect the real use of the language. more on corpora»
  • corpus architect

    an intuitive tool inside Sketch Engine for creating corpora from documents or the Web. It does not require any expert knowledge. See the create your own corpus page.
  • corpus manager

    a program used to manage text corpora, i.e. to build, edit, annotate and search corpora. Sketch Engine is the user interface to the corpus manager Manatee.
  • CQL

    The Corpus Query Language is a code used to set criteria for complex searches which cannot be carried out using the standard user interface controls. The criteria may include words or lemmas but also tags and other attributes, text types or structures. Conditions can be set for optional tokens or token repetition.
  • CSV

    a type of plain text file used for saving tabular data. It is accepted by a large variety of applications and is therefore ideal for exporting Sketch Engine results to be used in other software. CSV can be opened directly in Microsoft Excel, OpenOffice, Google Docs and many others.
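    Besides spreadsheet applications, an exported CSV file can also be read by a script. The sketch below uses Python's standard `csv` module on an in-memory sample. The column names and the numbers are invented for illustration; the columns of a real export depend on the tool and settings used.

```python
import csv
import io

# Hypothetical export with invented values; real column names will differ.
SAMPLE_EXPORT = """collocate,node,frequency
strong,wind,120
icy,wind,45
cold,wind,80
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, converting `frequency` to int."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        row["frequency"] = int(row["frequency"])
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_EXPORT)
most_frequent = max(rows, key=lambda r: r["frequency"])
print(most_frequent["collocate"], most_frequent["frequency"])
```

    To read a file from disk instead, the same `csv.DictReader` can be given a file object opened with `open(path, newline="", encoding="utf-8")`.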