• metadata

    information about the texts in the corpus, for example year of publication, author name, publishing house, medium (written, spoken) or register (formal, informal). Metadata are automatically converted to text types in Sketch Engine. See Annotate a corpus.
  • MI Score [ statistics ]

    The Mutual Information score expresses the extent to which words co-occur compared to the number of times they appear separately. The MI score is strongly affected by frequency: low-frequency words tend to reach a high MI score, which may be misleading. This is why Sketch Engine allows setting a frequency limit so that low-frequency words can be excluded from the calculation. In most cases, T-score is more useful than MI score. See Concordance - Collocations. See Statistics in Sketch Engine. Compare T-score.
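    The frequency effect described above can be illustrated with a minimal sketch. It assumes the commonly cited formulas MI = log2(f_AB · N / (f_A · f_B)) and T-score = (f_AB − f_A · f_B / N) / √f_AB, where f_AB is the number of co-occurrences, f_A and f_B are the frequencies of the two words and N is the corpus size. The function names, the `min_freq` parameter and all counts are hypothetical and chosen only for illustration; this is not Sketch Engine's own code.

```python
import math

def mi_score(f_ab: int, f_a: int, f_b: int, n: int) -> float:
    """Mutual Information score (assumed formula): log2(f_AB * N / (f_A * f_B))."""
    return math.log2(f_ab * n / (f_a * f_b))

def t_score(f_ab: int, f_a: int, f_b: int, n: int) -> float:
    """T-score (assumed formula): (f_AB - f_A * f_B / N) / sqrt(f_AB)."""
    return (f_ab - f_a * f_b / n) / math.sqrt(f_ab)

def passes_frequency_limit(f_ab: int, min_freq: int = 5) -> bool:
    """Hypothetical frequency limit: drop pairs that co-occur fewer than min_freq times."""
    return f_ab >= min_freq

N = 1_000_000  # invented corpus size

# A rare pair: both words occur 3 times, always together.
rare = dict(f_ab=3, f_a=3, f_b=3, n=N)
# A frequent pair: many co-occurrences, but both words are also common.
common = dict(f_ab=500, f_a=10_000, f_b=2_000, n=N)

mi_rare, mi_common = mi_score(**rare), mi_score(**common)
t_rare, t_common = t_score(**rare), t_score(**common)

print(f"rare pair:   MI={mi_rare:.2f}  T={t_rare:.2f}")      # MI about 18.35, T about 1.73
print(f"common pair: MI={mi_common:.2f}  T={t_common:.2f}")  # MI about 4.64, T about 21.47
print(passes_frequency_limit(rare["f_ab"]))    # False: excluded by the limit
print(passes_frequency_limit(common["f_ab"]))  # True
```

    With these invented counts the rare pair gets a much higher MI score than the frequent pair, while the T-score ranks them the other way round, which is the behaviour the entry warns about.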
  • minimum sensitivity [ statistics ]

    a statistical measure similar to logDice, defined as the minimum of the following two numbers:

    • the number of co-occurrences divided by the frequency of the collocate
    • the number of co-occurrences divided by the frequency of the node word

    Minimum sensitivity rises as the number of co-occurrences grows and falls as the individual words (node word or collocate) become more frequent.
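    The definition above can be written out directly. This is a minimal sketch; the function name and the counts are hypothetical and serve only to show how the value moves.

```python
def minimum_sensitivity(f_ab: int, f_node: int, f_collocate: int) -> float:
    """Minimum of (co-occurrences / collocate frequency) and (co-occurrences / node frequency)."""
    return min(f_ab / f_collocate, f_ab / f_node)

# Invented counts: 50 co-occurrences, node word seen 200 times, collocate 100 times.
base = minimum_sensitivity(f_ab=50, f_node=200, f_collocate=100)

# More co-occurrences with the same word frequencies: the value goes up.
more_cooccurrences = minimum_sensitivity(f_ab=80, f_node=200, f_collocate=100)

# A much more frequent node word with the same co-occurrences: the value goes down.
more_frequent_node = minimum_sensitivity(f_ab=50, f_node=1000, f_collocate=100)

print(base, more_cooccurrences, more_frequent_node)  # 0.25 0.4 0.05
```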

  • multilevel list

    a list sorted at more than one level, e.g. a frequency list sorted by word form, then by lemma and then by tag. See this multilevel list in the BAWE corpus.
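
    As an illustration of sorting at more than one level, the sketch below orders a small frequency list by word form, then lemma, then tag. The rows, tags and counts are invented for the example and do not come from the BAWE corpus.

```python
# Each row: (word form, lemma, tag, frequency). All values are invented.
rows = [
    ("runs", "run", "VBZ", 12),
    ("runs", "run", "NNS", 4),
    ("ran", "run", "VBD", 9),
    ("running", "run", "VBG", 7),
    ("running", "running", "NN", 2),
]

# A tuple key sorts by the first element, breaks ties with the second,
# and breaks remaining ties with the third.
multilevel = sorted(rows, key=lambda row: (row[0], row[1], row[2]))

for word, lemma, tag, freq in multilevel:
    print(f"{word:<8} {lemma:<8} {tag:<4} {freq}")
```

    Rows that share a word form stay together and are ordered by lemma; rows that share both are ordered by tag.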