• n-gram

    is a sequence of a given number of structures (bigram = 2 structures, trigram = 3 structures, ..., n-gram = n structures), typically letters or words, but also phonemes or syllables. Generating a frequency list of such sequences helps reveal which structures tend to combine in a language. n-grams are generated using the word list feature.
  • node

    (collocation) the central word in a collocation, e.g. strong wind consists of the collocate strong and the node wind.

    (concordance) the search word or phrase, sometimes called a query, which appears in the centre of a KWIC concordance or is highlighted in other types of concordances.
  • non-word

    Non-words (also spelt nonwords) are tokens which do not start with a letter of the alphabet. Examples of non-words are numbers and punctuation, but also tokens such as 25-hour, 16-year-old, !mportant, 3D. Tokens such as post-1945, mp3 or CO2 are normal words because they start with a letter. (In rare cases the corpus author may have used a different definition in their corpus. The definition is part of the corpus configuration file.)
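
As a rough illustration of the n-gram and non-word entries above, the following Python sketch counts word bigrams in a short token list and applies the default "does not start with a letter" test. The function names ngrams and is_nonword and the sample sentence are hypothetical and chosen only for this example. It is not how the word list feature is implemented, and it ignores any corpus-specific definition set in the configuration file.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def is_nonword(token):
    """Assumed default rule: a token is a non-word if it does not start with a letter."""
    return not token[:1].isalpha()


# Frequency list of word bigrams (n = 2) for a made-up sentence.
tokens = "the strong wind and the strong rain".split()
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(1))  # [(('the', 'strong'), 2)]

# The example tokens from the non-word entry.
examples = ["25-hour", "16-year-old", "!mportant", "3D", "post-1945", "mp3", "CO2"]
nonwords = [t for t in examples if is_nonword(t)]
print(nonwords)  # ['25-hour', '16-year-old', '!mportant', '3D']
```

Passing n = 3 to ngrams gives trigrams, and passing a string instead of a token list gives letter n-grams, since the slicing works on any sequence.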