n-gram

is a sequence of a number of items (bigram = 2 items, trigram = 3 items, … n-gram = n items). An item can be anything: a letter, digit, syllable, token, word or other unit. In the context of corpora and corpus linguistics, n-grams typically refer to sequences of tokens (or words). In linguistics, n-grams are sometimes referred to as MWEs, i.e. multiword expressions. Generating a list of the most frequent n-grams helps to uncover linguistic phenomena that might go unnoticed when using other tools. N-grams can identify discourse markers or chunks of language which should be taught/learnt as fixed phrases in language teaching.

The tool to generate n-grams is the N-gram tool in Sketch Engine.
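
To make the idea of a frequency list of n-grams concrete, here is a minimal Python sketch. It is only an illustration of the concept and is not how the N-gram tool in Sketch Engine is implemented. The function names `ngrams` and `most_frequent_ngrams` and the sample sentence are hypothetical, and tokens are assumed to be produced by simple whitespace splitting rather than by a real tokenizer.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous sequence of n items from a list of tokens."""
    # Slide a window of width n over the tokens; tuples are hashable,
    # so they can be counted later.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def most_frequent_ngrams(tokens, n, top=10):
    """Return the `top` most frequent n-grams with their counts."""
    return Counter(ngrams(tokens, n)).most_common(top)


# Hypothetical sample text; whitespace splitting stands in for tokenization.
text = "on the other hand it is on the other side"
tokens = text.split()

print(ngrams(tokens, 2)[:3])
# [('on', 'the'), ('the', 'other'), ('other', 'hand')]

print(most_frequent_ngrams(tokens, 3, top=1))
# [(('on', 'the', 'other'), 2)]
```

Here the trigram "on the other" surfaces as the most frequent sequence, which is the kind of recurring chunk a frequency list of n-grams is meant to reveal.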