Longest-commonest match

The term longest-commonest match (LCM) was coined by Adam Kilgarriff to name the most common realisation of a collocation, i.e. the chunk of language in which the collocation appears most frequently. The LCM is shown on the word sketch result screen to help users understand how the collocation typically behaves.

The longest-commonest match can reveal important phraseological or idiomatic behaviour of the collocation. For example, "cat" frequently co-occurs with "predator". The LCM, i.e. the most frequent realisation of this collocation, is "cats are predators", which indicates that general statements prefer the plural over the singular "A cat is a predator".

Similarly, the LCM for "anybody" and "disagree" is "anybody who disagrees with", suggesting that this construction is preferred over the -ing form, as in "anybody disagreeing with".
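
The idea behind the examples above can be illustrated with a minimal sketch. It is not the word sketch implementation: the function name longest_commonest_match, the input format (tokenised hits with the positions of the two collocates) and the min_share threshold of 0.25 are all assumptions made for illustration. The sketch counts every contiguous chunk that contains both collocates and returns the longest chunk that still covers a sufficient share of the hits.

```python
from collections import Counter


def longest_commonest_match(hits, min_share=0.25):
    """Return the longest chunk that still covers a large share of the hits.

    hits: list of (tokens, i, j), where tokens is a list of word forms and
    i, j are the positions of the two collocates in that hit.
    min_share: hypothetical threshold, the minimum fraction of hits a chunk
    must occur in to be considered "common".
    """
    counts = Counter()
    for tokens, i, j in hits:
        lo, hi = min(i, j), max(i, j)
        spans = set()  # count each distinct chunk once per hit
        # Every contiguous span that covers both collocates is a candidate.
        for start in range(0, lo + 1):
            for end in range(hi + 1, len(tokens) + 1):
                spans.add(tuple(t.lower() for t in tokens[start:end]))
        counts.update(spans)

    threshold = min_share * len(hits)
    candidates = [(len(span), n, span) for span, n in counts.items() if n >= threshold]
    if not candidates:
        return None
    # Prefer the longest chunk; break ties by frequency.
    _, _, best = max(candidates)
    return " ".join(best)


# Toy concordance for the collocation cat + predator (invented data).
hits = [
    (["Cats", "are", "predators"], 0, 2),
    (["cats", "are", "predators"], 0, 2),
    (["cats", "are", "predators"], 0, 2),
    (["a", "cat", "is", "a", "predator"], 1, 4),
    (["the", "cats", "are", "predators", "too"], 1, 3),
]
lcm = longest_commonest_match(hits)
print(lcm)
```

On this invented data the chunk "cats are predators" occurs in four of the five hits, while every longer chunk occurs only once, so it is the one returned. A lower min_share favours longer but rarer chunks, a higher one favours shorter, more frequent chunks.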
