Longest-commonest match

The term longest-commonest match (LCM) was coined by Adam Kilgarriff to name the most common realisation of a collocation, i.e. the chunk of language in which the collocation appears most frequently. The longest-commonest match is displayed on the word sketch result screen to make it easier to understand how the collocation typically behaves.

The longest-commonest match can reveal important phraseological or idiomatic behaviour of the collocation. For example, "cat" frequently co-occurs with "predator". The LCM, i.e. the most frequent realisation of this collocation, is "cats are predators", which indicates that general statements prefer the plural over the singular "A cat is a predator".

Similarly, the LCM for "anybody" and "disagree" is "anybody who disagrees with", suggesting that this construction is preferred over the -ing form, as in "anybody disagreeing with".
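
The idea behind the examples above can be illustrated with a small Python sketch. It is a toy approximation and not necessarily the procedure used on the word sketch screen. It assumes each corpus hit is given as a list of word forms plus the positions of the two collocates. The function name longest_commonest_match and the min_share threshold are hypothetical and chosen only for illustration. The sketch first finds the most frequent string running from one collocate to the other, then extends it for as long as the longer string still covers at least min_share of those hits.

```python
from collections import Counter


def longest_commonest_match(hits, min_share=0.5):
    """Toy LCM: hits is a list of (tokens, i, j), where i and j are the
    positions of the two collocates. min_share is an assumed threshold."""
    prepared = []
    cores = Counter()
    for tokens, i, j in hits:
        lo, hi = sorted((i, j))
        toks = tuple(t.lower() for t in tokens)
        prepared.append((toks, lo, hi))
        # The "core" is the string from one collocate to the other.
        cores[toks[lo:hi + 1]] += 1
    if not cores:
        return None

    # Commonest realisation of the collocation itself.
    core, core_freq = cores.most_common(1)[0]

    # Count every longer window around that core, once per hit.
    windows = Counter()
    for toks, lo, hi in prepared:
        if toks[lo:hi + 1] != core:
            continue
        seen = {
            toks[start:end + 1]
            for start in range(lo + 1)
            for end in range(hi, len(toks))
        }
        windows.update(seen)

    # Keep windows that are still frequent enough, then prefer the longest.
    frequent = [w for w, n in windows.items() if n >= min_share * core_freq]
    best = max(frequent, key=lambda w: (len(w), windows[w]))
    return " ".join(best), windows[best]


# Invented example hits for "anybody" + "disagree" (not real corpus data).
example_hits = [
    (["anybody", "who", "disagrees", "with", "me"], 0, 2),
    (["anybody", "who", "disagrees", "with", "this"], 0, 2),
    (["anybody", "who", "disagrees", "with", "that"], 0, 2),
    (["anybody", "disagreeing", "with", "us"], 0, 1),
]
print(longest_commonest_match(example_hits))
# prints ('anybody who disagrees with', 3)
```

With these invented hits, the string is extended to include "with" because all three hits share it, but not further, since the following word differs in every hit.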

related paper