The lemma is the form of a word found in dictionaries, sometimes called the base form. Using lemmas makes it possible to treat the different word forms of a word as a single item. This is especially useful with morphologically rich languages, i.e. languages where one lemma can have many different word forms (Spanish, French, Polish, Japanese, Turkish, Russian, etc.).
With lemmas, a search for go automatically finds go, goes, going, gone and went. A wordlist generated on the lemma attribute counts the frequencies of go, goes, going, gone and went together and displays them as one item: go. To find their individual frequencies, use the word or lc attribute instead.
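The difference between counting on the lemma attribute and on the word attribute can be illustrated with a minimal sketch. It is not how any particular corpus tool is implemented: the token list, the `wordlist` function and the `find_by_lemma` function are hypothetical names made up for this example, and the lemmas are hard-coded rather than produced by a real lemmatizer.

```python
from collections import Counter

# Hypothetical toy corpus: each token is a (word, lemma) pair.
# In a real corpus the lemma would be assigned by a lemmatizer.
TOKENS = [
    ("go", "go"),
    ("goes", "go"),
    ("going", "go"),
    ("gone", "go"),
    ("went", "go"),
    ("went", "go"),
    ("house", "house"),
    ("houses", "house"),
]


def wordlist(tokens, attribute):
    """Count token frequencies on the chosen attribute ('word' or 'lemma')."""
    index = {"word": 0, "lemma": 1}[attribute]
    return Counter(token[index] for token in tokens)


def find_by_lemma(tokens, lemma):
    """Return every word form in the corpus whose lemma matches."""
    return [word for word, token_lemma in tokens if token_lemma == lemma]


if __name__ == "__main__":
    # All forms of "go" are counted together as one item.
    print(wordlist(TOKENS, "lemma"))
    # Each word form keeps its own frequency.
    print(wordlist(TOKENS, "word"))
    # Searching by lemma retrieves all word forms.
    print(find_by_lemma(TOKENS, "go"))
```

Counting on the lemma attribute merges the word forms into one entry, while counting on the word attribute keeps them apart.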
The lemma preserves the capitalization of the word form, except that the first word of a sentence is typically lowercased.
In most languages (German, which capitalizes all nouns, is one of the exceptions), when a capitalized word is found in the middle of a sentence, the lemmatizer treats it as unusual usage, possibly a brand name or proper noun, and assigns a lemma identical to the word form. Compare islands ⇢ island but Islands ⇢ Islands.
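The capitalization behaviour described above can be sketched as a simple rule. This is only an illustration under stated assumptions: `TOY_LEMMAS` and `assign_lemma` are hypothetical names, the lookup table stands in for a real lemmatizer, and actual lemmatizers rely on far more context than the position of the word in the sentence.

```python
# Hypothetical lookup table standing in for a real lemmatizer.
TOY_LEMMAS = {
    "islands": "island",
    "the": "the",
    "are": "be",
}


def assign_lemma(word, sentence_initial):
    """Assign a lemma using a simplified capitalization rule.

    Assumption: a sentence-initial word is lowercased before lookup, and a
    capitalized word elsewhere in the sentence keeps its word form as lemma.
    """
    if sentence_initial:
        lowered = word.lower()
        return TOY_LEMMAS.get(lowered, lowered)
    if word[:1].isupper():
        # Mid-sentence capital: treated as a possible proper noun or brand.
        return word
    return TOY_LEMMAS.get(word, word)


if __name__ == "__main__":
    print(assign_lemma("islands", sentence_initial=False))
    print(assign_lemma("Islands", sentence_initial=False))
    print(assign_lemma("Islands", sentence_initial=True))
```

With this rule, a lowercase islands in the middle of a sentence receives the lemma island, a capitalized Islands in the same position keeps Islands, and a sentence-initial Islands is lowercased before the lookup.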
see also lemma