Glossary

Attribute
An attribute can refer to:
- A positional attribute - information added to each token in a corpus, e.g. its lemma or part of speech.
- A structure attribute - information added to a structure in a corpus, often called metadata.
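
The difference between the two kinds of attribute can be sketched as a small data model. This is only an illustration under stated assumptions, not a description of how any corpus tool stores its data: the class names `Token` and `Structure`, the one-letter tag values and the metadata keys are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Token:
    """One corpus position; each field is a positional attribute."""
    word: str
    lemma: str
    tag: str  # hypothetical tag values, not taken from a real tagset


@dataclass
class Structure:
    """A span of tokens (e.g. a document) carrying structure attributes."""
    name: str
    attributes: Dict[str, str]  # metadata describing the whole span
    tokens: List[Token] = field(default_factory=list)


doc = Structure(
    name="doc",
    attributes={"author": "unknown", "year": "2020"},  # invented metadata
    tokens=[
        Token(word="Cats", lemma="cat", tag="N"),
        Token(word="slept", lemma="sleep", tag="V"),
    ],
)

# Positional attributes vary from token to token ...
lemmas = [t.lemma for t in doc.tokens]
# ... while structure attributes are shared by every token in the span.
year = doc.attributes["year"]
```

Here every token carries its own `word`, `lemma` and `tag`, whereas `author` and `year` are stated once for the whole structure.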
gender lemma
The gender lemma is an attribute used in connection with term extraction. Its purpose is to display terminology in the correct word form in languages that observe gender agreement between adjectives and nouns. The standard lemma would produce a grammatically unacceptable word form [...]
lc
The lc attribute (also referred to as word_lc, word lowercase or word form lowercase) is a positional attribute assigned to each token in the corpus. It is a lowercased version of the word attribute: John becomes john, Apple becomes apple, BE becomes be. The [...]
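
As a rough illustration of what lowercasing achieves, the sketch below derives lc values from word forms and counts them. It assumes Python's built-in `str.lower()` is an acceptable stand-in; the lowercasing rules applied when a real corpus is compiled may differ, and the sample word list is invented.

```python
from collections import Counter

# Invented sample of word forms, mixing upper and lower case.
words = ["John", "john", "Apple", "apple", "BE", "be"]

# Derive the lc value of each token (assumption: str.lower() is close enough).
lc_values = [w.lower() for w in words]

# Counting on lc merges forms that differ only in case.
lc_counts = Counter(lc_values)
```

Counting on the word attribute would yield six distinct items here, while counting on lc yields three.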
lemma
Lemma is a positional attribute. It is the basic form of a word, typically the form found in dictionaries. A lemmatized corpus allows searching for the basic form and including all forms of the word in the result, e.g. searching for lemma go will find go, [...]
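
A lemma search can be sketched as a filter over (word form, lemma) pairs. The sample tokens and the helper name `find_by_lemma` are hypothetical and only illustrate the idea; a real corpus is lemmatized by a dedicated tool and queried through its own interface.

```python
from typing import List, Tuple

# Invented sample: each token is a (word form, lemma) pair.
corpus: List[Tuple[str, str]] = [
    ("She", "she"),
    ("went", "go"),
    ("home", "home"),
    ("and", "and"),
    ("he", "he"),
    ("goes", "go"),
    ("tomorrow", "tomorrow"),
]


def find_by_lemma(tokens: List[Tuple[str, str]], lemma: str) -> List[str]:
    """Return every word form whose lemma equals the requested lemma."""
    return [word for word, token_lemma in tokens if token_lemma == lemma]


hits = find_by_lemma(corpus, "go")
```

The query names only the lemma, yet the result contains differing word forms, which is the point of lemmatization.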
lemma_lc
lemma_lc is a positional attribute. It is the lemma converted to lowercase, so apple and Apple are treated as the same item. It is used for case-insensitive searching and analysis. See also lemma.
lempos
Lempos is a positional attribute, i.e. an attribute assigned to each token in the corpus. It is a combination of lemma and part of speech (pos) consisting of the lemma, a hyphen and a one-letter abbreviation of the part of speech, e.g. go-v, house-n. The [...]
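
The lemma-hyphen-letter pattern can be sketched with two small helpers. The function names `make_lempos` and `split_lempos` are hypothetical, and the set of valid one-letter abbreviations is not checked because it depends on the corpus.

```python
from typing import Tuple


def make_lempos(lemma: str, pos: str) -> str:
    """Join a lemma and a one-letter part-of-speech abbreviation."""
    if len(pos) != 1:
        raise ValueError("pos must be a one-letter abbreviation")
    return f"{lemma}-{pos}"


def split_lempos(lempos: str) -> Tuple[str, str]:
    """Split a lempos value at its LAST hyphen into (lemma, pos)."""
    lemma, sep, pos = lempos.rpartition("-")
    if not sep or not lemma or len(pos) != 1:
        raise ValueError(f"not a lempos value: {lempos!r}")
    return lemma, pos


go_verb = make_lempos("go", "v")
house_parts = split_lempos("house-n")
```

Splitting at the last hyphen rather than the first keeps lemmas that themselves contain a hyphen intact.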
lempos_lc
lempos_lc is a positional attribute. It is a lowercased version of lempos: all uppercase letters are converted to lowercase, so House-n becomes identical to house-n. It is used for case-insensitive searching and analysis. See also lempos list [...]
longtag
Longtag is a detailed part-of-speech tag that usually contains more information than tag. Some corpora have tags containing only basic part-of-speech information, together with a separate longtag attribute that holds detailed grammatical information such as case, number and gender. The longtags [...]
POS tag
A POS tag (also part-of-speech tag) is the same as a tag. Do not confuse it with POS, the simplified tag that shows only the part of speech without the additional morphological and grammatical information. See also: positional attributes, lempos, lemma.
positional attribute
A positional attribute is information added to each token in a corpus, typically its lemma or tag. Attributes differ between languages and, occasionally, even between corpora in the same language. Examples of attributes include word, lemma and tag.
stem
A stem is the part of a word that remains after its affixes (suffixes, prefixes, etc.) are removed. Stems do not have to be valid word forms: for the word form having, the stem is hav, whereas the lemma is have. Stems are used instead of lemmas or in addition to lemmas with languages whose [...]
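
The contrast between a stem and a lemma can be sketched as follows. Both helpers are toy stand-ins introduced only for illustration: `naive_stem` strips a few hard-coded suffixes and is not a real stemming algorithm, and `LEMMAS` is a one-entry lookup table rather than a lemmatizer.

```python
# Toy suffix list; an assumption made for this sketch only.
SUFFIXES = ("ing", "ed", "s")

# Toy lemma dictionary with a single hand-written entry.
LEMMAS = {"having": "have"}


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, leaving at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def lemma_of(word: str) -> str:
    """Look the word up in the toy dictionary; fall back to the word itself."""
    return LEMMAS.get(word, word)


stem = naive_stem("having")
lemma = lemma_of("having")
```

Stripping the suffix leaves hav, which is not a valid word form, while the dictionary lookup returns the valid form have.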
tag
A tag (also called part-of-speech tag, POS tag or morphological tag) is a label assigned to each token in an annotated corpus to indicate the part of speech and often also grammatical categories and morphological information. The tool used to annotate a corpus is called a tagger. A collection of [...]
word form
Note: This entry is for the positional attribute (word form, lemma, lowercase, tag…). For the type of token, the opposite of nonword, see word.
The word form (often shortened to word in the interface) is a positional attribute. It refers to one of the word forms [...]
