Corpus annotation and metadata
Each structure can, but does not have to, carry additional labels that give more specific information about it. These labels are called metadata or structure attributes. For example, a structure might record the year of publication, the genre, the dialect, the style, the author, the source, or anything else the corpus author wants to include. If the corpus author is unsure what to include, it is best to include all available metadata. A corpus without metadata can still be used for many tasks, but information about the texts cannot then be used in searches or analysis.
Metadata (the values of structure attributes) are automatically processed by Sketch Engine into text types, making it easy to set search criteria or build subcorpora based on text types.
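The idea of grouping texts by their metadata values can be illustrated with a short Python sketch. This is only a conceptual illustration, not Sketch Engine's implementation or API: the function name `text_types`, the document ids and the attribute values are all invented for the example.

```python
from collections import defaultdict

# Hypothetical documents with invented metadata; doc3 has no genre.
documents = [
    {"id": "doc1", "year": "2019", "genre": "news"},
    {"id": "doc2", "year": "2021", "genre": "fiction"},
    {"id": "doc3", "year": "2019"},
]

def text_types(docs, attribute):
    """Group document ids by the value of one metadata attribute.

    Documents that lack the attribute are left out, mirroring the point
    that missing metadata cannot be used as a search criterion.
    """
    groups = defaultdict(list)
    for doc in docs:
        if attribute in doc:
            groups[doc[attribute]].append(doc["id"])
    return dict(groups)

by_year = text_types(documents, "year")
by_genre = text_types(documents, "genre")

# A "subcorpus" here is simply the documents sharing one value.
subcorpus_2019 = by_year.get("2019", [])
```

In this sketch, `doc3` can be selected by year but not by genre, which is why including all available metadata is usually the safer choice.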
An example of a corpus consisting of two documents: each document has two paragraphs and each paragraph has two sentences. The year of publication is recorded for each whole document and the style for each paragraph.
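A corpus with this shape can be sketched in Python as follows. The sketch assumes a one-token-per-line layout with XML-like structure tags; the structure names `doc`, `p` and `s`, the attribute names `year` and `style`, and all years, styles and sentences are placeholders chosen for illustration, not values from a real corpus.

```python
from xml.sax.saxutils import quoteattr

def open_tag(name, **attrs):
    """Return an opening structure tag, e.g. <doc year="2019">."""
    rendered = "".join(
        f" {key}={quoteattr(str(value))}" for key, value in attrs.items()
    )
    return f"<{name}{rendered}>"

def build_corpus(documents):
    """Render documents as structure tags with one token per line."""
    lines = []
    for doc in documents:
        # Document-level metadata: year of publication.
        lines.append(open_tag("doc", year=doc["year"]))
        for para in doc["paragraphs"]:
            # Paragraph-level metadata: style.
            lines.append(open_tag("p", style=para["style"]))
            for sentence in para["sentences"]:
                lines.append("<s>")
                lines.extend(sentence.split())  # naive whitespace tokenisation
                lines.append("</s>")
            lines.append("</p>")
        lines.append("</doc>")
    return "\n".join(lines)

# Invented sample data: 2 documents x 2 paragraphs x 2 sentences.
documents = [
    {"year": "2019", "paragraphs": [
        {"style": "formal", "sentences": ["First sentence .", "Second sentence ."]},
        {"style": "informal", "sentences": ["Third sentence .", "Fourth sentence ."]},
    ]},
    {"year": "2021", "paragraphs": [
        {"style": "formal", "sentences": ["Fifth sentence .", "Sixth sentence ."]},
        {"style": "informal", "sentences": ["Seventh sentence .", "Eighth sentence ."]},
    ]},
]

corpus_text = build_corpus(documents)
print(corpus_text)
```

The year is attached once to each `doc` tag because it describes the whole document, while the style is attached to each `p` tag because it can differ between paragraphs. The `s` tags carry no attributes, showing that a structure does not have to have metadata.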