Annotating a vertical file

Token annotation

For most languages, Sketch Engine tags and lemmatizes tokens in user corpora automatically. If the tools are not available for a language, the user can supply tags or lemmas manually. Automatically generated tags and lemmas can also be corrected manually.

Procedure in a nutshell

(If your corpus is in Sketch Engine, first download it in vertical format.)

Open the corpus in a plain text editor or annotation software.
Add metadata to tokens and save the file.
Upload the file to Sketch Engine, where the metadata will be processed into token attributes automatically.
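The middle step can also be scripted instead of done by hand. The following is a minimal Python sketch, not a Sketch Engine tool: the function name annotate_vertical and the lexicon dictionary are hypothetical, and it assumes a vertical file with one token per line and tab-separated attributes. Unknown words get their lowercased form as the lemma and the placeholder tag "unknown".

```python
def annotate_vertical(lines, lexicon):
    """Return vertical lines with a lemma and a tag column added to each token.

    lines:   lines of a vertical file, one token per line
    lexicon: hypothetical dict mapping a word to a (lemma, tag) pair
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        # Blank lines and structure lines such as <s> or </s> pass through unchanged.
        # Assumption: a literal "<" token does not start a line in this file.
        if not line or line.startswith("<"):
            out.append(line)
            continue
        word = line.split("\t")[0]  # keep only the word, drop any old columns
        lemma, tag = lexicon.get(word, (word.lower(), "unknown"))
        out.append("\t".join([word, lemma, tag]))
    return out


lexicon = {
    "Golden": ("golden", "adjective"),
    "gate": ("gate", "noun"),
    "bridge": ("bridge", "noun"),
}
annotated = annotate_vertical(["Golden", "gate", "bridge"], lexicon)
print("\n".join(annotated))
```

The result can be written to a file and uploaded as in the last step of the procedure.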

Example

Tokens annotated with these attributes, one attribute per tab-separated column:

word
lowercase lemma
part of speech

Golden	golden	adjective
gate	gate	noun
bridge	bridge	noun
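When annotation is added manually, it is easy to leave a column out on some line. As a sanity check before uploading, a small script along these lines could report token lines whose number of tab-separated columns differs from the expected one. The function name check_columns and the default of three columns are assumptions taken from the example above.

```python
def check_columns(lines, expected=3):
    """Return (line_number, line) pairs for token lines with an unexpected column count."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and structure lines such as <s> or </s>.
        if not line or line.startswith("<"):
            continue
        if len(line.split("\t")) != expected:
            problems.append((number, line))
    return problems


sample = [
    "Golden\tgolden\tadjective",
    "gate\tgate",  # the part-of-speech column is missing here
    "bridge\tbridge\tnoun",
]
problems = check_columns(sample)
for number, line in problems:
    print(f"line {number}: {line!r}")
```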

Structure annotation and token annotation

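If the corpus contains metadata for structures, the opening tag of each structure is listed on a separate line together with the related metadata, and the closing tag follows on a separate line after the last token of the structure, like this: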

<named_entity type="geography" subtype="bridge">
Golden	golden	adjective
gate	gate	noun
bridge	bridge	noun
</named_entity>
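To illustrate how structure lines and token lines relate, the sketch below reads such a file and records, for every token, the structures that enclose it. It is an illustrative Python reader, not how Sketch Engine itself processes the file: the names parse_vertical and COLUMNS are hypothetical, and it assumes three tab-separated columns and double-quoted structure attributes as in the example.

```python
import re

COLUMNS = ("word", "lemma", "pos")  # assumed column order, as in the example
OPEN_RE = re.compile(r'<(\w+)((?:\s+\w+="[^"]*")*)\s*>$')
CLOSE_RE = re.compile(r'</(\w+)>$')
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_vertical(lines):
    """Return one dict per token, including the structures open at that point."""
    stack = []   # currently open structures as (name, attributes) pairs
    tokens = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        opening = OPEN_RE.match(line)
        closing = CLOSE_RE.match(line)
        if closing:
            # Close the innermost structure if its name matches.
            if stack and stack[-1][0] == closing.group(1):
                stack.pop()
        elif opening:
            attrs = dict(ATTR_RE.findall(opening.group(2)))
            stack.append((opening.group(1), attrs))
        elif line.startswith("<"):
            continue  # any other tag-like line is ignored in this sketch
        else:
            token = dict(zip(COLUMNS, line.split("\t")))
            token["structures"] = list(stack)
            tokens.append(token)
    return tokens


sample = [
    '<named_entity type="geography" subtype="bridge">',
    "Golden\tgolden\tadjective",
    "gate\tgate\tnoun",
    "bridge\tbridge\tnoun",
    "</named_entity>",
]
tokens = parse_vertical(sample)
for token in tokens:
    print(token["word"], token["pos"], token["structures"])
```

Each of the three tokens ends up linked to the named_entity structure and its type and subtype attributes, which is the relationship the vertical layout expresses.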
