Text type analysis | Sketch Engine

Text type analysis – statistics of metadata in the corpus

The Text type analysis tool in Sketch Engine shows a breakdown of the corpus by metadata. For example, you can see how many documents, tokens or words the corpus contains in texts downloaded from each website, written by each author or published in each year. The available options depend on the metadata in the corpus.

The tool allows users to switch between sizes measured in structures, tokens or words.
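To illustrate what such a breakdown computes, here is a minimal Python sketch. It is not Sketch Engine code: the toy corpus `DOCS`, the `website` attribute and the function names are hypothetical, and the rule that a word is a token starting with a letter is a simplifying assumption.

```python
from collections import Counter

# Hypothetical toy corpus: each document carries a "website" metadata value
# and a list of tokens (punctuation marks and numbers are tokens too).
DOCS = [
    {"website": "blog.example.com", "tokens": ["Hello", ",", "world", "!"]},
    {"website": "news.example.org", "tokens": ["Prices", "rose", "3", "%", "."]},
    {"website": "blog.example.com", "tokens": ["Another", "post", "."]},
]


def is_word(token):
    # Simplifying assumption: a word is a token that starts with a letter.
    return token[:1].isalpha()


def text_type_sizes(docs, attribute, unit="tokens"):
    """Map each value of a metadata attribute to its size in the chosen unit."""
    sizes = Counter()
    for doc in docs:
        if unit == "structures":
            size = 1  # each document counts once
        elif unit == "tokens":
            size = len(doc["tokens"])
        elif unit == "words":
            size = sum(1 for token in doc["tokens"] if is_word(token))
        else:
            raise ValueError(f"unknown unit: {unit}")
        sizes[doc[attribute]] += size
    return sizes
```

Calling `text_type_sizes(DOCS, "website", unit="words")` sums word counts per website, while `unit="structures"` simply counts documents per website.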

The filter can display only values that start with, end with or contain certain characters.
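The three filter modes can be pictured with a short Python sketch. This is only an illustration, not the tool's implementation: the function name `filter_values`, the mode labels and the sample sizes are hypothetical, and case-sensitive matching is an assumption.

```python
def filter_values(sizes, text, mode="contains"):
    """Keep only the metadata values that match `text` in the given mode."""
    # Assumption: matching is case-sensitive plain string matching.
    matchers = {
        "starts with": str.startswith,
        "ends with": str.endswith,
        "contains": str.__contains__,
    }
    if mode not in matchers:
        raise ValueError(f"unknown mode: {mode}")
    match = matchers[mode]
    return {value: size for value, size in sizes.items() if match(value, text)}


# Hypothetical sizes per website, e.g. measured in words.
SAMPLE_SIZES = {"blog.example.com": 7, "news.example.org": 5, "myblog.net": 3}
```

With the sample data, `filter_values(SAMPLE_SIZES, "blog")` keeps both `blog.example.com` and `myblog.net`, whereas the `"starts with"` mode keeps only `blog.example.com`.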

The tool is available from the corpus dashboard or by using the shortcut g + a (Go to Text type Analysis).

This screenshot shows the statistics for websites. This metadata is attached to a structure, in this case documents. The sizes are shown in words, and only websites whose names contain ‘blog’ are shown.

The number of values in the chart can be controlled in the settings, and the pie chart can be downloaded.

The list of values (here, websites) and their sizes includes local menus that can be used to jump quickly to the concordance for a given website.
