Text analysis (also content analysis or text analytics) is a method for analyzing text, usually unstructured, in order to extract information. The result of text analysis is structured data. Sketch Engine offers the traditional tools as well as some unique features. The traditional tools consist of various frequency-based statistics:
- word or lemma frequency, part-of-speech frequency via the wordlist tool
- bigram, trigram, n-gram frequencies via the n-gram tool
- absolute frequencies, relative frequencies, document frequencies, average reduced frequency (ARF)
- phrase and multiword frequency via the concordance
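To make the frequency statistics listed above concrete, here is a minimal Python sketch that computes them for a toy corpus. It is an illustration only, not Sketch Engine's implementation: the tokenizer is a naive whitespace splitter, all function names are hypothetical, relative frequency is assumed to mean occurrences per million tokens, and the average reduced frequency follows the commonly published definition (distances between consecutive occurrences, capped at the mean spacing).

```python
from collections import Counter

PUNCT = ".,;:!?\"'()[]"

def tokenize(text):
    """Naive tokenizer (an assumption): lowercase, split on whitespace, trim punctuation."""
    tokens = []
    for raw in text.lower().split():
        token = raw.strip(PUNCT)
        if token:
            tokens.append(token)
    return tokens

def ngrams(tokens, n):
    """Return all contiguous n-grams as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def frequency_table(documents, n=1):
    """Absolute, relative (per million tokens) and document frequency per n-gram.

    N-grams do not cross document boundaries, but they do cross sentence
    boundaries, because this sketch does no sentence splitting.
    """
    absolute = Counter()
    doc_freq = Counter()
    total_tokens = 0
    for doc in documents:
        tokens = tokenize(doc)
        total_tokens += len(tokens)
        grams = ngrams(tokens, n)
        absolute.update(grams)
        doc_freq.update(set(grams))  # count each n-gram once per document
    return {
        gram: {
            "absolute": count,
            "per_million": count / total_tokens * 1_000_000,
            "docf": doc_freq[gram],
        }
        for gram, count in absolute.items()
    }

def average_reduced_frequency(tokens, word):
    """Average reduced frequency: close to the raw count for evenly spread
    words, close to 1 for words that occur in a single cluster."""
    positions = [i for i, t in enumerate(tokens) if t == word]
    f = len(positions)
    if f == 0:
        return 0.0
    n = len(tokens)
    v = n / f  # mean spacing between occurrences
    # Treat the corpus as cyclic: the first distance wraps around the end.
    distances = [positions[0] + n - positions[-1]]
    distances += [b - a for a, b in zip(positions, positions[1:])]
    return sum(min(d, v) for d in distances) / v

if __name__ == "__main__":
    docs = ["The cat sat.", "The cat ran. The dog ran."]
    for gram, stats in sorted(frequency_table(docs, n=2).items()):
        print(" ".join(gram), stats)
```

Calling `frequency_table(docs, n=1)` gives word frequencies, and larger values of `n` give bigram or trigram frequencies. Replacing the tokenizer with a lemmatizer or part-of-speech tagger would give lemma or part-of-speech counts in the same way.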
Advanced techniques include:
- high-quality keyword and term extraction enhanced with reference data and suitable for topic modelling
- co-occurrence analysis using linguistic criteria via the word sketch
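The two advanced techniques above both rest on simple scores. The Python sketch below shows one plausible form of each, assuming keyword extraction compares normalized frequencies in a focus corpus and a reference corpus (the "simple maths" ratio with a smoothing constant), and that co-occurrence strength is measured with logDice. The function names and example counts are hypothetical, and the sketch leaves out the linguistic filtering that the word sketch applies through grammatical relations.

```python
import math

def per_million(count, corpus_size):
    """Normalize a raw count to occurrences per million tokens."""
    return count / corpus_size * 1_000_000

def keyness(focus_count, focus_size, ref_count, ref_size, smoothing=1.0):
    """Ratio of smoothed normalized frequencies, focus over reference.

    The smoothing constant avoids division by zero and controls whether
    rarer or more common words rank higher; 1.0 is an assumed default.
    """
    focus = per_million(focus_count, focus_size) + smoothing
    reference = per_million(ref_count, ref_size) + smoothing
    return focus / reference

def log_dice(f_xy, f_x, f_y):
    """logDice association score: 14 + log2(2 * f_xy / (f_x + f_y)).

    f_xy is the co-occurrence count, f_x and f_y the counts of each word.
    The theoretical maximum is 14, reached when the words always co-occur.
    """
    if f_xy == 0:
        return float("-inf")  # the words never co-occur
    return 14 + math.log2(2 * f_xy / (f_x + f_y))

if __name__ == "__main__":
    # Hypothetical counts: a word seen 50 times in a 1M-token focus corpus
    # and 5 times in a 1M-token reference corpus.
    print(round(keyness(50, 1_000_000, 5, 1_000_000), 2))
    # Hypothetical collocation counts.
    print(round(log_dice(f_xy=120, f_x=1500, f_y=900), 2))
```

A score well above 1 from `keyness` marks a word as typical of the focus corpus. logDice does not depend on corpus size, which makes its values comparable across corpora.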
The tools and statistics can be combined depending on the task involved.
See also other text analysis tools.