cooccurrence [ text-analysis ] cooccurrence (or co-occurrence) is a term expressing how often two terms from a corpus occur alongside each other in a certain order. It usually indicates words which together create a new meaning. Such combinations are called phrasemes or multi-word expressions, e.g. black sheep or get on. Sketch Engine helps to find such words using the word sketch tool or the collocation search. Read more about further tools for text analysis.
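As a rough illustration of counting how often two terms occur alongside each other in a certain order, the sketch below counts ordered token pairs within a small window. The function name `ordered_cooccurrences`, the `window` parameter and the sample sentence are assumptions made for this example. It is not how the word sketch or the collocation search is implemented.

```python
from collections import Counter


def ordered_cooccurrences(tokens, window=1):
    """Count ordered pairs (left, right) where `right` follows `left`
    within `window` tokens. Hypothetical helper for illustration only."""
    counts = Counter()
    for i, left in enumerate(tokens):
        # Look only to the right, so (a, b) and (b, a) are counted separately.
        for right in tokens[i + 1 : i + 1 + window]:
            counts[(left, right)] += 1
    return counts


# Invented sample text; a real corpus would be tokenized and lemmatized first.
tokens = "the black sheep and the black cat get on with the black sheep".split()
pairs = ordered_cooccurrences(tokens, window=1)

for pair, count in pairs.most_common(3):
    print(pair, count)
```

Because the order is preserved, `("black", "sheep")` is counted while `("sheep", "black")` is not. Raw counts like these are only a starting point, since frequent function words such as "the" dominate them.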
text analysis [ text-analysis ] text analysis (also content analysis or text analytics) is a method for analyzing (usually unstructured) text in order to extract information. The result of text analysis is structured data. In addition to the traditional tools, Sketch Engine also offers some unique features. The traditional tools consist of various frequency-based statistics:
- word or lemma frequency, part-of-speech frequency via the wordlist tool
- bigram, trigram, n-gram frequencies via the n-gram tool
- absolute frequencies, relative frequencies, document frequencies, average reduced frequency (ARF)
- phrase and multiword frequency via the concordance
- high-quality keyword and term extraction enhanced with reference data and suitable for topic modelling
- co-occurrence analysis using linguistic criteria via the word sketch
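The frequency-based statistics listed above can be sketched in a few lines. The example below computes absolute, relative and document frequencies as well as bigram counts for a toy corpus. The helper names `ngrams` and `frequency_stats`, the two sample documents and the choice to express relative frequency per million tokens are assumptions for this illustration. It does not reproduce the output of the wordlist or n-gram tools.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1)]


def frequency_stats(documents):
    """Compute basic frequency statistics for a list of tokenized documents.

    Returns (absolute, relative, doc_freq):
      absolute: number of occurrences of each word in the whole corpus
      relative: occurrences per million tokens (assumed normalization)
      doc_freq: number of documents containing each word
    """
    absolute = Counter()
    doc_freq = Counter()
    for doc in documents:
        absolute.update(doc)
        doc_freq.update(set(doc))  # each document counts once per word
    total = sum(absolute.values())
    relative = {word: count / total * 1_000_000 for word, count in absolute.items()}
    return absolute, relative, doc_freq


# Invented toy corpus of two already-tokenized documents.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat".split(),
]
absolute, relative, doc_freq = frequency_stats(docs)
bigrams = Counter(ng for doc in docs for ng in ngrams(doc, 2))

print(absolute["the"], doc_freq["the"], round(relative["the"]))
print(bigrams.most_common(2))
```

N-grams are collected per document here so that no bigram spans a document boundary.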
text mining [ text-analysis ] text mining is an automatic process of extracting information from text, such as the keywords of a text or its source(s). The corresponding tools in Sketch Engine are WebBootCaT, for creating corpora from the web, and keyword and term extraction, which finds terminology in your texts. Read about other text analysis tools.
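One simple way to extract keywords automatically is to compare how frequent each word is in the text of interest with how frequent it is in a reference text. The sketch below does this with a smoothed ratio of per-million frequencies. The function name `keyness_scores`, the `smoothing` constant and both sample texts are assumptions for this illustration. The scoring is a generic technique and should not be read as the documented behaviour of keyword and term extraction in Sketch Engine.

```python
from collections import Counter


def keyness_scores(focus_tokens, reference_tokens, smoothing=1.0):
    """Score each word of the focus text by a smoothed frequency ratio.

    score = (focus per-million + smoothing) / (reference per-million + smoothing)
    Hypothetical helper; the smoothing constant avoids division by zero
    for words absent from the reference text.
    """
    if not focus_tokens or not reference_tokens:
        raise ValueError("both token lists must be non-empty")
    focus = Counter(focus_tokens)
    reference = Counter(reference_tokens)
    scores = {}
    for word, count in focus.items():
        fpm_focus = count / len(focus_tokens) * 1_000_000
        fpm_reference = reference[word] / len(reference_tokens) * 1_000_000
        scores[word] = (fpm_focus + smoothing) / (fpm_reference + smoothing)
    return scores


# Invented sample texts standing in for a focus corpus and a reference corpus.
focus = "corpus query corpus tool the the".split()
reference = "the cat and the dog saw the tool".split()

scores = keyness_scores(focus, reference)
top = sorted(scores, key=scores.get, reverse=True)
print(top[:2])
```

Words that are typical of the focus text and rare in the reference text receive scores well above 1, while common words such as "the" fall below 1.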
web mining [ text-analysis ] web mining is an application of data mining which extracts information from texts, focusing on gaining information and metadata from the web. For this task, Sketch Engine uses the fully automated tool WebBootCaT for creating corpora from the web, which also stores the metadata of the processed websites. Read about other text analysis tools.