CAEC: Cambridge Academic English Corpus
The Cambridge Academic English Corpus (CAEC) is an Academic English corpus made up of a sample of written and spoken academic language at undergraduate and post-graduate level, collected from a range of US and UK institutions. The texts in this Academic English corpus comprise lectures, seminars, student presentations, journals, essays and textbooks.
This Academic English corpus was tagged by TreeTagger using the Penn Treebank tagset with Sketch Engine modifications.
Tools to work with the Cambridge Academic English Corpus
A complete set of tools is available to work with this Academic English corpus to generate:
- word sketch – English collocations categorized by grammatical relations
- thesaurus – synonyms and similar words for every word
- word lists – lists of English nouns, verbs, adjectives etc. organized by frequency
- n-grams – frequency list of multi-word units
- concordance – examples in context
- keywords – terminology extraction of one-word units
- text type analysis – statistics of metadata in the corpus
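To make a few of the outputs listed above more concrete, here is a minimal Python sketch of a word list, an n-gram frequency list and a concordance. It is an illustration only and not how Sketch Engine computes these results: the tokenizer is deliberately naive, the sample sentence is invented rather than taken from the CAEC, and the function names (`tokenize`, `ngram_counts`, `concordance`) are hypothetical.

```python
from collections import Counter

PUNCTUATION = ".,;:!?()\"'"


def tokenize(text):
    """Naive tokenizer (an assumption): split on whitespace, strip punctuation, lowercase."""
    stripped = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [word for word in stripped if word]


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each hit of keyword."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Invented sample text, not CAEC data.
sample = ("The results suggest that the results of the study are robust. "
          "The study is small.")

tokens = tokenize(sample)
word_list = Counter(tokens)                   # word list ordered by frequency
bigrams = ngram_counts(tokens, 2)             # n-grams with n = 2
kwic = concordance(tokens, "study", width=2)  # keyword in context

print(word_list.most_common(3))
print(bigrams.most_common(2))
for left, keyword, right in kwic:
    print(f"{left:>12} [{keyword}] {right}")
```

A real corpus tool works on lemmatized and part-of-speech tagged tokens, which is what allows results such as word lists restricted to nouns or verbs; this sketch operates on raw word forms only.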
Search the CAEC corpus
Sketch Engine offers a range of tools for working with the Cambridge Academic English Corpus.
Use Sketch Engine in minutes
Generate collocations, frequency lists, examples in context and n-grams, or extract terms. Use our Quick Start Guide to learn how in minutes.