Susanne corpus: a subset of the Brown Corpus

The Susanne corpus is a subset of the Brown Corpus of American English. It is annotated with a dedicated annotation scheme that represents all aspects of English grammar which are definite enough to be annotated formally.

Detailed information is available at https://www.grsampson.net/SueDoc.html

Part-of-speech tagset

The Susanne corpus was tagged by TreeTagger using the Susanne part-of-speech (POS) tagset.

Tools to work with the Susanne corpus

A complete set of tools is available for working with this English corpus. The tools generate:

  • word sketch – English collocations categorized by grammatical relations
  • thesaurus – synonyms and similar words for every word
  • word lists – lists of English nouns, verbs, adjectives etc. organized by frequency
  • n-grams – frequency lists of multi-word units
  • concordance – examples in context
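
To make three of the outputs above concrete (word lists, n-grams and concordance), here is a minimal Python sketch. It is an illustration only and not how Sketch Engine computes them: the function names, the toy sentence and the context width are all assumptions made for this example, and the text is not taken from the Susanne corpus.

```python
from collections import Counter


def ngram_frequencies(tokens, n=2):
    """Count every run of n consecutive tokens (hypothetical helper)."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)


def concordance(tokens, keyword, width=3):
    """Return (left context, hit, right context) for each match of keyword."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Toy sentence standing in for corpus text (an assumption, not Susanne data).
tokens = "the cat sat on the mat and the cat slept".split()

word_list = Counter(tokens)                  # word list ordered by frequency
bigrams = ngram_frequencies(tokens, n=2)     # n-gram frequency list
lines = concordance(tokens, "cat", width=2)  # keyword in context

print(word_list.most_common(3))
print(bigrams.most_common(3))
for left, hit, right in lines:
    print(f"{left:>10} [{hit}] {right}")
```

A real corpus tool also relies on tokenization, lemmatization and POS tags, which this sketch leaves out by splitting on whitespace.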

Bibliography

Geoffrey Sampson (Ed.). 1995. Susanne Corpus. School of Cognitive & Computing Sciences, University of Sussex.

Search the Susanne corpus

Sketch Engine offers a range of tools to work with this English corpus.

Other text corpora

Sketch Engine offers 400+ language corpora.

Use Sketch Engine in minutes

Generate collocations, frequency lists, examples in context and n-grams, or extract terms. Use our Quick Start Guide to learn how in minutes.