BLaRC: British Law Report Corpus

The British Law Report Corpus is a 6-million-word English corpus of judicial decisions. Explore legal English with this corpus.

The British Law Report Corpus (BLaRC) is a British English corpus made up of judicial decisions issued by British courts and tribunals. The corpus consists of 8.5 million words of legal texts published in 2008–2010.

Part-of-speech tagset

The BLaRC corpus is tagged with the English Penn Treebank tagset.

Tools to work with the British Law Report Corpus

A complete set of tools is available for working with the BLaRC corpus of legal documents. The tools generate:

  • word sketch – English collocations categorized by grammatical relations
  • thesaurus – synonyms and similar words for every word
  • keywords – terminology extraction of one-word and multi-word units
  • word lists – lists of English nouns, verbs, adjectives etc. organized by frequency
  • n-grams – frequency list of multi-word units
  • concordance – examples in context
  • text type analysis – statistics of metadata in the corpus
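
To make three of the outputs listed above more concrete (word lists, n-grams and concordance), here is a minimal Python sketch. It is an illustration only and not how Sketch Engine implements these tools: the tokenizer is deliberately naive, the function names are hypothetical, and the sample sentence is invented rather than taken from BLaRC.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive, for illustration)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence of keyword."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Invented sample text (an assumption for this sketch, not BLaRC data).
sample = ("The court of appeal dismissed the appeal. "
          "The court of appeal allowed the claim.")

tokens = tokenize(sample)
word_list = Counter(tokens).most_common(3)   # word list ordered by frequency
bigrams = ngram_counts(tokens, 2)            # frequency list of 2-word units
trigrams = ngram_counts(tokens, 3)           # frequency list of 3-word units
lines = concordance(tokens, "appeal", width=2)  # keyword shown in context

for left, kw, right in lines:
    print(f"{left:>15} [{kw}] {right}")
```

A real corpus tool adds lemmatization, part-of-speech filtering and indexing on top of this, which is what makes the same queries practical on millions of words.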

MARÍN PÉREZ, Mª José; REA RIZZO, Camino. Structure and design of the British Law Report Corpus (BLRC): a legal corpus of judicial decisions from the UK. Journal of English Studies, 2012, 10: 131-145.

Search the BLaRC corpus

Sketch Engine offers a range of tools to work with the British Law Report Corpus.