Corpus of British newspaper articles on Islam

This English corpus consists of texts collected from British newspaper articles on the topic of Islam. It was prepared by Dr Costas Gabrielatos in 2010 and contains about 140 million words. The texts can be used to analyse how tabloids and broadsheets differ in their approach to Islam and Muslims.

Part-of-speech tagset

This English corpus of British news articles on Islam was tagged with the Penn Treebank tagset.


To gain access to this corpus, please contact Dr Costas Gabrielatos at Edge Hill University (previously at Lancaster University).

Corpus in detail

The chart shows the distribution of the parts of speech in the corpus of British news articles referring to Islam.

Basic information

Tokens 170,397,870
Words 140,922,329
Sentences 6,637,826
Web pages 1,568

Distribution of publication years of articles

Tools to work with the English corpora

A complete set of Sketch Engine tools is available for working with this English corpus to generate:

  • word sketch – English collocations categorized by grammatical relations
  • thesaurus – synonyms and similar words for every word
  • word lists – lists of English nouns, verbs, adjectives etc. organized by frequency
  • n-grams – frequency list of multi-word units
  • concordance – examples in context
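To make two of the outputs listed above more concrete, here is a minimal Python sketch of an n-gram frequency list and a keyword-in-context concordance. It is only an illustration of the concepts on a made-up sentence, not Sketch Engine's implementation or API; the function names `ngram_frequencies` and `concordance`, the `width` parameter, and the sample text are all hypothetical.

```python
from collections import Counter


def ngram_frequencies(tokens, n):
    """Count every run of n consecutive tokens (illustrative helper)."""
    # zip over n shifted copies of the token list yields each n-gram once
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def concordance(tokens, keyword, width=3):
    """Return keyword-in-context lines with up to `width` tokens per side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Made-up sample text, split on whitespace (a real corpus uses a tokenizer)
sample = "the mosque opened its doors and the mosque welcomed visitors".split()

bigrams = ngram_frequencies(sample, 2)
kwic = concordance(sample, "mosque", width=2)

for gram, count in bigrams.most_common(3):
    print(count, gram)
for line in kwic:
    print(line)
```

On a corpus of this size such counts would be computed over indexed data rather than an in-memory list; the sketch only shows what the resulting lists contain.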

Baker, P. (2010) Representations of Islam in British broadsheet and tabloid newspapers 1999-2005. In Language and Politics, 9(2), pp. 310–338.

Search this English corpus

Sketch Engine offers a range of tools to work with the English corpus of British newspaper articles on Islam.


Other text corpora in Sketch Engine

Sketch Engine offers 500+ language corpora.

Use Sketch Engine in minutes

Generating collocations, frequency lists, examples in context and n-grams, or extracting terms, is easy with Sketch Engine. Use our Quick Start Guide to learn it in minutes.