The Europarl parallel corpus
The Europarl corpus is a parallel corpus created from the European Parliament Proceedings in the official languages of the EU. It includes 21 European languages: Romance (French, Italian, Spanish, Portuguese, Romanian), Germanic (English, Dutch, German, Danish, Swedish), Slavic (Bulgarian, Czech, Polish, Slovak, Slovene), Finno-Ugric (Finnish, Hungarian, Estonian), Baltic (Latvian, Lithuanian), and Greek. The corpus was expanded repeatedly, reaching a final size of around 60 million words per language. Texts are from the period January 2007 – November 2011.
Most languages of the Europarl corpus were processed with the TreeTagger tool, so lemmas and part-of-speech tags are available in those corpora.
Corpus data and more information can be found on the official website http://www.statmt.org/europarl/
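For readers who download the data and want to work with it outside Sketch Engine, the following is a minimal sketch of iterating over sentence pairs. It assumes the sentence-aligned form of the data, in which line n of one plain-text file is the translation of line n of the other; check the official website for the actual layout. The function name aligned_pairs and the sample sentences are hypothetical and used only for illustration.

```python
from typing import Iterable, Iterator, Tuple


def aligned_pairs(
    source_lines: Iterable[str], target_lines: Iterable[str]
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) sentence pairs from two line-aligned inputs.

    Assumption: both inputs have the same number of lines and line n in one
    corresponds to line n in the other. Pairs with an empty side are skipped.
    """
    for src, tgt in zip(source_lines, target_lines):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:
            yield src, tgt


# In-memory stand-ins for two line-aligned files (illustrative data only).
english = ["Resumption of the session\n", "\n", "Thank you.\n"]
french = ["Reprise de la session\n", "\n", "Merci.\n"]

for en_sentence, fr_sentence in aligned_pairs(english, french):
    print(en_sentence, "|||", fr_sentence)
```

With real files, the two lists would be replaced by file objects opened with encoding="utf-8"; because the function consumes its inputs lazily, large files do not need to be loaded into memory.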
Tools to work with Europarl corpora
A complete set of Sketch Engine tools is available to work with the Europarl corpora to generate:
- word sketch – collocations categorized by grammatical relations (requires POS tagging)
- thesaurus – synonyms and similar words for every word (requires POS tagging)
- word lists – lists of nouns, verbs, adjectives etc. organized by frequency
- n-grams – frequency lists of multi-word units
- concordance – examples in context
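To make two of the outputs listed above more concrete, here is a rough sketch of what an n-gram frequency list and a concordance compute. It is not how Sketch Engine implements these tools; the tokenizer is deliberately naive, and the names tokenize, ngram_counts and concordance are hypothetical helpers introduced for this example.

```python
from collections import Counter
from typing import List, Tuple

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text: str) -> List[str]:
    """Naive tokenizer: split on whitespace, strip punctuation, lowercase."""
    tokens = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [token for token in tokens if token]


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count every contiguous sequence of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def concordance(tokens: List[str], keyword: str, width: int = 3) -> List[Tuple[str, str, str]]:
    """Return (left context, keyword, right context) for each hit of keyword."""
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


# Illustrative text only, not taken from the corpus.
sample = "The Parliament adopted the resolution. The Parliament rejected the amendment."
tokens = tokenize(sample)

print(ngram_counts(tokens, 2).most_common(1))  # [(('the', 'parliament'), 2)]
print(concordance(tokens, "adopted", width=2))
```

Word sketches and the thesaurus additionally rely on part-of-speech tags and grammatical relations, which this sketch does not attempt to model.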
version 7 TreeTagger (spring 2015)
- corpus tagged by TreeTagger
version 7.0 (May 2012)
- A further expanded and improved version of the corpus was released on 15th May 2012.
version 5.0 (May 2010)
- An expanded and improved version of the earlier corpus was released on 20th January 2010.
Koehn, P. (2005, September). Europarl: A parallel corpus for statistical machine translation. In MT summit (Vol. 5, pp. 79-86).