Corpus of Hebrew translation texts

The Hebrew translation corpus, also known as the Hebrew Comparable Corpus, is a language corpus made up of two components: translated and non-translated Hebrew texts. Each component contains about fifteen books (fiction and non-fiction). The two components are matched for topic and genre; for example, each contains one biography. The corpus is best suited to studying differences between translated and non-translated language, but it can also be used to study language use more generally.

The corpus was compiled as part of a project funded by the Israel Science Foundation and carried out in the Department of Translation and Interpreting Studies at Bar-Ilan University.

Part-of-speech tagset

The Hebrew translation corpus is tagged with the YAP part-of-speech tagset.

Tools to work with the Hebrew Translation Corpus

A complete set of Sketch Engine tools is available for working with the Hebrew Translation Corpus. They can be used to generate:

  • keywords – terminology extraction of one-word units
  • word lists – lists of Hebrew nouns, verbs, adjectives, etc., organized by frequency
  • n-grams – frequency lists of multi-word units
  • concordance – examples in context
  • text type analysis – statistics of metadata in the corpus
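
To make the outputs listed above more concrete, here is a minimal Python sketch of what a word list, an n-gram frequency list and a concordance contain. It is not Sketch Engine's implementation or API. The function names `ngram_frequencies` and `concordance`, the `window` parameter and the toy English sample are all assumptions made for illustration; a real run would use tokenized Hebrew text from the corpus.

```python
from collections import Counter


def ngram_frequencies(tokens, n):
    """Count every contiguous sequence of n tokens (hypothetical helper)."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)


def concordance(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for each occurrence."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines


# Toy sample standing in for tokenized corpus text (an assumption).
sample = "the cat sat on the mat and the cat slept".split()

word_list = Counter(sample).most_common()  # word list ordered by frequency
bigrams = ngram_frequencies(sample, 2)     # frequencies of two-word units
hits = concordance(sample, "cat")          # each "cat" with its context
```

Running the same functions separately on the translated and the non-translated component would give two sets of counts that can be compared side by side.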

Search the Hebrew Translation Corpus

Sketch Engine offers a range of tools to work with the Hebrew Translation Corpus.

Other text corpora in Sketch Engine

Sketch Engine offers 800+ language corpora.

Use Sketch Engine in minutes

Generating collocations, frequency lists, examples in context, and n-grams, or extracting terms, is easy with Sketch Engine. Use our Quick Start Guide to learn how in minutes.