Arabic corpus of the Quran
The Quran annotated corpus is an Arabic corpus built from the Quran, the central religious text of Islam. This version of the corpus was prepared by Zainab Alqassem (Alqassem 2013). The data were taken from the Quranic Arabic Corpus (Dukes 2009) and the QurAna anaphoric coreference database (Sharaf and Atwell 2012). The corpus texts were lemmatized and POS-tagged.
The morphological annotation in the corpus uses a POS tagset created specifically for the Quranic Arabic Corpus.
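As a rough illustration of what lemmatized and POS-tagged text makes possible, the sketch below parses annotated tokens and counts them by lemma and by tag. It assumes a one-token-per-line, tab-separated layout (word, tag, lemma). That layout, the function names, and the placeholder tokens and tags are assumptions made for this example, not the corpus's actual export format or tagset.

```python
from collections import Counter

# Hypothetical sample: one token per line as word<TAB>tag<TAB>lemma.
# The words, tags and lemmas are placeholders, not real corpus data.
SAMPLE = "w1\tN\tlemma_a\nw2\tV\tlemma_b\nw3\tN\tlemma_a\n"


def parse_tokens(text):
    """Turn tab-separated annotation lines into a list of token dicts."""
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        word, tag, lemma = line.split("\t")
        tokens.append({"word": word, "tag": tag, "lemma": lemma})
    return tokens


def frequency_list(tokens, attribute):
    """Count tokens by one attribute, most frequent first."""
    return Counter(tok[attribute] for tok in tokens).most_common()


tokens = parse_tokens(SAMPLE)
lemma_freq = frequency_list(tokens, "lemma")
tag_freq = frequency_list(tokens, "tag")
```

Counting by lemma rather than by surface word groups inflected forms together, which is the main practical benefit of lemmatization for a morphologically rich language such as Arabic.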
Tools to work with the Quranic Arabic corpus
A complete set of tools is available to work with this Arabic corpus.
version 1 (7th May 2013)
- initial version
The Quran annotated corpus
Alqassem, Z. (2013). Unifying Quranic analyses into a single database. BSc Final Year Project Dissertation, School of Computing, University of Leeds.
The Quranic Arabic Corpus
Dukes, K. (2009). The Quranic Arabic Corpus. Leaman, Oliver.
QurAna: Corpus of the Quran annotated with Pronominal Anaphora
Sharaf, A. B. M., & Atwell, E. (2012). QurAna: Corpus of the Quran annotated with Pronominal Anaphora. In LREC (pp. 130-137).
Search the Quran corpus with annotation
Sketch Engine offers a range of tools to work with this Quranic Arabic Corpus.
Use Sketch Engine in minutes
Generate collocations, frequency lists, examples in context, or n-grams. Use our Quick Start Guide to learn it in minutes.
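For readers unfamiliar with n-grams, the following minimal sketch shows the underlying idea: slide a window of n tokens over a sequence and count each window. It is a generic illustration, not Sketch Engine's implementation; the function name `ngrams` and the placeholder tokens are assumptions made for the example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Placeholder token sequence, not real corpus data.
tokens = ["a", "b", "a", "b", "c"]
bigram_counts = Counter(ngrams(tokens, 2))
```

The same function works on lemmas or POS tags instead of word forms, so the counts can be computed over whichever annotation layer is of interest.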