ALC: Arabic Learner Corpus
The Arabic Learner Corpus (ALC) is a language corpus made up of written and spoken texts produced by learners of Arabic in Saudi Arabia. All texts were collected in 2012–2013 and comprise 282,732 words from 942 students of 67 nationalities.
See more on the project site: http://www.arabiclearnercorpus.com/about-the-corpus-en
Texts were POS tagged using the Stanford parser with the following POS tagset description.
Tools to work with the Arabic Learner Corpus
A complete set of Sketch Engine tools is available to work with the Arabic Learner Corpus.
Alrabiah, M., Al-Salman, A., & Atwell, E. S. (2013). The design and construction of the 50 million words KSUCCA. In Proceedings of WACL’2 Second Workshop on Arabic Corpus Linguistics (pp. 5-8). The University of Leeds.
Search the corpus of classical Arabic
Sketch Engine offers a range of tools to work with the KSUCCA corpus.
Use Sketch Engine in minutes
Generating collocations, frequency lists, examples in contexts, n-grams or extracting terms is easy with Sketch Engine. Use our Quick Start Guide to learn it in minutes.