Gujarati web corpus (guWaC)
guWaC (web as corpus) is a corpus of the Gujarati language (Indo-Aryan…
Patakis corpus
Patakis is a 100 million word collection of POS-tagged texts…
Georgian Web 2013 (kaWaC) corpus
Original file owner: bharat.
FinnishWaC corpus
Finnish web as corpus.
danishWaC corpus
The corpus was prepared using the Corpus Factory method. It has 288 million…
Domain Specific Corpora
These corpora are prepared from specific domains, e.g. science,…
e-flux corpus
The e-flux corpus is a web corpus of English art news digests.…
Nineteenth century corpus
The 19th century corpus is only available to Osnabrück…