ThaiWaC corpus
The corpus was prepared using the Corpus Factory method. Full details…
UKWaCsst corpus
UKWaC tagged with SuperSenseTagger (sst-light) described in…
Gujarati web corpus (guWaC)
guWaC is a web corpus of the Gujarati language (Indo-Aryan…
Patakis corpus
Patakis is a 100 million word collection of POS-tagged texts…
FinnishWaC corpus
Finnish web as corpus.
Domain Specific Corpora
These corpora are prepared from specific domains, e.g. science,…
e-flux corpus
The e-flux corpus is a web corpus of English art news digests.…




