UKWaC tagged with the SuperSenseTagger (sst-light), described in Ciaramita and Altun (2006).
Token attributes include Penn Treebank part-of-speech tags, SuperSenseTagger WordNet supersense labels, and named entity labels. The vertical file also contains named entity and multiword expression (MWE) structures derived from the sst-light output.
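As a rough illustration of how the attributes and structures described above might be read from a vertical file, the following Python sketch parses tab-separated token lines and tracks which structures enclose each token. The column order (word, Penn Treebank tag, supersense label), the structure names `s` and `ne`, the `type` attribute, and the sample labels are all assumptions made for this example. They are not taken from the actual corpus configuration, so check them against the real vertical before relying on this.

```python
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class Token(NamedTuple):
    word: str
    pos: str          # Penn Treebank tag (assumed to be column 2)
    supersense: str   # sst-light label (assumed to be column 3)
    open_structs: Tuple[str, ...]  # structures enclosing this token


def parse_vertical(lines: Iterable[str]) -> Iterator[Token]:
    """Yield tokens from vertical-format lines, tracking open structures.

    Structure lines look like XML tags and contain no tab; token lines
    hold tab-separated attribute values. Column order is hypothetical.
    """
    stack: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue
        is_tag = line.startswith("<") and line.endswith(">") and "\t" not in line
        if is_tag and line.startswith("</"):
            name = line[2:-1].strip()
            # Close the innermost open structure with this name, if any.
            for i in range(len(stack) - 1, -1, -1):
                if stack[i] == name:
                    del stack[i]
                    break
            continue
        if is_tag:
            name = line[1:-1].split()[0].rstrip("/")
            if not line.endswith("/>"):  # self-closing tags enclose nothing
                stack.append(name)
            continue
        cols = line.split("\t")
        cols += [""] * (3 - len(cols))  # pad short lines
        yield Token(cols[0], cols[1], cols[2], tuple(stack))


# Hypothetical sample; labels and structure names are illustrative only.
sample = [
    "<s>",
    '<ne type="LOC">',
    "New\tNNP\tB-noun.location",
    "York\tNNP\tI-noun.location",
    "</ne>",
    "is\tVBZ\tB-verb.stative",
    "</s>",
]
tokens = list(parse_vertical(sample))
```

Keeping the enclosing structures on each token makes it straightforward to filter, for example, all tokens that fall inside a named entity structure.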
The Sketch Grammar is undergoing research and development.
v1.0 (March 2012)
- 370,023,634 tokens
Diana McCarthy, Adam Kilgarriff, Miloš Jakubíček and Siva Reddy. Semantic Word Sketches. In 8th International Corpus Linguistics Conference (CL 2015).
Massimiliano Ciaramita and Yasemin Altun. Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2006.