Cundeelee Wangka Stories parallel corpus

Cundeelee Wangka Stories is a Cundeelee Wangka – English parallel corpus made up of texts provided by the Goldfields Aboriginal Language Centre in Kalgoorlie, Australia. The corpus was created for the Lexicom 2018 workshop. Cundeelee Wangka is an Australian Aboriginal language.

Part-of-speech tagset

The Cundeelee Wangka part of the corpus was tagged using the Cundeelee Wangka POS tagset.

The English part of the corpus was tagged by TreeTagger using the Penn Treebank tagset with Sketch Engine modifications.

Access policy

To get access, please contact Sue Hanson <susanhanson@y7mail.com> from the Goldfields Aboriginal Language Centre in Kalgoorlie, Australia, and provide a brief description of the purpose of your work. If your request is accepted, please contact us at support@sketchengine.eu, including the confirmation from Sue Hanson and your username, so that we can grant you access to this corpus.

Tools to work with the Cundeelee Wangka Stories corpus

A complete set of tools is available to work with this Cundeelee Wangka – English parallel corpus to generate:

keywords – terminology extraction of one-word and multi-word units
word lists – lists of nouns, verbs, adjectives etc. organized by frequency
n-grams – frequency lists of multi-word units
concordance – examples in context
text type analysis – statistics of metadata in the corpus

Changelog

initial version (April 2018)

created corpus
part-of-speech tagged by researchers from Goldfields Aboriginal Language Centre

