Australian Legislative Corpus 2023 (ALC23)

This corpus includes the available machine-readable acts, regulations, and rules from the nine major jurisdictions in Australia, covering relevant in-force legislation as at 30 June 2023. The ALC23 can be used as a whole or as a set of subcorpora. The subcorpora can be split by jurisdiction and/or by type of legislation (acts versus subordinate legislation).

This corpus is published in Sketch Engine thanks to Emma Genovese from the University of Technology in Sydney.

The ALC23 is available as an “ondemand” corpus. Look up the corpus in Sketch Engine, and you will be prompted to agree to the terms and conditions before you can use it.

Part-of-speech tagset and lemmatization

The English Web corpora are part-of-speech tagged with the following English Penn Treebank tagset summary (with Sketch Engine modifications) indicating the part of speech and grammatical category. The corpus texts are also lemmatized: each word form in the corpus is assigned to its base form (lemma).

Australian Legislative Corpus 2023 corpus sizes

Frequency
Tokens 186,983,925
Words 138,411,932
Sentences 2,428,829
Web pages 10,478
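
The counts above can be turned into a few derived figures, such as the average sentence length. The following is a minimal sketch in Python, not part of Sketch Engine itself: the names ALC23_SIZES and derived_stats are hypothetical and introduced only for illustration, and the numbers are copied from the table above. It relies on the fact that in Sketch Engine the token count includes punctuation and other non-word tokens, which is why it is higher than the word count.

```python
# Corpus sizes copied from the table above (ALC23).
ALC23_SIZES = {
    "tokens": 186_983_925,
    "words": 138_411_932,
    "sentences": 2_428_829,
    "web_pages": 10_478,
}


def derived_stats(sizes):
    """Return simple ratios computed from raw corpus counts.

    Hypothetical helper for illustration only; not a Sketch Engine API.
    """
    return {
        # Tokens include punctuation, so this ratio is above 1.
        "tokens_per_word": sizes["tokens"] / sizes["words"],
        # Average sentence length measured in words.
        "words_per_sentence": sizes["words"] / sizes["sentences"],
        # Average size of one source page measured in words.
        "words_per_web_page": sizes["words"] / sizes["web_pages"],
    }


if __name__ == "__main__":
    for name, value in derived_stats(ALC23_SIZES).items():
        print(f"{name}: {value:.2f}")
```

Long average sentences are to be expected in legislative text, so figures like these are best read as rough descriptive statistics rather than as exact linguistic measures.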

Search the English corpus Australian Legislative Corpus 2023

Sketch Engine offers a range of tools to work with this English corpus.

Subcorpora

As mentioned above, the corpus contains subcorpora, which are split by jurisdiction and/or by type of legislation (acts versus subordinate legislation).

Tools to work with the Australian Legislative Corpus 2023 from the web

A complete set of Sketch Engine tools is available to work with this English corpus to generate:

  • word sketch – English collocations categorized by grammatical relations
  • thesaurus – synonyms and similar words for every word
  • keywords – terminology extraction of one-word and multi-word units
  • word lists – lists of English nouns, verbs, adjectives etc. organized by frequency
  • n-grams – frequency list of multi-word units
  • concordance – examples in context
  • text type analysis – statistics of metadata in the corpus

Australian Legislative Corpus 2023

  • published in October 2024
  • available in Sketch Engine for paying users

Emma Genovese: Australian Legislative Corpus 2023

COPYRIGHT

© State of Victoria, Australia. Copyright in all legislation of the Parliament of the State of Victoria, Australia, is owned by the Crown in right of the State of Victoria, Australia.

For texts associated with the Australian Capital Territory, the ACT (through the ACT Parliamentary Counsel’s Office) will retain copyright.

For the Northern Territory, permission to publish or deal with any legislation of the Northern Territory is granted in certain circumstances, but may be revoked, varied, or withdrawn on reasonable notice.

For Tasmania, South Australia, Western Australia, Queensland, New South Wales and the Commonwealth, all content accessed on legislative websites is provided under the Creative Commons Attribution 4.0 International licence. The licence can be accessed via this link: https://creativecommons.org/licenses/by/4.0/. Changes made to any legislation include those made automatically by Sketch Engine’s programs when converting the legislation into readable .txt files.

DISCLAIMER

For texts associated with Victoria, this information may only be used for academic/research purposes. This product or service contains an unofficial version of the legislation of the Parliament of the State of Victoria. The State of Victoria accepts no responsibility for the accuracy and completeness of any legislation contained in this product or provided through this service. For the latest information on Victorian Government legislation please go to https://www.legislation.vic.gov.au/.

The NTCor23 contains an unofficial version of the legislation of the Parliament of the Northern Territory. For the latest information on Northern Territory Government legislation please go to https://legislation.nt.gov.au/.

The TasCor23 is sourced from the Tasmanian Legislation website at 29 June 2023. For the latest information on Tasmanian Government legislation please go to www.legislation.tas.gov.au.

The SACor23 is sourced from content from the South Australian Legislation website at 2 and 4 July 2023. For the latest information on South Australian Government legislation, please go to www.legislation.sa.gov.au/.

The WACor23 is sourced from content from the Western Australian Legislation website at 30 June 2023. For the latest information on Western Australian Government legislation, visit www.legislation.wa.gov.au.

The QLDCor23 is sourced from content from the Queensland Legislation website at 30 June 2023. For the latest information on Queensland Government legislation please go to https://www.legislation.qld.gov.au.

The NSWCor23 is sourced from content from the New South Wales Legislation website at 7 July 2023. For the latest information on New South Wales Government legislation please go to https://www.legislation.nsw.gov.au/.

The CthCor23 is sourced from content from the Federal Register of Legislation at 5–6 July 2023. For the latest information on Australian Government legislation please go to https://www.legislation.gov.au/.
