The British Academic Spoken English (BASE) corpus is freely available to researchers who agree to the following conditions:
- Corpus holdings should not be reproduced in full for a wider audience/readership (i.e. for publication or for teaching purposes), although researchers are free to quote short passages of text up to 100 running words, with a total of 200 running words from any given assignment.
- No part of the corpus holdings should be reproduced in teaching materials intended for publication (in print or via the internet).
- The corpus developers should be informed of all presentations and publications arising from analysis of the corpus.
Researchers must acknowledge their use of the BASE corpus using the following form of words: The recordings and transcriptions used in this study come from the British Academic Spoken English (BASE) corpus. The corpus was developed at the Universities of Warwick and Reading under the directorship of Hilary Nesi and Paul Thompson. Corpus development was assisted by funding from BALEAP, EURALEX, the British Academy and the Arts and Humanities Research Council.
Files should be referred to by their letter and number codes, indicating disciplinary grouping (e.g. ah = arts and humanities), type of speech event (e.g. lct = lecture) and file number.
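As an illustration of the coding scheme, the sketch below splits a file code into its three parts. It assumes the parts are written together as a two-letter grouping, a three-letter event type and a run of digits (e.g. "ahlct001"). That layout, the example code and the names `parse_file_code` and `FileCode` are assumptions made for illustration, not part of the corpus documentation. Only the two abbreviations given above are expanded; any other code is returned unchanged.

```python
import re
from typing import NamedTuple

# Only the expansions stated in the conditions above are listed.
DISCIPLINES = {"ah": "arts and humanities"}
EVENT_TYPES = {"lct": "lecture"}

# Assumed layout: two letters (grouping), three letters (event type), digits.
_CODE = re.compile(r"([a-z]{2})([a-z]{3})(\d+)")


class FileCode(NamedTuple):
    discipline: str
    event_type: str
    number: int


def parse_file_code(code: str) -> FileCode:
    """Split a file code into disciplinary grouping, event type and number."""
    match = _CODE.fullmatch(code.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised file code: {code!r}")
    group, event, number = match.groups()
    # Unknown abbreviations are passed through rather than guessed at.
    return FileCode(
        DISCIPLINES.get(group, group),
        EVENT_TYPES.get(event, event),
        int(number),
    )
```

Under these assumptions, `parse_file_code("ahlct001")` gives the grouping "arts and humanities", the event type "lecture" and the file number 1.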
Nesi, H. and H. Basturkmen (2006) ‘Lexical bundles and discourse signalling in academic lectures’. International Journal of Corpus Linguistics 11(3), 147-168.
Thompson, P. (2006) ‘A corpus perspective on the lexis of lectures, with a focus on Economics lectures’. In K. Hyland and M. Bondi (eds) Academic Discourse Across Disciplines. Bern: Peter Lang, pp. 253-270.
Nesi, H. (2002) ‘An English spoken academic word list’. In A. Braasch and C. Povlsen (eds) Proceedings of the Tenth EURALEX International Congress. Copenhagen: Center for Sprogteknologi.
Nesi, H. (2001) ‘A corpus based analysis of academic lectures across disciplines’. In J. Cotterill and A. Ife (eds) Language Across Boundaries. London: Continuum Press.