The Spoken British National Corpus 2014 (Spoken BNC2014) is a corpus of contemporary spoken British English from the 21st century. It will form part of the BNC2014, which has not been published yet. The Spoken BNC2014 contains transcripts of recorded conversations, gathered from members of the UK public between 2012 and 2016. The conversations were recorded in informal settings and took place among friends and family members. The Spoken BNC2014 comprises 1,251 conversations involving 672 speakers, and its total size is 10.4 million words.
The corpus has rich metadata (processed into text types in Sketch Engine), which can be used to restrict the analysis or to calculate statistics.
The Spoken BNC2014 was developed as a collaboration between Lancaster University and Cambridge University Press. More information can be found on the official website, http://corpora.lancs.ac.uk/bnc2014/, and in Love, R., Dembry, C., Hardie, A., Brezina, V., & McEnery, T. (2017). The Spoken BNC2014. International Journal of Corpus Linguistics, 22(3), 319-344.
Unlike the corpus on the official website, the Spoken British National Corpus 2014 in Sketch Engine was tagged with TreeTagger, using the Penn Treebank tagset with Sketch Engine modifications.