Data up to September 2017 have been added to the Timestamped JSI web corpora. The corpora in all 18 languages are updated monthly, and smaller parts of the corpora are updated daily.

The English corpus grows by about 1 billion words each month and has reached 25 billion words. The Arabic, German, French, Italian, Portuguese, Russian, and Spanish timestamped corpora each contain over 1 billion words.

These are excellent general-purpose corpora with the added value of timestamps. The timestamps made it possible to implement the Trends feature, which automatically identifies new words and words whose usage changes over time (trending words).
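
To illustrate how timestamps make trend detection possible, here is a minimal sketch in Python. It is not Sketch Engine's implementation: the function names, the toy monthly counts, and the choice of a plain least-squares slope over per-million frequencies are all assumptions made for illustration only.

```python
from typing import Dict, List


def relative_frequencies(counts: List[int], totals: List[int]) -> List[float]:
    """Convert raw monthly counts to frequencies per million tokens."""
    return [1_000_000 * c / t for c, t in zip(counts, totals)]


def slope(values: List[float]) -> float:
    """Least-squares slope of the values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two time slices to measure a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def trending_words(
    word_counts: Dict[str, List[int]], totals: List[int], top: int = 3
) -> List[str]:
    """Rank words by how steeply their relative frequency rises over time."""
    scored = {
        word: slope(relative_frequencies(counts, totals))
        for word, counts in word_counts.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:top]


# Hypothetical toy data: four monthly slices of one million tokens each.
monthly_totals = [1_000_000] * 4
monthly_counts = {
    "blockchain": [1, 2, 4, 8],    # rising use
    "weather": [50, 50, 50, 50],   # stable use
    "fax": [9, 6, 4, 2],           # falling use
}

ranking = trending_words(monthly_counts, monthly_totals)
```

With this toy data the rising word is ranked first and the falling word last. Normalising by the size of each monthly slice matters because the corpora grow at different rates from month to month, so raw counts alone would not be comparable.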


All 18 corpora are available to both trial and paying subscribers.
