Data uploaded to Sketch Engine, including personal information, has always been safe. Now this security is officially confirmed: Lexical Computing has been awarded the ISO27001 certification.
Author Archive for: michal
About Michal Cukr
This author has yet to write their bio. Meanwhile, let's just say that we are proud Michal Cukr contributed a whopping 53 entries.
Entries by Michal Cukr
The ninth month of the Sketch Engine calendar 2018 arrives with an example of using character classes in regular expressions. These are useful when you need to find something independently of the language, e.g. words starting with capital letters.
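To illustrate why a character class helps here, the following is a minimal Python sketch and not the calendar's own CQL example. Python's standard re module has no POSIX-style class such as [[:upper:]], so the sketch approximates one by tokenising with \w+ and checking the first character with str.isupper(). The sample sentence and variable names are made up for illustration.

```python
import re

# Hypothetical sample sentence with capitalised words from several alphabets.
text = "In Brno, Émile met Åsa and Ščerba near the old Žižkov tower."

# A hard-coded ASCII range only finds capitals from the English alphabet.
ascii_only = re.findall(r"\b[A-Z]\w*", text)

# Language-independent approximation: split into word tokens
# (\w is Unicode-aware for str patterns) and keep those whose
# first character is an uppercase letter in any script.
words = re.findall(r"\w+", text)
capitalised = [w for w in words if w[0].isupper()]

print(ascii_only)
print(capitalised)
```

The range [A-Z] misses words such as Émile or Žižkov, while the class-based check finds them, which is the point of a language-independent class.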
The Sketch Engine calendar for August 2018 brings instructions on how to label particular tokens using regular expressions. Labelling tokens in your queries lets you add multiple conditions to one or more tokens.
In July 2018, our calendar explains how to find all tokens consisting of alphanumeric characters (any combination of the letters a-z in either case and the digits 0-9) using the regular expression \w.
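As a rough illustration of \w outside Sketch Engine, here is a small Python sketch. It is an assumption-laden analogue, not the calendar's CQL query: the token list is invented, and in Python \w also matches the underscore and, unless re.ASCII is set, letters and digits from other scripts.

```python
import re

# Hypothetical tokens, as a tokeniser might produce them.
tokens = ["corpus", "2018", "enTenTen", "co-occur", "...", "B2B", "naïve"]

# re.ASCII limits \w to [a-zA-Z0-9_]; the + repeats it over the token.
alnum = re.compile(r"\w+", re.ASCII)

# fullmatch() succeeds only if the entire token is alphanumeric.
matches = [t for t in tokens if alnum.fullmatch(t)]

print(matches)
```

Dropping the re.ASCII flag would make "naïve" match as well, because \w is Unicode-aware by default for Python string patterns.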
We have improved our tools for processing corpora in Brazilian Portuguese and European Portuguese. The tools can now recognise words from both varieties of Portuguese. We have also improved the word sketches for Portuguese.
Check out the next page of our Sketch Engine calendar 2018. This time, you will learn how to search inside a corpus structure using the Corpus Query Language (CQL) operator within.
Are you looking for good German examples in context? Do you need German collocations or a German thesaurus for your work? Our tool deSkELL, a free simplified interface to Sketch Engine, is the right choice for these types of tasks. Try it at https://deskell.sketchengine.co.uk/
Check out our new 15-billion-word English corpus (enTenTen), made up of texts collected from the web up to the end of 2015.
This blog post defines what POS tags are, explains manual and automatic tagging and points readers to Sketch Engine where they can have their texts tagged automatically in many languages. What is a POS tag? A POS tag (or part-of-speech tag) is a special label assigned to each token (word) in a text corpus to indicate […]
Are you ready for April Fools’ Day this year? How about some April Fools’ Day CQL? Download the April page from our calendar with an example of a punctuation search. The example does work! No joke. Sketch Engine is a serious tool after all.
After improving our tools for processing Danish, we are releasing a new version of the Danish corpus from the web. The texts in this 2-billion-word Danish corpus were downloaded in December 2017.
We are pleased to inform you that the list of Sketch Engine corpora has been extended with a new Belarusian corpus, a 63-million-word corpus of texts collected from the web.
As we did last year, we are making the Sketch Engine calendar with useful CQL examples available online. Please download the page for March with handy examples of using an optional character and repetitions.
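For readers new to these operators, here is a brief Python sketch of an optional character (?) and repetitions ({n} and {n,}). It only illustrates the general regular-expression behaviour; the word list and pattern names are invented, and the calendar's own CQL examples may differ.

```python
import re

# Hypothetical word list to test the patterns against.
words = ["color", "colour", "colouur", "2018", "15", "hm", "hmm", "hmmmm"]

optional_u = re.compile(r"colou?r")   # ? makes the preceding "u" optional
four_digits = re.compile(r"\d{4}")    # {4} requires exactly four digits
repeated_m = re.compile(r"hm{2,}")    # {2,} requires two or more "m"

def whole_word_matches(pattern, items):
    """Return the items that the pattern matches from start to end."""
    return [w for w in items if pattern.fullmatch(w)]

print(whole_word_matches(optional_u, words))
print(whole_word_matches(four_digits, words))
print(whole_word_matches(repeated_m, words))
```

Using fullmatch() mirrors the way a pattern inside a CQL attribute value has to cover the whole token rather than just part of it.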
Find more and better collocations in French. We have improved our collocation search (the word sketch feature), which automatically identifies collocations and patterns specific to French.
A new 25-million-word Amharic corpus has been added to Sketch Engine.