As a rule of thumb, do not worry about the advanced settings and use the default settings. Only if the results do not produce the results you need, start looking into the advanced settings.
You can repeat the same procedure several times to enlarge the corpus. Sketch Engine will make sure no page, text or part of text is included twice (deduplication).
The allowlist keywords can be useful to avoid ambiguity of the seed words, i.e. you can make some of the unambiguous seed words compulsory to make sure the document matches the topic.
Denylist keywords can also be used to reduce ambiguity (e.g. you might use “party” when collecting a corpus on the environment using seeds which include “green”). It is only necessary to use the denylist and whitelists if you are getting irrelevant documents, otherwise it is not necessary.