As a rule of thumb, leave the advanced settings alone and use the defaults. Look into the advanced settings only if the defaults do not produce the results you need.
You can repeat the same procedure several times to enlarge the corpus. Sketch Engine will make sure no page, text or part of text is included twice (deduplication).
Whitelist keywords can be useful for resolving ambiguity in the seed words: you can make some of the unambiguous seed words compulsory to ensure that each document matches the topic.
Blacklist keywords can also be used to reduce ambiguity (e.g. when collecting a corpus on the environment using seeds which include “green”, you might blacklist “party” to exclude pages about green political parties). Use the whitelist and blacklist only if you are getting irrelevant documents.
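To make the idea concrete, here is a minimal sketch of how whitelist and blacklist filtering could work in principle. It is not Sketch Engine's actual implementation: the function names, the `min_white` and `max_black` thresholds and the simple tokenizer are all assumptions made for illustration, and the real tool may count or weight matches differently.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphabetic words in the text (simplified)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keep_document(text, whitelist=(), blacklist=(), min_white=1, max_black=0):
    """Decide whether a downloaded page should stay in the corpus.

    Hypothetical rule: keep the page if it contains at least `min_white`
    distinct whitelist words (when a whitelist is given) and no more than
    `max_black` distinct blacklist words.
    """
    words = tokenize(text)
    white_hits = sum(1 for w in whitelist if w.lower() in words)
    black_hits = sum(1 for w in blacklist if w.lower() in words)
    if whitelist and white_hits < min_white:
        return False  # page never mentions a compulsory topic word
    if black_hits > max_black:
        return False  # page mentions an off-topic word
    return True


# Example: an environment corpus built from seeds that include "green".
pages = [
    "Green energy reduces pollution and carbon emissions.",
    "The green party won seats in the election.",
]
kept = [p for p in pages
        if keep_document(p, whitelist=["pollution", "emissions"],
                         blacklist=["party"])]
```

With these made-up inputs, only the first page is kept: the second one contains the blacklisted word “party” and none of the whitelisted words.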