deduplication

is a process in which repeated identical texts are removed so that only the first occurrence of each duplicated text is kept. Deduplication can be carried out at various levels, e.g. at the level of whole documents: when two documents are identical, one of them is removed in its entirety.
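As a minimal sketch of deduplication at the document level, the Python snippet below keeps the first occurrence of each text and drops later copies. It assumes that documents are plain strings and that "same" means an exact character-for-character match; the function name `deduplicate_documents` and the sample corpus are hypothetical and used only for illustration.

```python
def deduplicate_documents(documents):
    """Return the documents with exact duplicates removed.

    The first occurrence of each document is kept and the original
    order is preserved. Assumption: two documents are duplicates only
    if their texts are exactly equal.
    """
    seen = set()      # texts already encountered
    unique = []       # first occurrences, in original order
    for doc in documents:
        if doc not in seen:
            seen.add(doc)
            unique.append(doc)
    return unique


# Hypothetical sample corpus: the third document repeats the first.
corpus = ["A cat sat.", "A dog ran.", "A cat sat.", "A bird flew."]
print(deduplicate_documents(corpus))
# prints ['A cat sat.', 'A dog ran.', 'A bird flew.']
```

The same idea could be applied at other levels, for example to paragraphs or sentences, by passing those units to the function instead of whole documents.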