If you omit duplicates, you will distort the base rate of each individual object. If the training data is a representative sample of the real world, you do not want this, because you would effectively be training in a slightly different world (one with different base rates).
To clarify this point, consider a scenario with only two distinct objects. Your source data contains 99 instances of object A and 1 instance of object B. After removing duplicates, you have 1 instance of A and 1 instance of B. A classifier trained on the deduplicated data will differ significantly from one trained on the source data.
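A minimal sketch of the effect, using a hypothetical toy dataset that mirrors the 99-to-1 example above (the labels and helper function here are illustrative, not from any particular library):

```python
from collections import Counter

# Hypothetical toy dataset: 99 copies of object A, 1 copy of object B.
data = ["A"] * 99 + ["B"]

def class_priors(samples):
    """Estimate each class's base rate as its relative frequency."""
    counts = Counter(samples)
    total = len(samples)
    return {label: count / total for label, count in counts.items()}

# Base rates a classifier would learn from the raw data: A=0.99, B=0.01.
print(class_priors(data))

# After deduplication the learned base rates become A=0.5, B=0.5 --
# a very different world from the one the data was sampled from.
print(class_priors(sorted(set(data))))
```

Any classifier that relies on class frequencies (naive Bayes being the clearest case) will inherit this 0.99 → 0.5 shift directly in its predictions.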
My advice is to leave duplicates in the data.