I am new to image segmentation, but I need it to build a dataset for a machine learning classifier.
Essentially, I have a video similar to this image:

My task is to detect the cows in the foreground, or at least any single cow. I understand that occlusion is a problem, but for a start I would like to correctly segment an isolated cow, such as the one with the red rectangle (hand-drawn) around it.
In simpler tasks like this, I separate the object from the background by thresholding each pixel, so that it becomes either (0,0,0) for the object or (255,255,255) for the background:

Then I group neighbouring pixels with the same value into classes and draw a bounding rectangle around each sufficiently large "blob".
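To make the simple approach concrete, here is a minimal sketch of what I mean, written in plain Python on a tiny grayscale grid. The function names (`threshold`, `bounding_boxes`), the threshold value, the 4-connectivity, and the `min_area` cutoff are all assumptions for illustration, not my actual code; on real frames one would normally use an image library instead of nested lists.

```python
from collections import deque

def threshold(gray, t):
    """Binary mask: True where a pixel is darker than t (treated as object)."""
    return [[px < t for px in row] for row in gray]

def bounding_boxes(mask, min_area=4):
    """Group 4-connected True pixels into blobs and return one
    (x_min, y_min, x_max, y_max) box per blob with >= min_area pixels."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from this unvisited object pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:  # drop small blobs (noise)
                boxes.append((x0, y0, x1, y1))
    return boxes

# Toy image: a dark 2x2 "object" and a single dark noise pixel on a bright background.
gray = [
    [200, 200, 200, 200, 200, 200],
    [200,  30,  40, 200, 200, 200],
    [200,  35,  20, 200, 200,  10],
    [200, 200, 200, 200, 200, 200],
]
boxes = bounding_boxes(threshold(gray, 128), min_area=4)
print(boxes)
```

This only works when a single global threshold separates object from background, which is exactly the assumption that fails on my cow footage.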
For the image above this approach will not work, since the objects and the background look similar and there are a lot of shadows, side lighting, etc., so I am not sure how to approach it. Any suggestions are welcome.