To get the speed of an object, you need to do two things: first, detect the object in each image (and condense it to a centroid, as you suggested); second, associate the detections across images, so you know which detection in one frame corresponds to which detection in the next. Once you have that, speed follows from the simple equation speed = distance / time, where distance is how far the centroid moved between two frames and time is the interval between them.
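As a minimal sketch of that last step, assuming you already have one centroid per frame and know the time between frames: the function name `speed` and the `scale` parameter (real-world units per pixel) are my own illustrative choices, not part of any library.

```python
import math

def speed(c0, c1, dt, scale=1.0):
    """Speed of a centroid that moved from c0 to c1 (both (x, y) in pixels)
    over dt seconds.

    scale is a hypothetical calibration factor (e.g. metres per pixel);
    with the default of 1.0 the result is in pixels per second.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    # Euclidean distance between the two centroids, converted to real units
    distance = math.hypot(c1[0] - c0[0], c1[1] - c0[1]) * scale
    return distance / dt

# Centroid moves from (10, 20) to (13, 24) between frames taken 0.5 s apart
print(speed((10, 20), (13, 24), dt=0.5))  # 10.0 (pixels per second)
```

Without a calibration factor you only get pixels per second; turning that into a real-world speed depends on your camera setup.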
The association is trivial if you only ever detect one object in each image (just assume every detection is the same object), although this assumption can break down in the real world.
Finding your object is where I think you are having difficulty. If it really is as simple as a single white object on a solid black background, then finding the centroid is straightforward: just average the coordinates of all the white pixels. If the image is noisy, you first need to do some cleanup, such as morphological opening and closing, to remove small noise specks.
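Here is a rough pure-Python sketch of that idea, assuming the image is already thresholded into a 2D list of 0/1 values. The helper names (`erode`, `dilate`, `opening`, `centroid`) and the fixed 3x3 neighbourhood are my own choices for illustration. Only opening (erode, then dilate) is shown, since that is the operation that removes small specks; closing is the same two steps in the opposite order and fills small holes.

```python
def erode(img):
    """3x3 binary erosion: a pixel stays 1 only if its full 3x3
    neighbourhood is 1. Border pixels are treated as background."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out

def dilate(img):
    """3x3 binary dilation: a pixel becomes 1 if any pixel in its
    3x3 neighbourhood (clipped at the image edge) is 1."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w and img[yy][xx]:
                        out[y][x] = 1
    return out

def opening(img):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(img))

def centroid(img):
    """Average (x, y) of all non-zero pixels, or None if there are none."""
    pts = [(x, y) for y, row in enumerate(img) for x, v in enumerate(row) if v]
    if not pts:
        return None
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

# 7x7 test image: a 3x3 white blob centred at (3, 3) plus one noise pixel
img = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        img[y][x] = 1
img[0][6] = 1  # isolated noise pixel at x=6, y=0

print(centroid(img))           # (3.3, 2.7), pulled towards the noise pixel
print(centroid(opening(img)))  # (3.0, 3.0), noise removed by the opening
```

On real images you would normally use an image-processing library rather than hand-written loops (OpenCV, for example, provides `cv2.morphologyEx` and `cv2.moments`), but the logic is the same.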