Well, with OpenCV you would grab a frame from the video file and run your calculations on it.
There are several ways to detect a character in that image, but making the detection flexible is not so easy. For example, finding the person when he is lying on the floor is hard if all you supplied were reference images of the character standing.
Basically, you can try to extract all the important features from your set of reference images and use a (in your case supervised) learning algorithm that derives a good feature vector of the character for classification.
Then you need to write code that plays the video and grabs a frame, say every 500 ms (or some other interval, as you like), segments the object that might be your character, and compares it with the reference values you got from your learning algorithm. If there is a match, your code can scream "Yehaaawww!" or do something else ...
But it all depends on how flexible you need to be. You could also try template matching or cross-correlation, which basically slides the reference image(s) over the frame and checks whether the two parts are equal. Unfortunately, this is very sensitive to rotation, deformation, or other noise ... so you won't find the person if he is lying down. And I doubt that you can do all these calculations in real time ...
Basically: yes, OpenCV is a good fit for your image processing and computer vision tasks. But it offers many tools and approaches, and you will need to find one that works for your images ... this is not a trivial task, though ...
Hope this helps ...