Library recommendations for pixel-level analysis of real-time television (TV) video

[Note: this is a rewrite of an earlier question that was deemed inappropriate and closed.]

I need to do some pixel-level analysis of television (TV) video. The exact nature of the analysis is not relevant here, but basically it involves examining each pixel of every frame of a TV video, starting from the MPEG-2 transport stream. The host platform will be a multiprocessor, 64-bit Linux server.

I need a library that can handle decoding a transport stream and present me with image data in real time. OpenCV and FFmpeg are the two libraries I am considering for this work. OpenCV is attractive because I hear it has easy-to-use APIs and support for advanced image analysis, but I have no experience using it. I have used FFmpeg in the past to extract frame data from video files for analysis, but it lacks image-analysis support (although Intel IPP could complement it).

In addition to general guidance on approaches to this problem (excluding the actual image analysis), I have a few more specific questions that will help me get started:

  • Are FFmpeg or OpenCV commonly used in industry as a basis for real-time video analysis, or is there something else I should look at?
  • Can OpenCV decode video frames in real time and still leave enough CPU to perform non-trivial image analysis, also in real time?
  • Is it sufficient to use FFmpeg to decode the MPEG-2 transport stream, or is it preferable to use an MPEG-2 decoding library directly (and if so, which one)?
  • Are there particular pixel formats for the output frames that FFmpeg or OpenCV is especially efficient at producing (e.g. RGB, YUV, or YUV422)?
2 answers

1.
I would recommend OpenCV for real-time image analysis. I assume that by real time you mean the ability to keep up with the TV frame rate (for example, NTSC at 29.97 frames per second or PAL at 25 frames per second). Of course, as mentioned in the comments, it certainly depends on the hardware you have available, as well as the image size: SD (480p) vs. HD (720p or 1080p). FFmpeg definitely has its quirks, but you would be hard-pressed to find a better free alternative. Its power and flexibility are quite impressive; I am sure that is one of the reasons the OpenCV developers chose it as the backend for video decoding/encoding in OpenCV.
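
As a rough sketch of what the decode loop looks like (assuming an OpenCV build with FFmpeg support; the stream path and the per-pixel pass here are placeholders, not part of any real API beyond VideoCapture itself):

// Minimal sketch: decode a stream with OpenCV and touch every pixel.
// "stream.ts" is a placeholder for your transport stream or URL.
#include <opencv2/opencv.hpp>
#include <cstdio>

int main() {
    cv::VideoCapture cap("stream.ts");   // file, URL, or device index
    if (!cap.isOpened()) {
        std::fprintf(stderr, "failed to open stream\n");
        return 1;
    }
    cv::Mat frame;
    while (cap.read(frame)) {            // frames arrive as 8-bit BGR
        long sum = 0;                    // placeholder per-pixel pass;
        for (int y = 0; y < frame.rows; ++y) {   // real analysis goes here
            const cv::Vec3b* row = frame.ptr<cv::Vec3b>(y);
            for (int x = 0; x < frame.cols; ++x)
                sum += row[x][0] + row[x][1] + row[x][2];
        }
        (void)sum;
    }
    return 0;
}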

2.
I have not seen high-latency problems when using OpenCV for decoding. How much latency can your system tolerate? If you need more performance, consider using separate threads for capture/decoding and for image analysis. Since you mention that multiprocessor systems are available, this should make better use of your processing capability. I definitely recommend the latest Intel Core i7 architecture (or perhaps the Xeon equivalent), as that will give you the best performance available today.
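
Here is one way that capture/analysis split could look (a sketch only; the stream path, the queue depth, and analyze() are all placeholders):

// One thread decodes frames, the other analyzes them; the queue is
// bounded so a slow analysis stage drops frames instead of piling up.
#include <opencv2/opencv.hpp>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

std::queue<cv::Mat> q;
std::mutex m;
std::condition_variable ready;
bool done = false;

void capture_loop() {
    cv::VideoCapture cap("stream.ts");
    cv::Mat frame;
    while (cap.read(frame)) {
        std::lock_guard<std::mutex> lock(m);
        if (q.size() < 4)              // bounded: drop when falling behind
            q.push(frame.clone());     // clone: capture reuses its buffer
        ready.notify_one();
    }
    { std::lock_guard<std::mutex> lock(m); done = true; }
    ready.notify_one();
}

void analyze(const cv::Mat& frame) { /* image-analysis placeholder */ }

void analysis_loop() {
    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        ready.wait(lock, [] { return !q.empty() || done; });
        if (q.empty() && done) return;
        cv::Mat frame = q.front(); q.pop();
        lock.unlock();                 // analyze outside the lock
        analyze(frame);
    }
}

int main() {
    std::thread producer(capture_loop), consumer(analysis_loop);
    producer.join();
    consumer.join();
    return 0;
}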

I have used OpenCV on several embedded systems, so I am well acquainted with your desire for peak performance. I have found many times that it was unnecessary to process a full-resolution frame (especially when trying to determine masks). I would strongly recommend downsampling the images if you can still adequately process the resulting video stream. This can sometimes give you an instant 4-8x speedup (depending on your downsampling factor). Also on the performance front, I definitely recommend using Intel IPP. Since OpenCV was originally an Intel project, IPP and OpenCV work very well together.
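
For example, a couple of pyrDown passes shrink the per-frame work dramatically (a sketch; whether two passes are acceptable depends entirely on your analysis):

// Halving each dimension with cv::pyrDown cuts the pixel count 4x;
// two passes cut it 16x.
#include <opencv2/opencv.hpp>

cv::Mat downsample_for_analysis(const cv::Mat& frame) {
    cv::Mat half, quarter;
    cv::pyrDown(frame, half);      // blur + 2x decimation in each axis
    cv::pyrDown(half, quarter);    // optional second pass
    return quarter;                // ~1/16 the pixels of the original
}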

Finally, since image processing is one of those "embarrassingly parallel" problem domains, don't forget the possibility of using GPUs as a hardware accelerator for your problems if needed. OpenCV has been doing a lot of work in this area lately, so those tools should be available to you if necessary.
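
A tiny sketch of the GPU path (assuming an OpenCV build with CUDA support; the module was cv::gpu in the older 2.x releases and cv::cuda from 3.x on):

// Offload one conversion step to the GPU: upload, process, download.
#include <opencv2/opencv.hpp>
#include <opencv2/cudaimgproc.hpp>

void gpu_pass(const cv::Mat& frame, cv::Mat& out) {
    cv::cuda::GpuMat d_frame, d_gray;
    d_frame.upload(frame);                          // host -> device
    cv::cuda::cvtColor(d_frame, d_gray, cv::COLOR_BGR2GRAY);
    d_gray.download(out);                           // device -> host
}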

3.
I think FFmpeg would be a good starting point; most of the alternatives I can think of (HandBrake, MEncoder, etc.) tend to use FFmpeg as a backend, but it looks like you could probably roll your own with Intel IPP's video-coding samples if you wanted.

4.
OpenCV's internal color representation is BGR; you will need something like cvtColor if you want a different format. If you want to see the list of pixel formats FFmpeg supports, you can run

ffmpeg -pix_fmts 

to see what it can input and output.
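
For completeness, a small sketch of the conversion step on the OpenCV side (cvtColor and the COLOR_* constants are the standard API; the wrapper name is just illustrative):

// Convert a decoded BGR frame into other representations.
#include <opencv2/opencv.hpp>

void color_examples(const cv::Mat& bgr) {
    cv::Mat yuv, gray;
    cv::cvtColor(bgr, yuv,  cv::COLOR_BGR2YUV);   // packed YUV
    cv::cvtColor(bgr, gray, cv::COLOR_BGR2GRAY);  // luma only
}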


Answering only the 4th question:

Video streams are encoded in 4:2:2 YUV formats (YUV, YUV422, YCbCr, etc.). Converting them to BGR and back again (for re-encoding) consumes a lot of CPU time. So if you can write your algorithms to run on YUV data directly, you get an instant performance gain.

Note 1: Although OpenCV natively works with BGR images, you can make it handle YUV with some care and some knowledge of its internals.

For example, if you want to detect people in a video, just take the upper half of the decoded video buffer (for planar YUV422 that is the Y plane, which is effectively a grayscale image of the frame) and process that.
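
As a sketch of that trick, assuming you hold a pointer to a planar YUV422 buffer (the function name and buffer layout here are illustrative):

// Wrap the Y plane of a planar YUV422 buffer as a grayscale cv::Mat
// without copying or color conversion. `buf` is assumed to point at
// width*height*2 bytes of planar YUV422 data; the Mat is only a view,
// so the buffer must outlive it.
#include <opencv2/opencv.hpp>

cv::Mat luma_view(unsigned char* buf, int width, int height) {
    // The first width*height bytes are the Y (luma) plane.
    return cv::Mat(height, width, CV_8UC1, buf);  // no copy, just a view
}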

Note 2: If you want to access the YUV image data, you must use the FFmpeg API directly in your application; OpenCV's VideoCapture API forcibly converts from YUV to BGR.
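
A minimal sketch of that route with libavformat/libavcodec (the send/receive decode API from FFmpeg 3.1+; "stream.ts" is a placeholder and error handling is stripped down):

// Decode with the FFmpeg libraries directly so the YUV planes are
// available before any BGR conversion happens.
extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
}

int main() {
    AVFormatContext* fmt = nullptr;
    if (avformat_open_input(&fmt, "stream.ts", nullptr, nullptr) < 0)
        return 1;
    avformat_find_stream_info(fmt, nullptr);

    int vid = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0);
    const AVCodec* dec =
        avcodec_find_decoder(fmt->streams[vid]->codecpar->codec_id);
    AVCodecContext* ctx = avcodec_alloc_context3(dec);
    avcodec_parameters_to_context(ctx, fmt->streams[vid]->codecpar);
    avcodec_open2(ctx, dec, nullptr);

    AVPacket* pkt = av_packet_alloc();
    AVFrame* frame = av_frame_alloc();
    while (av_read_frame(fmt, pkt) >= 0) {
        if (pkt->stream_index == vid && avcodec_send_packet(ctx, pkt) >= 0) {
            while (avcodec_receive_frame(ctx, frame) >= 0) {
                // For planar YUV outputs (e.g. yuv420p from MPEG-2),
                // frame->data[0] is the Y plane (stride frame->linesize[0]);
                // data[1]/data[2] are the chroma planes. Analyze here.
            }
        }
        av_packet_unref(pkt);
    }
    av_frame_free(&frame);
    av_packet_free(&pkt);
    avcodec_free_context(&ctx);
    avformat_close_input(&fmt);
    return 0;
}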

