H.264 bytestream parsing

The input is an array of bytes that represents an H.264 frame. The frame consists of a single slice (it is not a multi-slice frame).

So, as I understand it, I can treat this frame as one slice. The slice has a header, and the slice data consists of macroblocks, each with its own header.

So, I have to parse this array of bytes to extract the frame number, the frame type, and the quantization coefficient (as I understand it, each macroblock has its own coefficient, or am I mistaken?).

Can you advise me where I can find more detailed information on parsing the bytes of an H.264 frame?

(I actually read the standard, but it was not very specific, and I got lost.)

thanks

3 answers

The H.264 standard is a little hard to read, so here are some tips.

  • Read Annex B; make sure your input begins with a start code
  • Read section 9.1 (parsing of Exp-Golomb codes): you will need it for everything that follows
  • The slice header is described in section 7.3.3.
  • The "frame number" is not explicitly encoded in the slice header; frame_num is close to what you probably want.
  • The "frame type" probably corresponds to slice_type (the second value in the slice header, so it is the easiest to parse; you should definitely start with it)
  • "Quantization coefficient" - do you mean the "quantization parameter"? If so, be prepared to write a complete H.264 parser (or reuse an existing one). See section 9.3 (the CABAC parsing process) for an idea of the complexity of an H.264 parser.
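The first tips above can be sketched in a few lines of Python. This is only a minimal illustration, assuming an Annex B byte stream and a slice NAL unit of type 1 or 5; the names `split_nal_units`, `strip_emulation_prevention`, `BitReader` and `parse_slice_start` are made up for this example and do not come from any library. It stops after pic_parameter_set_id, because reading frame_num requires values from the SPS (for example log2_max_frame_num_minus4), which this sketch does not parse.

```python
SLICE_TYPE_NAMES = ("P", "B", "I", "SP", "SI")  # slice_type % 5 indexes this tuple


def split_nal_units(data: bytes):
    """Yield NAL units from an Annex B byte stream (start code 00 00 01)."""
    pos = data.find(b"\x00\x00\x01")
    while pos != -1:
        start = pos + 3
        nxt = data.find(b"\x00\x00\x01", start)
        end = len(data) if nxt == -1 else nxt
        # Trailing zero bytes belong to the next (4-byte) start code or to padding.
        nal = data[start:end].rstrip(b"\x00")
        if nal:
            yield nal
        pos = nxt


def strip_emulation_prevention(payload: bytes) -> bytes:
    """Drop the 0x03 byte that follows two zero bytes (00 00 03 -> 00 00)."""
    out = bytearray()
    zeros = 0
    for b in payload:
        if zeros >= 2 and b == 3:
            zeros = 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


class BitReader:
    """Reads bits most-significant-bit first, as the standard's syntax expects."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # position in bits

    def bit(self) -> int:
        byte = self.data[self.pos >> 3]
        value = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return value

    def bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            value = (value << 1) | self.bit()
        return value

    def ue(self) -> int:
        """Unsigned Exp-Golomb code, ue(v), as described in section 9.1."""
        leading_zeros = 0
        while self.bit() == 0:
            leading_zeros += 1
        return (1 << leading_zeros) - 1 + self.bits(leading_zeros)


def parse_slice_start(nal: bytes):
    """Return the first slice header fields, or None if this is not a slice."""
    nal_unit_type = nal[0] & 0x1F
    if nal_unit_type not in (1, 5):  # 1 = non-IDR slice, 5 = IDR slice
        return None
    reader = BitReader(strip_emulation_prevention(nal[1:]))
    first_mb_in_slice = reader.ue()
    slice_type = reader.ue()
    pic_parameter_set_id = reader.ue()
    return {
        "nal_unit_type": nal_unit_type,
        "first_mb_in_slice": first_mb_in_slice,
        "slice_type": slice_type,
        "slice_type_name": SLICE_TYPE_NAMES[slice_type % 5],
        "pic_parameter_set_id": pic_parameter_set_id,
    }
```

A typical use would be `for nal in split_nal_units(frame_bytes): print(parse_slice_start(nal))`, where `frame_bytes` is your input array; non-slice units such as SPS and PPS simply yield None.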

The standard is very difficult to read. You can try studying the source code of existing software that decodes H.264 video streams, for example ffmpeg with its C libraries (C99); the avcodec_decode_video2 function is one place to start. A complete working C example (open a file, get the H.264 stream, iterate over frames, dump data, get the color space, save frames as raw PPM images, etc.) was linked from this answer. Alternatively, there is a great book, "The H.264 Advanced Video Compression Standard", that explains the standard in "human language". Another option is to try the Elecard StreamEye Pro software (there is a trial version), which can give you an additional (visual) perspective.


In fact, it is much better and easier (this is just my opinion) to read the H.264 video coding documentation. ffmpeg is a very good library, but it contains a lot of optimized code. It is better to look at the reference implementation of the H.264 codec and the official documentation. http://iphome.hhi.de/suehring/tml/download/ is a link to the JM reference codec implementation. Try to separate the levels of the decoding process, starting with the transport layer, which carries NAL units (SPS, PPS, SEI, IDR, SLICE, etc.). Then you need to implement a VLC engine (basically, 0th-order Exp-Golomb codes). There is also a very complex and powerful entropy coder called CABAC (Context-Adaptive Binary Arithmetic Coding), which is a rather difficult task. The demultiplexing process (after unpacking the video data) is also complicated. You need to fully understand each of these modules. Good luck.
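As a small illustration of the first level mentioned above, the one-byte NAL unit header can be decoded as follows. This is a sketch only: `describe_nal_header` is a hypothetical helper name, and the table lists just the common nal_unit_type values from the standard rather than all of them.

```python
# Common nal_unit_type values; other values are reported as "other/reserved".
NAL_UNIT_TYPE_NAMES = {
    1: "non-IDR slice",
    2: "slice data partition A",
    3: "slice data partition B",
    4: "slice data partition C",
    5: "IDR slice",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
    10: "end of sequence",
    11: "end of stream",
    12: "filler data",
}


def describe_nal_header(header_byte: int) -> dict:
    """Split the first byte of a NAL unit into its three fields."""
    nal_unit_type = header_byte & 0x1F  # low 5 bits
    return {
        "forbidden_zero_bit": (header_byte >> 7) & 1,  # must be 0 in a valid stream
        "nal_ref_idc": (header_byte >> 5) & 3,  # 0 means not used for reference
        "nal_unit_type": nal_unit_type,
        "name": NAL_UNIT_TYPE_NAMES.get(nal_unit_type, "other/reserved"),
    }
```

For example, `describe_nal_header(0x67)` reports an SPS and `describe_nal_header(0x65)` an IDR slice, so a first pass over the stream can route each unit to the right parser before any Exp-Golomb or CABAC decoding is attempted.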


Source: https://habr.com/ru/post/1346420/

