Creating video with images using ffmpeg libav and libx264?

Question

I am trying to create a video from images using the ffmpeg library. The images are 1920x1080 and are supposed to be encoded with H.264 in an .mkv container. I have run into various problems along the way, each time thinking I was getting closer to a solution, but this time I am really stuck. With the settings I use, the first X frames of my video (around 40, depending on how many images I use for the video) are not encoded. avcodec_encode_video2 does not return an error (the return value is 0), but got_packet_ptr is 0. The result is a video that largely looks as expected, but is strangely jumpy for the first few seconds.

So this is how I create a video file:

    // m_codecContext is an instance variable of type AVCodecContext *
    // m_formatCtx is an instance variable of type AVFormatContext *
    // outputFileName is a valid filename ending with .mkv
    AVOutputFormat *oformat = av_guess_format(NULL, outputFileName, NULL);
    if (oformat == NULL)
    {
        oformat = av_guess_format("mpeg", NULL, NULL);
    }

    // oformat->video_codec is AV_CODEC_ID_H264
    AVCodec *codec = avcodec_find_encoder(oformat->video_codec);

    m_codecContext = avcodec_alloc_context3(codec);
    m_codecContext->codec_id = oformat->video_codec;
    m_codecContext->codec_type = AVMEDIA_TYPE_VIDEO;
    m_codecContext->gop_size = 30;
    m_codecContext->bit_rate = width * height * 4;
    m_codecContext->width = width;
    m_codecContext->height = height;
    m_codecContext->time_base = (AVRational){1,frameRate};
    m_codecContext->max_b_frames = 1;
    m_codecContext->pix_fmt = AV_PIX_FMT_YUV420P;

    m_formatCtx = avformat_alloc_context();
    m_formatCtx->oformat = oformat;
    m_formatCtx->video_codec_id = oformat->video_codec;

    snprintf(m_formatCtx->filename, sizeof(m_formatCtx->filename), "%s", outputFileName);

    AVStream *videoStream = avformat_new_stream(m_formatCtx, codec);
    if(!videoStream)
    {
        printf("Could not allocate stream\n");
    }
    videoStream->codec = m_codecContext;

    if(m_formatCtx->oformat->flags & AVFMT_GLOBALHEADER)
    {
        m_codecContext->flags |= CODEC_FLAG_GLOBAL_HEADER;
    }

    avcodec_open2(m_codecContext, codec, NULL);
    avio_open(&m_formatCtx->pb, outputFileName.toStdString().c_str(), AVIO_FLAG_WRITE);
    avformat_write_header(m_formatCtx, NULL);

This is how the frames are added:

    void VideoCreator::writeImageToVideo(const QSharedPointer<QImage> &img, int frameIndex)
    {
        AVFrame *frame = avcodec_alloc_frame();

        /* alloc image and output buffer */
        int size = m_codecContext->width * m_codecContext->height;
        int numBytes = avpicture_get_size(m_codecContext->pix_fmt, m_codecContext->width, m_codecContext->height);

        uint8_t *outbuf = (uint8_t *)malloc(numBytes);
        uint8_t *picture_buf = (uint8_t *)av_malloc(numBytes);

        int ret = av_image_fill_arrays(frame->data, frame->linesize, picture_buf, m_codecContext->pix_fmt, m_codecContext->width, m_codecContext->height, 1);

        frame->data[0] = picture_buf;
        frame->data[1] = frame->data[0] + size;
        frame->data[2] = frame->data[1] + size/4;
        frame->linesize[0] = m_codecContext->width;
        frame->linesize[1] = m_codecContext->width/2;
        frame->linesize[2] = m_codecContext->width/2;

        fflush(stdout);

        for (int y = 0; y < m_codecContext->height; y++)
        {
            for (int x = 0; x < m_codecContext->width; x++)
            {
                unsigned char b = img->bits()[(y * m_codecContext->width + x) * 4 + 0];
                unsigned char g = img->bits()[(y * m_codecContext->width + x) * 4 + 1];
                unsigned char r = img->bits()[(y * m_codecContext->width + x) * 4 + 2];

                unsigned char Y = (0.257 * r) + (0.504 * g) + (0.098 * b) + 16;

                frame->data[0][y * frame->linesize[0] + x] = Y;

                if (y % 2 == 0 && x % 2 == 0)
                {
                    unsigned char V = (0.439 * r) - (0.368 * g) - (0.071 * b) + 128;
                    unsigned char U = -(0.148 * r) - (0.291 * g) + (0.439 * b) + 128;

                    frame->data[1][y/2 * frame->linesize[1] + x/2] = U;
                    frame->data[2][y/2 * frame->linesize[2] + x/2] = V;
                }
            }
        }

        int pts = frameIndex;//(1.0 / 30.0) * 90.0 * frameIndex;

        frame->pts = pts;//av_rescale_q(m_codecContext->coded_frame->pts, m_codecContext->time_base, formatCtx->streams[0]->time_base); //(1.0 / 30.0) * 90.0 * frameIndex;

        int got_packet_ptr;
        AVPacket packet;
        av_init_packet(&packet);
        packet.data = outbuf;
        packet.size = numBytes;
        packet.stream_index = formatCtx->streams[0]->index;
        packet.flags |= AV_PKT_FLAG_KEY;
        packet.pts = packet.dts = pts;
        m_codecContext->coded_frame->pts = pts;

        ret = avcodec_encode_video2(m_codecContext, &packet, frame, &got_packet_ptr);
        if (got_packet_ptr != 0)
        {
            m_codecContext->coded_frame->pts = pts;  // Set the time stamp

            if (m_codecContext->coded_frame->pts != (0x8000000000000000LL))
            {
                pts = av_rescale_q(m_codecContext->coded_frame->pts, m_codecContext->time_base, formatCtx->streams[0]->time_base);
            }
            packet.pts = pts;
            if(m_codecContext->coded_frame->key_frame)
            {
                packet.flags |= AV_PKT_FLAG_KEY;
            }

            std::cout << "pts: " << packet.pts << ", dts: " << packet.dts << std::endl;

            av_interleaved_write_frame(formatCtx, &packet);
            av_free_packet(&packet);
        }

        free(picture_buf);
        free(outbuf);
        av_free(frame);
        printf("\n");
    }

And this is the cleanup:

    int numBytes = avpicture_get_size(m_codecContext->pix_fmt, m_codecContext->width, m_codecContext->height);
    int got_packet_ptr = 1;

    int ret;
    //        for(; got_packet_ptr != 0; i++)
    while (got_packet_ptr)
    {
        uint8_t *outbuf = (uint8_t *)malloc(numBytes);

        AVPacket packet;
        av_init_packet(&packet);
        packet.data = outbuf;
        packet.size = numBytes;

        ret = avcodec_encode_video2(m_codecContext, &packet, NULL, &got_packet_ptr);
        if (got_packet_ptr)
        {
            av_interleaved_write_frame(m_formatCtx, &packet);
        }

        av_free_packet(&packet);
        free(outbuf);
    }

    av_write_trailer(formatCtx);

    avcodec_close(m_codecContext);
    av_free(m_codecContext);
    printf("\n");

I assume the problem is tied to the PTS and DTS values, but I have tried EVERYTHING. Using the frame index seems to make the most sense. The images themselves are correct; I can save them to files without any problem. I am running out of ideas. I would be incredibly grateful if someone who knows this better than I do could help ...
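To make the timestamp arithmetic concrete, here is a minimal sketch of what rescaling a frame index from the codec time base (1/frameRate) to a stream time base amounts to. Everything in it is illustrative: `Rational`, `rescaleTimestamp` and `framePtsInStreamUnits` are hypothetical names, the function only mirrors the rounding arithmetic of av_rescale_q for non-negative values, and the 1/1000 stream time base used in the usage note is an assumption about what the Matroska muxer selects, not something taken from the code above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for AVRational; this is not the libav type.
struct Rational {
    int num;
    int den;
};

// Illustrative version of the arithmetic behind av_rescale_q for
// non-negative timestamps: ts * from / to, rounded to the nearest integer.
// A sketch only: no overflow handling and no negative values.
std::int64_t rescaleTimestamp(std::int64_t ts, Rational from, Rational to) {
    assert(ts >= 0);
    assert(from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    const std::int64_t numerator = static_cast<std::int64_t>(from.num) * to.den;
    const std::int64_t denominator = static_cast<std::int64_t>(from.den) * to.num;
    return (ts * numerator + denominator / 2) / denominator;
}

// Converts a frame index, interpreted as a pts in units of 1/frameRate,
// into the units of the given stream time base.
std::int64_t framePtsInStreamUnits(int frameIndex, int frameRate, Rational streamTimeBase) {
    return rescaleTimestamp(frameIndex, Rational{1, frameRate}, streamTimeBase);
}
```

With a frame rate of 30 and an assumed stream time base of 1/1000, frame indices 0, 1, 2, 3 and 30 map to 0, 33, 67, 100 and 1000 respectively. The frame index is therefore a reasonable pts in the codec time base, as long as the packet timestamps are rescaled before they are written.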

Cheers, marikaner

UPDATE:

In case it helps, this is the output at the end of the video encoding:

    [libx264 @ 0x7fffc00028a0] frame I:19 Avg QP:14.24 size:312420
    [libx264 @ 0x7fffc00028a0] frame P:280 Avg QP:19.16 size:148867
    [libx264 @ 0x7fffc00028a0] frame B:181 Avg QP:21.31 size: 40540
    [libx264 @ 0x7fffc00028a0] consecutive B-frames: 24.6% 75.4%
    [libx264 @ 0x7fffc00028a0] mb I I16..4: 30.9% 45.5% 23.7%
    [libx264 @ 0x7fffc00028a0] mb P I16..4: 4.7% 9.1% 4.5% P16..4: 23.5% 16.6% 12.6% 0.0% 0.0% skip:28.9%
    [libx264 @ 0x7fffc00028a0] mb B I16..4: 0.6% 0.5% 0.3% B16..8: 26.7% 11.0% 5.5% direct: 3.9% skip:51.5% L0:39.4% L1:45.0% BI:15.6%
    [libx264 @ 0x7fffc00028a0] final ratefactor: 19.21
    [libx264 @ 0x7fffc00028a0] 8x8 transform intra:48.2% inter:47.3%
    [libx264 @ 0x7fffc00028a0] coded y,uvDC,uvAC intra: 54.9% 53.1% 30.4% inter: 25.4% 13.5% 4.2%
    [libx264 @ 0x7fffc00028a0] i16 v,h,dc,p: 41% 29% 11% 19%
    [libx264 @ 0x7fffc00028a0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 16% 26% 31% 3% 4% 3% 7% 3% 6%
    [libx264 @ 0x7fffc00028a0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 30% 26% 14% 4% 5% 4% 7% 4% 7%
    [libx264 @ 0x7fffc00028a0] i8c dc,h,v,p: 58% 26% 13% 3%
    [libx264 @ 0x7fffc00028a0] Weighted P-Frames: Y:17.1% UV:3.6%
    [libx264 @ 0x7fffc00028a0] ref P L0: 63.1% 21.4% 11.4% 4.1% 0.1%
    [libx264 @ 0x7fffc00028a0] ref B L0: 85.7% 14.3%
    [libx264 @ 0x7fffc00028a0] kb/s:27478.30

+6

ffmpeg video-encoding libav h.264 libx264

marikaner Jul 23 '13 at 17:03

1 answer

Hrishikesh_Pardeshi · Answer 1 · 2013-07-24T12:10:36+0000

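Libav is probably delaying the processing of your input frames: the encoder buffers a number of frames before it starts returning packets, which is why the first calls succeed with got_packet_ptr = 0. It is good practice to check for delayed frames after all input frames have been submitted. This is done as follows: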

    int i = NUMBER_OF_FRAMES_PREVIOUSLY_ENCODED;
    for (; got_packet_ptr; i++)
        ret = avcodec_encode_video2(m_codecContext, &packet, NULL, &got_packet_ptr);
    // Write the packets to a container after this.

The point is to pass a NULL pointer in place of the frame to be encoded, and to keep doing so until no more packets are returned. See this link for sample code - the part under “get delayed frames”.
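The effect of encoder delay can be illustrated without libav at all. The following is a toy sketch, not the libav API: `ToyEncoder`, its `encode` method and the fixed delay are all hypothetical, and a "packet" is just the index of the frame it came from. A null frame pointer plays the role of the NULL frame in the flush call.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy stand-in for an encoder with internal delay (hypothetical, not libav).
class ToyEncoder {
public:
    explicit ToyEncoder(std::size_t delay) : delay_(delay) {}

    // frame == nullptr means "no more input, drain what is buffered".
    // Returns true and fills *packet when an output is available,
    // mirroring the role of got_packet_ptr.
    bool encode(const int* frame, int* packet) {
        assert(packet != nullptr);
        if (frame != nullptr) {
            pending_.push_back(*frame);
            if (pending_.size() <= delay_) return false;  // still buffering
        } else if (pending_.empty()) {
            return false;  // fully drained
        }
        *packet = pending_.front();
        pending_.pop_front();
        return true;
    }

private:
    std::size_t delay_;
    std::deque<int> pending_;
};

// Feeds frameCount frames and optionally drains the encoder afterwards.
// Returns the frame indices that were actually "written".
std::vector<int> encodeAll(int frameCount, std::size_t delay, bool flush) {
    ToyEncoder enc(delay);
    std::vector<int> written;
    int packet = 0;
    for (int i = 0; i < frameCount; ++i) {
        if (enc.encode(&i, &packet)) written.push_back(packet);
    }
    if (flush) {
        while (enc.encode(nullptr, &packet)) written.push_back(packet);
    }
    return written;
}
```

With 100 frames and a delay of 40, the first 40 calls return nothing, and only 60 packets are written unless the encoder is drained. Note also that a packet returned while frame i is being submitted belongs to an earlier frame, so in a real encoder its timestamps should come from the packet itself rather than from the index of the frame just passed in.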

A simpler way would be to set the number of B-frames to 0.

 m_codecContext->max_b_frames = 0;

Let me know if this works well.

Also, you have not used the libx264 API directly at all. You can use the libx264 API to encode video; it has a simpler and more understandable syntax. In addition, it offers more control over the settings and better performance.

You will still have to use the libav libraries to write the video stream to the mkv container, though.
