Transcoding videos using LibAvFormat for playback on iOS devices

I am trying to transcode a video in my iOS application using FFMpeg / LibAv. What I want to do is transcode the video to resize each frame and possibly lower the bitrate, in order to save valuable MB on the device.

The resulting video should play on all iPhone 5+ devices.

After reading the documentation, I found out that:

  • I do not need to encode / decode the audio stream → I'll copy it as-is to the output file
  • I need to encode the video using the h264 codec (LibX264) with a profile supported by iOS (baseline profile, level 3.0 - https://trac.ffmpeg.org/wiki/Encode/H.264#Compatibility )
  • I also set the pixel format to planar YUV (YUV420P), since that is the only one iOS supports
  • For testing I am not using any filters at all (just a dummy / pass-through one) and not even trying to lower the bitrate yet; I'm just decoding the video stream and encoding it again
  • Most of the code is based on transcoding.c and filtering.c, available in the FFMpeg sample directory

This is the FFmpeg command line equivalent of what I am trying to achieve with LibAv:

ffmpeg -i INPUT.MOV -c:v libx264 -preset ultrafast -profile:v baseline -level 3.0  -c:a copy output.MOV

(the resulting file plays fine in QuickTime when it is generated by FFmpeg via the command line)

The original video was recorded on a regular iPhone running iOS 8.2, but the problem is not specific to that device or to iOS — it happens with all videos created using LibAv.

VideoLan (VLC) can play the file generated with LibAv, but QuickTime cannot.

Here is how I open the output file and create the output streams with avformat_new_stream:

AVStream *out_stream; // output stream
AVStream *in_stream; // input stream
AVCodecContext *dec_ctx, *enc_ctx; // codec context for the stream
AVCodec *encoder; // codec used
int ret;
unsigned int i;

ofmt_ctx = NULL;
// Allocate an AVFormatContext for an output format. This will hold the output file's header info (similar to avformat_open_input, but starting from zeroed memory)
avformat_alloc_output_context2(&ofmt_ctx, NULL, NULL, filename);
if (!ofmt_ctx) {
    av_log(NULL, AV_LOG_ERROR, "Could not create output context\n");
    [self errorWith:kErrorCreatingOutputContext and:@"Could not create output context"];
    return AVERROR_UNKNOWN;
}
// we must not use the AVCodecContext from the video stream directly! So we have to use avcodec_copy_context() to copy the context to a new location (after allocating memory for it, of course).
// iterate over all input streams
for (i = 0; i < ifmt_ctx->nb_streams; i++) {
    in_stream = ifmt_ctx->streams[i]; // input stream
    dec_ctx = in_stream->codec; // get the codec context for the decoder

    if (dec_ctx->codec_type == AVMEDIA_TYPE_VIDEO) {
        // let's use H.264
        encoder = avcodec_find_encoder(AV_CODEC_ID_H264);
        if (!encoder) {
            [self errorWith:kErrorCodecNotFound and:@"H264 Codec Not Found"];
            return AVERROR_UNKNOWN;
        }
        out_stream = avformat_new_stream(ofmt_ctx, encoder); // create a new stream with h264 codec
        if (!out_stream) {
            av_log(NULL, AV_LOG_ERROR, "Failed allocating output stream\n");
            [self errorWith:kErrorAllocateOutputStream and:@"Failed allocating output stream"];
            return AVERROR_UNKNOWN;
        }
        enc_ctx = out_stream->codec; // pointer to the stream codec context
        /* we transcode to same properties (picture size,
         * sample rate etc.). These properties can be changed for output
         * streams easily using filters */
        if (dec_ctx->codec_type == AVMEDIA_TYPE_VIDEO) {
            enc_ctx->width = dec_ctx->width;
            enc_ctx->height = dec_ctx->height;
            enc_ctx->sample_aspect_ratio = dec_ctx->sample_aspect_ratio;
            enc_ctx->pix_fmt = AV_PIX_FMT_YUV420P;
            enc_ctx->time_base = dec_ctx->time_base;
            av_opt_set(enc_ctx->priv_data, "preset", "ultrafast", 0);
            av_opt_set(enc_ctx->priv_data, "profile", "baseline", 0);
            av_opt_set(enc_ctx->priv_data, "level", "3.0", 0);
        }

        out_stream->time_base = in_stream->time_base;

        AVDictionaryEntry *tag = NULL;
        while ((tag = av_dict_get(in_stream->metadata, "", tag, AV_DICT_IGNORE_SUFFIX))) {
            printf("%s=%s\n", tag->key, tag->value);
            char *k = av_strdup(tag->key);   // duplicate key/value so the
            char *v = av_strdup(tag->value); // dictionary can take ownership
            av_dict_set(&out_stream->metadata, k, v,
                        AV_DICT_DONT_STRDUP_KEY | AV_DICT_DONT_STRDUP_VAL);
        }

        ret = avcodec_open2(enc_ctx, encoder, NULL);
        if (ret < 0) {
            av_log(NULL, AV_LOG_ERROR, "Cannot open video encoder for stream #%u\n", i);
            [self errorWith:kErrorCantOpenOutputFile and:[NSString stringWithFormat:@"Cannot open video encoder for stream #%u",i]];
            return ret;
        }

    }
    else if(dec_ctx->codec_type == AVMEDIA_TYPE_UNKNOWN) {
        // if we can't figure out the stream type, fail
        av_log(NULL, AV_LOG_FATAL, "Elementary stream #%d is of unknown type, cannot proceed\n", i);
        [self errorWith:kErrorUnknownStream and:[NSString stringWithFormat:@"Elementary stream #%d is of unknown type, cannot proceed",i]];
        return AVERROR_INVALIDDATA;
    }
    else {
        out_stream = avformat_new_stream(ofmt_ctx, NULL);
        if (!out_stream) {
            av_log(NULL, AV_LOG_ERROR, "Failed allocating output stream\n");
            [self errorWith:kErrorAllocateOutputStream and:@"Failed allocating output stream"];
            return AVERROR_UNKNOWN;
        }
        enc_ctx = out_stream->codec;
        /* this stream must be remuxed */
        // copies ifmt_ctx->streams[i]->codec into ofmt_ctx->streams[i]->codec - Copy the settings of the source AVCodecContext into the destination AVCodecContext.
        ret = avcodec_copy_context(ofmt_ctx->streams[i]->codec,
                                   ifmt_ctx->streams[i]->codec);
        if (ret < 0) {
            av_log(NULL, AV_LOG_ERROR, "Copying stream context failed\n");
            [self errorWith:kErrorCopyStreamFailed and:@"Copying stream context failed"];
            return ret;
        }

    }
    // some container formats (e.g. MOV/MP4) want codec parameters ("extradata") in a global header rather than repeated inside the bitstream
    if (ofmt_ctx->oformat->flags & AVFMT_GLOBALHEADER)
        enc_ctx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;

}
if (!(ofmt_ctx->oformat->flags & AVFMT_NOFILE)) {
    // Create and initialize a AVIOContext for accessing the
    // resource indicated by url.
    ret = avio_open(&ofmt_ctx->pb, filename, AVIO_FLAG_WRITE);
    if (ret < 0) {
        av_log(NULL, AV_LOG_ERROR, "Could not open output file '%s'", filename);
        [self errorWith:kErrorCantOpenOutputFile and:[NSString stringWithFormat:@"Could not open output file '%s'", filename]];
        return ret;
    }
}

/* init muxer, write output file header */
// Allocate the stream private data and write the stream header to an output media file.
ret = avformat_write_header(ofmt_ctx, NULL);
if (ret < 0) {
    av_log(NULL, AV_LOG_ERROR, "Error occurred when opening output file\n");
    [self errorWith:kErrorOutFileCantWriteHeader and:@"Error occurred when opening output file"];
    return ret;
}
return 0;


Source: https://habr.com/ru/post/1626072/

