I am developing an application that records video from a webcam and audio from a microphone. I use Qt, but unfortunately the camera module does not work on Windows, which forced me to use FFmpeg to record the video/audio.
My camera module now works well, apart from a small synchronization problem. Sometimes the audio and video drift slightly out of sync (I would say by less than 1 second, although this could get worse with longer recordings).
When I encode the frames, I set the PTS as follows (which I took from the muxing.c example):
- For video frames, I increment the PTS by 1 per frame (starting at 0).
- For audio frames, I increment the PTS by the nb_samples of each audio frame (starting at 0); a simplified sketch of this scheme is shown after this list.
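In simplified form, the scheme looks like this (a minimal sketch; the function and counter names are illustrative, not my actual members):

    extern "C" {
    #include <libavutil/frame.h>
    }

    // Counter-based PTS scheme described above.
    static int64_t videoPtsCounter = 0;
    static int64_t audioPtsCounter = 0;

    // Called once per encoded video frame (codec time base assumed to be 1/25):
    static void stampVideoPts(AVFrame *frame)
    {
        frame->pts = videoPtsCounter++;
    }

    // Called once per encoded audio frame (codec time base assumed to be 1/sample_rate):
    static void stampAudioPts(AVFrame *frame)
    {
        frame->pts = audioPtsCounter;
        audioPtsCounter += frame->nb_samples;
    }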
I save the file at 25 frames per second and ask the camera to give me 25 frames per second (which it can). I also convert the video frames to the YUV420P format. For the audio frames, I need to use an AVAudioFifo, because the microphone sends bigger frames than the MP4 stream supports, so I have to split them into chunks. For this, I used the transcode.c example; a condensed sketch of that FIFO logic is shown below.
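Roughly, the FIFO logic looks like this (a sketch based on transcode.c; fifo, encCtx and encodeAudioFrame are placeholders for my actual members/helpers):

    extern "C" {
    #include <libavcodec/avcodec.h>
    #include <libavutil/audio_fifo.h>
    }

    static void queueAndEncodeAudio(AVAudioFifo *fifo, AVCodecContext *encCtx,
                                    AVFrame *input)
    {
        // Store everything the microphone delivered...
        av_audio_fifo_write(fifo, (void **)input->data, input->nb_samples);

        // ...then drain it in chunks of exactly encCtx->frame_size samples.
        while (av_audio_fifo_size(fifo) >= encCtx->frame_size) {
            AVFrame *chunk = av_frame_alloc();
            chunk->nb_samples     = encCtx->frame_size;
            chunk->format         = encCtx->sample_fmt;
            chunk->channel_layout = encCtx->channel_layout;
            chunk->sample_rate    = encCtx->sample_rate;
            av_frame_get_buffer(chunk, 0);

            av_audio_fifo_read(fifo, (void **)chunk->data, encCtx->frame_size);
            encodeAudioFrame(chunk); // hypothetical helper: sets the PTS and encodes
            av_frame_free(&chunk);
        }
    }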
I have no idea what I should do to synchronize audio and video. Do I need to use a clock or something to correctly synchronize both streams?
The full code is too large to post here, but if necessary I can put it on GitHub, for example.
Here is the code to write the frame:
    int FFCapture::writeFrame(const AVRational *time_base, AVStream *stream, AVPacket *pkt)
    {
        // Rescale the packet timestamps from the codec time base to the stream time base.
        av_packet_rescale_ts(pkt, *time_base, stream->time_base);
        pkt->stream_index = stream->index;
        return av_interleaved_write_frame(oFormatContext, pkt);
    }
Code for getting the elapsed time:
    qint64 FFCapture::getElapsedTime(qint64 *previousTime)
    {
        // Returns the elapsed time in milliseconds, or -1 if the clock has not advanced.
        qint64 newTime = timer.elapsed();
        if (newTime > *previousTime) {
            *previousTime = newTime;
            return newTime;
        }
        return -1;
    }
Code for setting the PTS (video and audio stream, respectively):
    // Video stream:
    qint64 time = getElapsedTime(&previousVideoTime);
    if (time >= 0) outFrame->pts = time;
    //if (time >= 0) outFrame->pts = av_rescale_q(time, outStream.videoStream->codec->time_base, outStream.videoStream->time_base);

    // Audio stream:
    qint64 time = getElapsedTime(&previousAudioTime);
    if (time >= 0) {
        AVRational aux;
        aux.num = 1;
        aux.den = 1000;
        outFrame->pts = time;
        //outFrame->pts = av_rescale_q(time, aux, outStream.audioStream->time_base);
    }
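For reference, if I enabled the commented-out rescaling, I believe it would look like this (assuming timer.elapsed() returns milliseconds, i.e. a 1/1000 time base; av_rescale_q is declared in libavutil/mathematics.h):

    // Convert elapsed milliseconds into the stream's time base.
    AVRational ms = {1, 1000}; // 1/1000 = milliseconds
    outFrame->pts = av_rescale_q(time, ms, outStream.videoStream->time_base);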