Transcoding HLS segments individually using FFmpeg

I am recording a continuous live stream as a high-bitrate HLS stream. I then want to transcode it asynchronously into different formats / bitrates. This mostly works, except that audio artifacts appear between each segment (gaps and pops).

Here is an example ffmpeg command line:

ffmpeg -threads 1 -nostdin -loglevel verbose -y -i input.ts \
    -c:a libfdk_aac -ac 2 -b:a 64k -vn output.ts

Inspecting the audio of a sample segment shows that there is a gap at the end:

[Image: end of the segment]

And the start of the file looks suspiciously attenuated (although this may not be a problem):

[Image: start of the segment]

My suspicion is that these artifacts arise because each segment is transcoded without the context of the stream as a whole.

Any ideas on how to convince FFmpeg to produce audio that fits back into the HLS stream?

** UPDATE 1 **

Here are the start and end of the original segment. As you can see, the start still looks the same, but the end stops right at 30 seconds. I expect some degree of padding with lossy encoding, but is there some way that HLS manages gapless playback (is this related to the iTunes method with custom metadata)?

[Image: original start] [Image: original end]

** UPDATE 2 **

So, I converted both the original (128k AAC in MPEG-2 TS) and the transcoded segment (64k AAC in an AAC / ADTS container) to WAV and placed the two side by side. This is the result:

[Image: side-by-side start] [Image: side-by-side end]

I'm not sure whether this is representative of how a client would play it back, but it seems a little odd that decoding the transcoded segment introduces a gap at the start and makes the segment longer. Given that both are lossy encodings, I would have expected padding to be equally present in both (if at all).
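One possible explanation for the extra length can be sketched with some arithmetic. AAC-LC encodes fixed frames of 1024 samples, and an encoder typically prepends some priming (encoder delay) samples, so a decoder that is not told how much to trim outputs more samples than went in. The sample rate and the priming value used below are assumptions for illustration only; neither is stated above, and the real delay depends on the encoder and its settings.

```python
import math

FRAME_SAMPLES = 1024  # samples per AAC-LC frame


def encoded_length(duration_s, sample_rate, priming_samples):
    """Simplified model of the padding added when one segment is encoded alone.

    priming_samples is a hypothetical encoder delay; check your encoder's
    actual value rather than trusting the number used in the example.
    Returns (frames, decoded_samples, extra_samples).
    """
    source = round(duration_s * sample_rate)
    # The encoder has to emit whole frames covering delay + source audio.
    frames = math.ceil((source + priming_samples) / FRAME_SAMPLES)
    decoded = frames * FRAME_SAMPLES
    return frames, decoded, decoded - source


# Assumed values: 30 s segment, 48 kHz, 2048 priming samples.
frames, decoded, extra = encoded_length(30, 48000, 2048)
print(frames, decoded, extra, f"{1000 * extra / 48000:.1f} ms")
```

Under those assumed values the model gives 2816 extra samples per segment, or roughly 59 ms, and the same padding would be added again for every independently transcoded segment.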

** UPDATE 3 **

According to http://en.wikipedia.org/wiki/Gapless_playback only a few encoders support gapless playback. For MP3, I switched to lame in ffmpeg and so far the problem seems to be gone.

For AAC (see http://en.wikipedia.org/wiki/FAAC ), I tried libfaac (as opposed to libfdk_aac), and it also seems to produce gapless audio. However, the quality of libfaac is not that great, and I would prefer to use libfdk_aac.

1 answer

This is more of a conceptual answer than a list of specific tools to use, sorry, but it may be useful anyway: it removes the problem of introduced audio artifacts at the cost of adding more complexity to your processing layer.

My suggestion would be not to split the uncompressed input audio at all, but to produce a single continuous compressed stream that you feed into an audio proxy such as an icecast2 server (or similar, if icecast does not support AAC), and then split / recombine on the client side of the proxy using chunks of compressed audio.

So, the method here would be to connect to the proxy at regular intervals (say, every 60 seconds?) and collect a chunk of audio slightly longer than the polling interval (say, 75 seconds' worth). These captures need to run in parallel, since at some points two clients will be running at once. This could even be launched from cron if need be, or backgrounded from a shell script...
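A minimal sketch of that polling scheme, assuming the proxy serves the stream over plain HTTP. The mount URL, the chunk file names and the helper names are all hypothetical; only the 60 / 75 second figures come from the description above.

```python
import threading
import time
import urllib.request

STREAM_URL = "http://localhost:8000/stream"  # hypothetical proxy mount point
PERIOD_S = 60    # start a new capture this often
CAPTURE_S = 75   # each capture runs this long, so neighbours overlap


def capture(stream, seconds, clock=time.monotonic, block_size=4096):
    """Read from a file-like stream until `seconds` have elapsed or it ends."""
    deadline = clock() + seconds
    buf = bytearray()
    while clock() < deadline:
        block = stream.read(block_size)
        if not block:  # stream closed by the server
            break
        buf += block
    return bytes(buf)


def capture_to_file(index):
    """Open a fresh connection and save one overlapping chunk to disk."""
    with urllib.request.urlopen(STREAM_URL) as stream:
        data = capture(stream, CAPTURE_S)
    with open(f"chunk-{index:06d}.bin", "wb") as out:
        out.write(data)


def run(count):
    """Start `count` captures, one every PERIOD_S seconds, in parallel threads."""
    threads = []
    for index in range(count):
        worker = threading.Thread(target=capture_to_file, args=(index,))
        worker.start()
        threads.append(worker)
        time.sleep(PERIOD_S)
    for worker in threads:
        worker.join()
```

Calling `run(10)` would leave ten chunk files on disk, each sharing roughly 15 seconds of audio with its successor. Threads are used here only to keep the sketch in one process; separate processes started from cron would serve the same purpose.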

Once that is running, you will have a series of audio chunks that overlap slightly. You will need to do some processing to compare them and isolate the section of audio in the middle that is unique to each chunk...

Obviously, this is a simplification, but assuming that the proxy does not add any metadata (i.e. ICY or hinting data), splitting the audio this way should allow the processed chunks to be concatenated without any audio artifacts, since there is only one set of output data for the original audio input. Comparing the chunks should be straightforward, since you do not actually need to care about the format: at that point it is just bytes.
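The comparison step could be sketched as follows, treating each captured chunk as opaque bytes. The function names and the probe length are hypothetical, and the sketch relies on the assumption stated above that every client receives byte-identical data from the proxy.

```python
def find_overlap(prev, nxt, probe_len=64):
    """Return how many leading bytes of `nxt` repeat the tail of `prev`.

    The last `probe_len` bytes of `prev` are searched for in `nxt`, and each
    candidate position is verified against the full overlapping region.
    Returns 0 if no overlap is found.
    """
    probe = prev[-probe_len:]
    start = 0
    while True:
        pos = nxt.find(probe, start)
        if pos == -1:
            return 0
        overlap = pos + len(probe)
        if overlap <= len(prev) and prev[-overlap:] == nxt[:overlap]:
            return overlap
        start = pos + 1  # false match on the probe alone, keep looking


def stitch(chunks):
    """Concatenate overlapping chunks, dropping the repeated bytes."""
    out = bytearray()
    prev = b""
    for chunk in chunks:
        skip = find_overlap(prev, chunk) if prev else 0
        out += chunk[skip:]
        prev = chunk
    return bytes(out)
```

For example, if three captures cover bytes 0-1250, 1000-2250 and 2000-3000 of the same stream, `stitch` returns the original 3000 bytes. The result is a plain byte stream; cutting it into segments at frame boundaries would still be a separate step.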

The advantage here is that you have decoupled the audio encoder from the client, so if you want to run some other process in parallel to transcode to different formats or bitrates, or to chunk the stream more aggressively for some other consumer, nothing changes on the encoder side of the proxy: you simply add another client to the proxy using a tool chain similar to the one described above.


Source: https://habr.com/ru/post/944871/
