I want to encode PCM (CMSampleBufferRefs coming in live from an AVCaptureAudioDataOutputSampleBufferDelegate) to AAC.
When the first CMSampleBufferRef arrives, I set up the input and output AudioStreamBasicDescriptions, the output one according to the documentation:
    AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));
    AudioStreamBasicDescription outAudioStreamBasicDescription = {0};
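Roughly, the output description ends up filled in like this (a sketch; the concrete values follow from the rest of this question: 44.1 kHz, mono, plain AAC):

    // Sketch of the "out" description (values per the rest of this question):
    outAudioStreamBasicDescription.mSampleRate = 44100;               // see question 2
    outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC;  // kAudioFormatMPEG4AAC_HE fails, see question 1
    outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_SSR;
    outAudioStreamBasicDescription.mChannelsPerFrame = 1;             // mono, per the ffmpeg output in question 3
    outAudioStreamBasicDescription.mFramesPerPacket = 1024;           // an AAC packet always holds 1024 frames
    outAudioStreamBasicDescription.mBytesPerPacket = 0;               // variable packet size, see question 5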
Then I create the AudioConverterRef:
    AudioClassDescription audioClassDescription;
    memset(&audioClassDescription, 0, sizeof(audioClassDescription));
    UInt32 size;
    // Ask how many encoders are registered for the output format ID.
    NSAssert(AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                        sizeof(outAudioStreamBasicDescription.mFormatID),
                                        &outAudioStreamBasicDescription.mFormatID,
                                        &size) == noErr, nil);
    uint32_t count = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[count];
    NSAssert(AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                    sizeof(outAudioStreamBasicDescription.mFormatID),
                                    &outAudioStreamBasicDescription.mFormatID,
                                    &size,
                                    descriptions) == noErr, nil);
    // Pick the software codec (the HW encoder is occupied by AVAssetWriter, see question 1).
    for (uint32_t i = 0; i < count; i++) {
        if ((outAudioStreamBasicDescription.mFormatID == descriptions[i].mSubType) &&
            (kAppleSoftwareAudioCodecManufacturer == descriptions[i].mManufacturer)) {
            memcpy(&audioClassDescription, &descriptions[i], sizeof(audioClassDescription));
        }
    }
    NSAssert(audioClassDescription.mSubType == outAudioStreamBasicDescription.mFormatID &&
             audioClassDescription.mManufacturer == kAppleSoftwareAudioCodecManufacturer, nil);
    AudioConverterRef audioConverter;
    memset(&audioConverter, 0, sizeof(audioConverter));
    NSAssert(AudioConverterNewSpecific(&inAudioStreamBasicDescription,
                                       &outAudioStreamBasicDescription,
                                       1,
                                       &audioClassDescription,
                                       &audioConverter) == 0, nil);
And then I convert each CMSampleBufferRef to raw AAC data.
    AudioBufferList inAudioBufferList;
    CMBlockBufferRef blockBuffer;
    // Pull the PCM out of the sample buffer.
    CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer, NULL, &inAudioBufferList, sizeof(inAudioBufferList), NULL, NULL, 0, &blockBuffer);
    NSAssert(inAudioBufferList.mNumberBuffers == 1, nil);
    uint32_t bufferSize = inAudioBufferList.mBuffers[0].mDataByteSize;  // see question 4
    uint8_t *buffer = (uint8_t *)malloc(bufferSize);
    memset(buffer, 0, bufferSize);
    AudioBufferList outAudioBufferList;
    outAudioBufferList.mNumberBuffers = 1;
    outAudioBufferList.mBuffers[0].mNumberChannels = inAudioBufferList.mBuffers[0].mNumberChannels;
    outAudioBufferList.mBuffers[0].mDataByteSize = bufferSize;
    outAudioBufferList.mBuffers[0].mData = buffer;
    UInt32 ioOutputDataPacketSize = 1;  // see question 5
    NSAssert(AudioConverterFillComplexBuffer(audioConverter, inInputDataProc, &inAudioBufferList, &ioOutputDataPacketSize, &outAudioBufferList, NULL) == 0, nil);
    NSData *data = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData
                                  length:outAudioBufferList.mBuffers[0].mDataByteSize];
    free(buffer);
    CFRelease(blockBuffer);
inInputDataProc() implementation:
    OSStatus inInputDataProc(AudioConverterRef inAudioConverter,
                             UInt32 *ioNumberDataPackets,
                             AudioBufferList *ioData,
                             AudioStreamPacketDescription **outDataPacketDescription,
                             void *inUserData)
    {
        AudioBufferList audioBufferList = *(AudioBufferList *)inUserData;
        ioData->mBuffers[0].mData = audioBufferList.mBuffers[0].mData;
        ioData->mBuffers[0].mDataByteSize = audioBufferList.mBuffers[0].mDataByteSize;
        return noErr;
    }
Now data contains my raw AAC, which I wrap in an ADTS frame with the corresponding ADTS header; the sequence of these ADTS frames is a playable AAC stream.
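The wrapping itself is essentially the standard 7-byte ADTS header. A minimal sketch of what I prepend to each AAC packet (the FillADTSHeader name is mine; it assumes 44.1 kHz mono AAC LC, and for kMPEG4Object_AAC_SSR the profile bits would be 2 instead of 1):

    static void FillADTSHeader(uint8_t header[7], size_t aacPacketLength)
    {
        const unsigned int profile = 1;    // MPEG-4 audio object type - 1 (1 = AAC LC)
        const unsigned int freqIdx = 4;    // 4 = 44100 Hz; must match outAudioStreamBasicDescription.mSampleRate
        const unsigned int chanCfg = 1;    // 1 = mono
        const size_t frameLength = aacPacketLength + 7;  // payload plus this header

        header[0] = 0xFF;                  // syncword, high 8 bits
        header[1] = 0xF1;                  // syncword low 4 bits, MPEG-4, layer 0, no CRC
        header[2] = (uint8_t)((profile << 6) | (freqIdx << 2) | (chanCfg >> 2));
        header[3] = (uint8_t)(((chanCfg & 0x3) << 6) | (frameLength >> 11));
        header[4] = (uint8_t)((frameLength >> 3) & 0xFF);
        header[5] = (uint8_t)(((frameLength & 0x7) << 5) | 0x1F);  // plus buffer fullness high bits (0x7FF = VBR)
        header[6] = 0xFC;                  // buffer fullness cont., one raw data block per frame
    }

Each NSData of raw AAC gets these 7 bytes prepended before being appended to the output.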
But I do not understand this code as well as I would like. In general, I don't really understand audio... I pieced it together from blogs, forums, and documentation over quite a lot of time, and now it works, but I don't know why, or how to change some of the parameters. So here are my questions:
1. I need to use this converter while the HW encoder is in use by AVAssetWriter. This is why I create the SW converter through AudioConverterNewSpecific(), not AudioConverterNew(). But now setting outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC_HE; does not work: no matching AudioClassDescription can be found, even with mFormatFlags set to 0. What do I lose by using kAudioFormatMPEG4AAC (kMPEG4Object_AAC_SSR) instead of kAudioFormatMPEG4AAC_HE? And which should I use for live streaming, kMPEG4Object_AAC_SSR or kMPEG4Object_AAC_Main?
2. How do I change the sample rate? If I set outAudioStreamBasicDescription.mSampleRate to 22050 or 8000, for example, the sound plays back slowed down. I do set the sampling frequency index in the ADTS header to match outAudioStreamBasicDescription.mSampleRate (concrete values in the first sketch after these questions).
3. How do I change the bitrate? ffmpeg -i reports this for the generated AAC: Stream #0:0: Audio: aac, 44100 Hz, mono, fltp, 64 kb/s. How do I change it to 16 kbps, for example? The bitrate does drop as I lower the sample rate, but I assume that is not the only way, and lowering the sample rate breaks playback anyway, as I mention in 2. (My guess at the right knob is sketched after these questions.)
4. How do I calculate the output buffer size? Right now I set it as uint32_t bufferSize = inAudioBufferList.mBuffers[0].mDataByteSize;, reasoning that the compressed data should never be larger than the uncompressed data... But isn't that overly generous? (What I suspect I should do instead is sketched after these questions.)
5. How do I set ioOutputDataPacketSize? If I understand the documentation correctly, I should set it as UInt32 ioOutputDataPacketSize = bufferSize / outAudioStreamBasicDescription.mBytesPerPacket;, but mBytesPerPacket is 0. If I set it to 0, AudioConverterFillComplexBuffer() returns an error. If I set it to 1, it works, but I don't know why...
6. inInputDataProc() has three in/out parameters. I only fill in ioData. Should I also set ioNumberDataPackets and outDataPacketDescription? Why, and how? (My best guess is sketched below.)
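Regarding question 2, concretely what I do is:

    outAudioStreamBasicDescription.mSampleRate = 22050;  // sound now plays back slowed down
    // ...and in the ADTS header sketch above I change freqIdx to the matching
    // sampling frequency index from the standard ADTS table:
    // 44100 Hz -> 4, 22050 Hz -> 7, 8000 Hz -> 11

Regarding question 3, my guess from reading AudioConverter.h is that kAudioConverterEncodeBitRate is the intended knob; a sketch I have not verified in this pipeline:

    UInt32 outputBitRate = 16000;  // bits per second; no idea whether 16 kbps is valid for this codec
    NSAssert(AudioConverterSetProperty(audioConverter,
                                       kAudioConverterEncodeBitRate,
                                       sizeof(outputBitRate),
                                       &outputBitRate) == noErr, nil);

Regarding question 4, I suspect the converter should be asked for its worst-case output packet size instead of reusing the PCM size; again a sketch based on AudioConverter.h, not verified:

    UInt32 maxOutputPacketSize = 0;
    UInt32 propertySize = sizeof(maxOutputPacketSize);
    NSAssert(AudioConverterGetProperty(audioConverter,
                                       kAudioConverterPropertyMaximumOutputPacketSize,
                                       &propertySize,
                                       &maxOutputPacketSize) == noErr, nil);
    // maxOutputPacketSize bytes per requested output packet should then be enough.

Regarding question 6, here is my best guess at a fuller callback, based on the header comments. The ConverterUserData struct is hypothetical, something I would have to add, and the packet-count line assumes LPCM input, where mBytesPerPacket is nonzero:

    typedef struct {
        AudioBufferList *bufferList;         // PCM from CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer()
        AudioStreamBasicDescription inASBD;  // inAudioStreamBasicDescription from above
    } ConverterUserData;

    OSStatus inInputDataProc(AudioConverterRef inAudioConverter,
                             UInt32 *ioNumberDataPackets,
                             AudioBufferList *ioData,
                             AudioStreamPacketDescription **outDataPacketDescription,
                             void *inUserData)
    {
        ConverterUserData *userData = (ConverterUserData *)inUserData;
        ioData->mBuffers[0].mData = userData->bufferList->mBuffers[0].mData;
        ioData->mBuffers[0].mDataByteSize = userData->bufferList->mBuffers[0].mDataByteSize;
        ioData->mBuffers[0].mNumberChannels = userData->bufferList->mBuffers[0].mNumberChannels;
        // For LPCM, one packet == one frame, so report how many packets this buffer holds:
        *ioNumberDataPackets = ioData->mBuffers[0].mDataByteSize / userData->inASBD.mBytesPerPacket;
        // outDataPacketDescription seems to matter only for compressed *input* formats,
        // so for PCM I assume it can be left untouched.
        return noErr;
    }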