I want to encode PCM (CMSampleBufferRefs coming in live from an AVCaptureAudioDataOutputSampleBufferDelegate) to AAC.
When the first CMSampleBufferRef arrives, I set up the input and output AudioStreamBasicDescriptions, the output one according to the documentation:
    AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));
    AudioStreamBasicDescription outAudioStreamBasicDescription = {0};
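Roughly, the output description ends up filled in like this (a sketch; the concrete values follow from the rest of this question: 44.1 kHz, mono, plain AAC):

    // Sketch of the "out" description (values per the rest of this question):
    outAudioStreamBasicDescription.mSampleRate = 44100;               // see question 2
    outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC;  // kAudioFormatMPEG4AAC_HE fails, see question 1
    outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_SSR;
    outAudioStreamBasicDescription.mChannelsPerFrame = 1;             // mono, per the ffmpeg output in question 3
    outAudioStreamBasicDescription.mFramesPerPacket = 1024;           // an AAC packet always holds 1024 frames
    outAudioStreamBasicDescription.mBytesPerPacket = 0;               // variable packet size, see question 5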
Then I create the AudioConverterRef:
    AudioClassDescription audioClassDescription;
    memset(&audioClassDescription, 0, sizeof(audioClassDescription));
    UInt32 size;
    // Ask how many encoders are registered for the output format ID.
    NSAssert(AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                        sizeof(outAudioStreamBasicDescription.mFormatID),
                                        &outAudioStreamBasicDescription.mFormatID,
                                        &size) == noErr, nil);
    uint32_t count = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[count];
    NSAssert(AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                    sizeof(outAudioStreamBasicDescription.mFormatID),
                                    &outAudioStreamBasicDescription.mFormatID,
                                    &size,
                                    descriptions) == noErr, nil);
    // Pick the software codec (the HW encoder is occupied by AVAssetWriter, see question 1).
    for (uint32_t i = 0; i < count; i++) {
        if ((outAudioStreamBasicDescription.mFormatID == descriptions[i].mSubType) &&
            (kAppleSoftwareAudioCodecManufacturer == descriptions[i].mManufacturer)) {
            memcpy(&audioClassDescription, &descriptions[i], sizeof(audioClassDescription));
        }
    }
    NSAssert(audioClassDescription.mSubType == outAudioStreamBasicDescription.mFormatID &&
             audioClassDescription.mManufacturer == kAppleSoftwareAudioCodecManufacturer, nil);
    AudioConverterRef audioConverter;
    memset(&audioConverter, 0, sizeof(audioConverter));
    NSAssert(AudioConverterNewSpecific(&inAudioStreamBasicDescription,
                                       &outAudioStreamBasicDescription,
                                       1,
                                       &audioClassDescription,
                                       &audioConverter) == 0, nil);
And then I convert each CMSampleBufferRef to raw AAC data.
    AudioBufferList inAudioBufferList;
    CMBlockBufferRef blockBuffer;
    // Pull the PCM out of the sample buffer.
    CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer, NULL, &inAudioBufferList, sizeof(inAudioBufferList), NULL, NULL, 0, &blockBuffer);
    NSAssert(inAudioBufferList.mNumberBuffers == 1, nil);
    uint32_t bufferSize = inAudioBufferList.mBuffers[0].mDataByteSize;  // see question 4
    uint8_t *buffer = (uint8_t *)malloc(bufferSize);
    memset(buffer, 0, bufferSize);
    AudioBufferList outAudioBufferList;
    outAudioBufferList.mNumberBuffers = 1;
    outAudioBufferList.mBuffers[0].mNumberChannels = inAudioBufferList.mBuffers[0].mNumberChannels;
    outAudioBufferList.mBuffers[0].mDataByteSize = bufferSize;
    outAudioBufferList.mBuffers[0].mData = buffer;
    UInt32 ioOutputDataPacketSize = 1;  // see question 5
    NSAssert(AudioConverterFillComplexBuffer(audioConverter, inInputDataProc, &inAudioBufferList, &ioOutputDataPacketSize, &outAudioBufferList, NULL) == 0, nil);
    NSData *data = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData
                                  length:outAudioBufferList.mBuffers[0].mDataByteSize];
    free(buffer);
    CFRelease(blockBuffer);
inInputDataProc() implementation:
    OSStatus inInputDataProc(AudioConverterRef inAudioConverter,
                             UInt32 *ioNumberDataPackets,
                             AudioBufferList *ioData,
                             AudioStreamPacketDescription **outDataPacketDescription,
                             void *inUserData)
    {
        AudioBufferList audioBufferList = *(AudioBufferList *)inUserData;
        ioData->mBuffers[0].mData = audioBufferList.mBuffers[0].mData;
        ioData->mBuffers[0].mDataByteSize = audioBufferList.mBuffers[0].mDataByteSize;
        return noErr;
    }
Now data contains my raw AAC, which I wrap in an ADTS frame with the corresponding ADTS header; the sequence of these ADTS frames is a playable AAC stream.
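The wrapping itself is essentially the standard 7-byte ADTS header. A minimal sketch of what I prepend to each AAC packet (the FillADTSHeader name is mine; it assumes 44.1 kHz mono AAC LC, and for kMPEG4Object_AAC_SSR the profile bits would be 2 instead of 1):

    static void FillADTSHeader(uint8_t header[7], size_t aacPacketLength)
    {
        const unsigned int profile = 1;    // MPEG-4 audio object type - 1 (1 = AAC LC)
        const unsigned int freqIdx = 4;    // 4 = 44100 Hz; must match outAudioStreamBasicDescription.mSampleRate
        const unsigned int chanCfg = 1;    // 1 = mono
        const size_t frameLength = aacPacketLength + 7;  // payload plus this header

        header[0] = 0xFF;                  // syncword, high 8 bits
        header[1] = 0xF1;                  // syncword low 4 bits, MPEG-4, layer 0, no CRC
        header[2] = (uint8_t)((profile << 6) | (freqIdx << 2) | (chanCfg >> 2));
        header[3] = (uint8_t)(((chanCfg & 0x3) << 6) | (frameLength >> 11));
        header[4] = (uint8_t)((frameLength >> 3) & 0xFF);
        header[5] = (uint8_t)(((frameLength & 0x7) << 5) | 0x1F);  // plus buffer fullness high bits (0x7FF = VBR)
        header[6] = 0xFC;                  // buffer fullness cont., one raw data block per frame
    }

Each NSData of raw AAC gets these 7 bytes prepended before being appended to the output.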
But I do not understand this code as well as I would like. In general, I don't really understand audio... I pieced it together from blogs, forums, and documentation over quite a lot of time, and now it works, but I don't know why, or how to change some of the parameters. So here are my questions:
1. I need to use this converter while the HW encoder is in use by AVAssetWriter. This is why I create the SW converter through AudioConverterNewSpecific(), not AudioConverterNew(). But now setting outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC_HE; does not work: no matching AudioClassDescription can be found, even with mFormatFlags set to 0. What do I lose by using kAudioFormatMPEG4AAC (kMPEG4Object_AAC_SSR) instead of kAudioFormatMPEG4AAC_HE? And which should I use for live streaming, kMPEG4Object_AAC_SSR or kMPEG4Object_AAC_Main?
2. How do I change the sample rate? If I set outAudioStreamBasicDescription.mSampleRate to 22050 or 8000, for example, the sound plays back slowed down. I do set the sampling frequency index in the ADTS header to match outAudioStreamBasicDescription.mSampleRate (concrete values in the first sketch after these questions).
3. How do I change the bitrate? ffmpeg -i reports this for the generated AAC: Stream #0:0: Audio: aac, 44100 Hz, mono, fltp, 64 kb/s. How do I change it to 16 kbps, for example? The bitrate does drop as I lower the sample rate, but I assume that is not the only way, and lowering the sample rate breaks playback anyway, as I mention in 2. (My guess at the right knob is sketched after these questions.)
4. How do I calculate the output buffer size? Right now I set it as uint32_t bufferSize = inAudioBufferList.mBuffers[0].mDataByteSize;, reasoning that the compressed data should never be larger than the uncompressed data... But isn't that overly generous? (What I suspect I should do instead is sketched after these questions.)
5. How do I set ioOutputDataPacketSize? If I understand the documentation correctly, I should set it as UInt32 ioOutputDataPacketSize = bufferSize / outAudioStreamBasicDescription.mBytesPerPacket;, but mBytesPerPacket is 0. If I set it to 0, AudioConverterFillComplexBuffer() returns an error. If I set it to 1, it works, but I don't know why...
6. inInputDataProc() has three in/out parameters. I only fill in ioData. Should I also set ioNumberDataPackets and outDataPacketDescription? Why, and how? (My best guess is sketched below.)
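Regarding question 2, concretely what I do is:

    outAudioStreamBasicDescription.mSampleRate = 22050;  // sound now plays back slowed down
    // ...and in the ADTS header sketch above I change freqIdx to the matching
    // sampling frequency index from the standard ADTS table:
    // 44100 Hz -> 4, 22050 Hz -> 7, 8000 Hz -> 11

Regarding question 3, my guess from reading AudioConverter.h is that kAudioConverterEncodeBitRate is the intended knob; a sketch I have not verified in this pipeline:

    UInt32 outputBitRate = 16000;  // bits per second; no idea whether 16 kbps is valid for this codec
    NSAssert(AudioConverterSetProperty(audioConverter,
                                       kAudioConverterEncodeBitRate,
                                       sizeof(outputBitRate),
                                       &outputBitRate) == noErr, nil);

Regarding question 4, I suspect the converter should be asked for its worst-case output packet size instead of reusing the PCM size; again a sketch based on AudioConverter.h, not verified:

    UInt32 maxOutputPacketSize = 0;
    UInt32 propertySize = sizeof(maxOutputPacketSize);
    NSAssert(AudioConverterGetProperty(audioConverter,
                                       kAudioConverterPropertyMaximumOutputPacketSize,
                                       &propertySize,
                                       &maxOutputPacketSize) == noErr, nil);
    // maxOutputPacketSize bytes per requested output packet should then be enough.

Regarding question 6, here is my best guess at a fuller callback, based on the header comments. The ConverterUserData struct is hypothetical, something I would have to add, and the packet-count line assumes LPCM input, where mBytesPerPacket is nonzero:

    typedef struct {
        AudioBufferList *bufferList;         // PCM from CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer()
        AudioStreamBasicDescription inASBD;  // inAudioStreamBasicDescription from above
    } ConverterUserData;

    OSStatus inInputDataProc(AudioConverterRef inAudioConverter,
                             UInt32 *ioNumberDataPackets,
                             AudioBufferList *ioData,
                             AudioStreamPacketDescription **outDataPacketDescription,
                             void *inUserData)
    {
        ConverterUserData *userData = (ConverterUserData *)inUserData;
        ioData->mBuffers[0].mData = userData->bufferList->mBuffers[0].mData;
        ioData->mBuffers[0].mDataByteSize = userData->bufferList->mBuffers[0].mDataByteSize;
        ioData->mBuffers[0].mNumberChannels = userData->bufferList->mBuffers[0].mNumberChannels;
        // For LPCM, one packet == one frame, so report how many packets this buffer holds:
        *ioNumberDataPackets = ioData->mBuffers[0].mDataByteSize / userData->inASBD.mBytesPerPacket;
        // outDataPacketDescription seems to matter only for compressed *input* formats,
        // so for PCM I assume it can be left untouched.
        return noErr;
    }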