
Building a Tone Generator for iOS using Audio Units

Over the past month or so, I’ve been working with a friend and colleague, George Hufnagl, on building an iOS app for audiophiles that includes several useful functions and references for both linear and interactive media.  The app is called “Pocket Audio Tools” and will be available sometime in mid-August for iPhone, with iPad native and OS X desktop versions to follow.  Among the functions included in the app is one that displays frequency values for all pitches in the MIDI range with adjustable A4 tuning.  As an extension of this, we decided to include playback of a pure sine wave for any pitch selected as further reference.  While standard audio playback on iOS is quite straightforward, building a tone generator requires manipulation of the actual sample data, and for that level of control, you need to access the lowest audio layer on iOS — Audio Units.

To do this, we set up an Objective-C class (simply derived from NSObject) that takes care of all the operations we need to initialize and play back our sine tone.  To keep things reasonably short, I’m just going to focus on the specifics of initializing and using Audio Units.

Here is what we need to initialize our tone generator; we’re setting up an output unit with input supplied by a callback function.

AudioComponentDescription defaultOutputDescription = { 0 };
defaultOutputDescription.componentType = kAudioUnitType_Output;
defaultOutputDescription.componentSubType = kAudioUnitSubType_RemoteIO;
defaultOutputDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
defaultOutputDescription.componentFlags = 0;
defaultOutputDescription.componentFlagsMask = 0;

AudioComponent defaultOutput = AudioComponentFindNext(NULL, &defaultOutputDescription);
OSErr error = AudioComponentInstanceNew(defaultOutput, &_componentInstance);

Above, we set the type and subtype to an output Audio Unit whose I/O interfaces directly with iOS (specified by RemoteIO).  Currently, iOS doesn't allow third-party AUs, so the only manufacturer is Apple.  The two flag fields are set to 0 as per the Apple documentation.  Next we search for the default Audio Component by passing NULL as the first argument to AudioComponentFindNext(), and then we create the instance.

Once an Audio Unit instance has been created, its properties are customized through calls to AudioUnitSetProperty().  Here we set up the callback function used to populate the buffers with the sine wave for our tone generator.

AURenderCallbackStruct input;
input.inputProc = RenderTone;
input.inputProcRefCon = (__bridge void *)(self);
error = AudioUnitSetProperty(_componentInstance, kAudioUnitProperty_SetRenderCallback, kAudioUnitScope_Input, 0, &input, sizeof(input));

After setting the callback function RenderTone (we'll define this function later) and our user data (self in this case, because we want the instance of the class we're defining to be available in the callback routine), we apply this property to the AU instance.  The scope of the Audio Unit refers to the context in which the property applies, in this case input.  The final argument pair aside, the "0" is the element (bus) number; we want the first one, so we specify 0.

Next we need to specify the type of stream the audio callback function will be expecting.  Again we use the AudioUnitSetProperty() function call, this time passing in an instance of  AudioStreamBasicDescription.

AudioStreamBasicDescription streamFormat = { 0 };
streamFormat.mSampleRate = _sampleRate;
streamFormat.mFormatID = kAudioFormatLinearPCM;
streamFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked | kAudioFormatFlagIsNonInterleaved;
streamFormat.mBytesPerPacket = 4; // 4 bytes for 'float'
streamFormat.mBytesPerFrame = 4; // sizeof(float) * 1 channel
streamFormat.mFramesPerPacket = 1; // 1 frame per packet (uncompressed)
streamFormat.mChannelsPerFrame = 1; // 1 channel
streamFormat.mBitsPerChannel = 8 * 4; // 1 channel * 8 bits/byte * sizeof(float)
error = AudioUnitSetProperty(_componentInstance, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0,
                                     &streamFormat, sizeof(AudioStreamBasicDescription));

The format identifier in our case is linear PCM, since we'll be dealing with uncompressed audio data.  We use native floating point as our sample format, and because we'll be using only one channel, our buffers don't need to be interleaved.  Bytes per packet is set to 4 since our samples are floats (4 bytes each), and bytes per frame is the same because in mono, 1 frame equals 1 sample.  Uncompressed formats use 1 frame per packet, and we specify 1 channel for mono.  Bits per channel is simply 8 * sizeof(float).

This completes the setup of our Audio Unit instance, and we then just initialize it with a call to

AudioUnitInitialize(_componentInstance);

Before looking at the callback routine, there is one extra feature that I included in this tone generator.  Simply playing back a sine wave results in noticeable clicks and pops in the audio due to the abrupt amplitude changes, so I added fade ins and fade outs to the playback.  This is accomplished simply by storing arrays of fade in and fade out amplitude values that shape the audio based on the current playback state.  For our purposes, setting a base amplitude level of 0.85 and using linear interpolation to calculate the arrays using this value is sufficient.  Now we can examine the callback function.

OSStatus RenderTone (void* inRefCon, AudioUnitRenderActionFlags* ioActionFlags,
                     const AudioTimeStamp* inTimeStamp,
                     UInt32 inBusNumber,
                     UInt32 inNumberFrames,
                     AudioBufferList* ioData)
{
    PAToneGenerator *toneGenerator = (__bridge PAToneGenerator*)(inRefCon);
    Float32 *buffer = (Float32*)ioData->mBuffers[0].mData;

    for (UInt32 frame = 0; frame < inNumberFrames; ++frame)
    {

We only need to consider the following arguments: inRefCon, inNumberFrames, and ioData.  First we need to cast our user data to the right type (our tone generator class), and then get our buffer that we will be filling with data.  From here, we need to determine the amplitude of our sine wave based on the playback state.

switch ( toneGenerator->_state ) {
    case TG_STATE_IDLE:
        toneGenerator->_amplitude = 0.f;
        break;

    case TG_STATE_FADE_IN:
        if ( toneGenerator->_fadeInPosition < kToneGeneratorFadeInSamples ) {
            toneGenerator->_amplitude = toneGenerator->_fadeInCurve[toneGenerator->_fadeInPosition];
            ++toneGenerator->_fadeInPosition;
        } else {
            toneGenerator->_fadeInPosition = 0;
            toneGenerator->_state = TG_STATE_SUSTAINING;
        }
        break;

    case TG_STATE_FADE_OUT:
        if ( toneGenerator->_fadeOutPosition < kToneGeneratorFadeOutSamples ) {
            toneGenerator->_amplitude = toneGenerator->_fadeOutCurve[toneGenerator->_fadeOutPosition];
            ++toneGenerator->_fadeOutPosition;
        } else {
            toneGenerator->_fadeOutPosition = 0;
            toneGenerator->_state = TG_STATE_IDLE;
        }
        break;

    case TG_STATE_SUSTAINING:
        toneGenerator->_amplitude = kToneGeneratorAmplitude;
        break;

    default:
        toneGenerator->_amplitude = 0.f;
        break;
}

Once we have the amplitude, we simply fill the buffer with the sine wave.

buffer[frame] = sinf(toneGenerator->_phase) * toneGenerator->_amplitude;

toneGenerator->_phase += toneGenerator->_phase_incr;
toneGenerator->_phase = ( toneGenerator->_phase > kTwoPi ? toneGenerator->_phase - kTwoPi : toneGenerator->_phase );
    }

    return noErr;
}

One important thing to bear in mind is that the callback function is continually called after we start the Audio Unit with

AudioOutputUnitStart(_componentInstance);

We don’t need to start and stop the entire instance each time a tone is played because that could very well introduce some delays in playback, so the callback continually runs in the background while the Audio Unit is active.  That way, calls to play and stop the tone are as simple as changing the playback state.

- (void)playTone
{
    if ( _isActive ) {
        _state = TG_STATE_FADE_IN;
    }
}

- (void)stopTone
{
    if ( _isActive ) {
        if ( _state == TG_STATE_FADE_IN ) {
            _state = TG_STATE_IDLE;
        } else {
            _state = TG_STATE_FADE_OUT;
        }
    }
}

The reason for the extra check in stopTone() is to prevent clicks from occurring when the playback time is very short.  In other words, if the playback state has not yet reached its sustain point, we don't want any sound output.

To finish off, we can stop the Audio Unit by calling

AudioOutputUnitStop(_componentInstance);

and free the resources with calls to

AudioUnitUninitialize(_componentInstance);
AudioComponentInstanceDispose(_componentInstance);
_componentInstance = nil;

That completes this look at writing a tone generator for output on iOS.  Apple’s Audio Units layer is a sophisticated and flexible system that can be used for all sorts of audio needs including effects, analysis, and synthesis.  Keep an eye out for Pocket Audio Tools, coming soon to iOS!


Building a Comb Filter in Audio Units

As I've been looking into and learning more about digital reverberation, including its implementation and theory, I decided to build a simple comb filter plug-in using Audio Units.  Previously, all the plug-in work I've done has been with VST, but I was eager to learn another side of plug-in development, hence Apple's Audio Units.  It is, truth be told, very similar to VST development in that you derive your plug-in as a subclass of the Audio Unit SDK's AUEffectBase class, inheriting and overriding functions according to the needs of your effect.  There are some notable differences, however, that are worth pointing out.  In addition, the plug-in is available for download on the Downloads page.

The structure of an Audio Unit differs from VST in that within the main interface of the plug-in, a kernel object that is derived from AUKernelBase handles the actual DSP processing.  The outer interface as subclassed from AUEffectBase handles the view, parameters, and communication with the host.  What’s interesting about this method is that the Audio Unit automatically handles multichannel audio streams by initializing new kernels.  This means that the code you write within the Process() function of the kernel object is written as if to handle mono audio data.  When the plug-in detects stereo data it simply initializes another kernel to process the additional channel.  For n-to-n channel effects, this works well.  Naturally options are available for effects or instruments that require n-to-m channel output.

Another benefit of this structure is the generally fast load time of Audio Unit plug-ins.  The plug-in's constructor, invoked during instantiation, should not contain any code that requires heavy lifting.  Instead, that work should be placed in the kernel's constructor, invoked during initialization, so that any heavy processing occurs only when the user is ready for it.  Acquiring the delay buffer in the comb filter happens in the kernel's constructor, as indicated below, while the plug-in's constructor only sets up the initial parameter values and presets.

Comb Filter kernel constructor
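A rough sketch of what such a kernel constructor might look like.  The class shape, kMaxDelayMs, and the member names are assumptions for illustration (the real kernel derives from AUKernelBase in the Audio Unit SDK); the point is simply that the delay buffer is allocated once, at its maximum possible size, so later delay-time changes need no reallocation.

```cpp
#include <cstddef>
#include <vector>

static const double kMaxDelayMs = 100.0; // assumed maximum delay time

class CombFilterKernel {
public:
    explicit CombFilterKernel(double sampleRate)
        : mPos(0),
          // Allocate (and zero) the delay line at its maximum size up front.
          mDelayBuf((size_t)(kMaxDelayMs * 0.001 * sampleRate), 0.f)
    {
    }

    size_t capacity() const { return mDelayBuf.size(); }

private:
    size_t mPos;                  // circular-buffer cursor
    std::vector<float> mDelayBuf; // delay line, max-allocated
};
```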

Comb Filter base constructor

Parameters in Audio Units also differ from VST in that they are not forced to be floating-point values that the programmer is responsible for mapping for display in the UI.  Audio Units come with built-in categories for parameters, which let you declare minimum and maximum values in addition to a default value used when the plug-in instantiates.

Declaring parameters in GetParameterInfo()

Like VST, Audio Units contains a function called Reset() that is called whenever the user starts or stops playback.  This is where you would clear buffers or reset any variables needed to return the plug-in to an initialized state to avoid any clicks, pops, or artifacts when playback is resumed.

Performing clean-up in Reset()

Because a comb filter is essentially a form of delay, a circular buffer (mDelayBuf) is used to hold the delayed audio samples.  In real-time processing where the delay time can change, however, this has repercussions for the size of the buffer, which would normally be allocated to the exact number of samples needed to hold the data.  Rather than deallocating and reallocating the delay buffer every time the delay time changes (requiring repeated memory allocations), I allocate the buffer once at its maximum possible size, as given by the maximum value allowed for the delay time.  As the delay time changes, I track the effective size with the curBufSize variable, and it is this value that I use to wrap the buffer's cursor position (mPos).  This happens within the Process() function.

Comb Filter’s Process() function
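A simplified mono sketch of the processing just described, written as a free function rather than the actual AUKernelBase::Process() override.  The feedback comb recurrence y[n] = x[n] + g * y[n - D] and all parameter names here are illustrative, not the plug-in's exact code.

```cpp
#include <vector>

// Mono feedback comb: recompute the working size of the max-allocated
// circular buffer from the current delay time, keep the cursor inside
// it, then mix each input sample with the delayed output.
void combProcess(const float *inBuf, float *outBuf, int numFrames,
                 std::vector<float> &mDelayBuf, int &mPos,
                 float delaySeconds, float feedback, float sampleRate)
{
    int curBufSize = (int)(delaySeconds * sampleRate);
    if (curBufSize < 1) curBufSize = 1;
    if (curBufSize > (int)mDelayBuf.size()) curBufSize = (int)mDelayBuf.size();
    if (mPos >= curBufSize) mPos = 0; // cursor must not exceed current size

    for (int n = 0; n < numFrames; ++n) {
        float delayed = mDelayBuf[mPos];           // y[n - D]
        float out = inBuf[n] + feedback * delayed; // feedback comb
        mDelayBuf[mPos] = out;                     // store for future reads
        outBuf[n] = out;
        if (++mPos >= curBufSize) mPos = 0;        // wrap the cursor
    }
}
```

Feeding an impulse through this with a 4-sample delay produces the expected echo train at multiples of the delay, each scaled by the feedback amount.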

Every time Process() is called (which is every time the host sends a new block of samples to the plug-in), it updates the current size of the buffer and checks that mPos does not exceed it.  The unfortunate consequence of varying the delay time of an effect like this is that changing it in real time produces pops and artifacts: samples are lost or skipped over, and the resulting discontinuities are audible.  This could be remedied by implementing the comb filter as a variable delay, using interpolation to fill in the gaps when the delay time changes.  As it stands, however, the delay time is not practically suited for automation.

Yet another distinction of Audio Units is the requirement that a plug-in be validated before it can be used in a host.  Audio Units are managed by OS X's Component Manager, and this is where hosts check for Audio Unit plug-ins.  To validate an Audio Unit, a tool called "auval" is used.  This method has both pros and cons.  The testing procedure helps ensure a plug-in behaves well in a host: it shouldn't cause crashes or result in memory leaks.  While I doubt the method is foolproof, it is definitely useful for making sure your plug-in is well-behaved.

Correction: Audio Units no longer use the Component Manager in OS X 10.7+. Here is a technical note from Apple on adapting to the new AUPlugIn entry point.

The downside is that some hosts, especially Logic, can be really picky about which plug-ins they accept.  I had problems loading the Comb Filter plug-in for the simple reason that version numbers didn't match (since I was going back and forth between debug and release builds), so it failed Logic's validation process.  To remedy this, I had to remove the plug-in from its location in /Library/Audio/Plug-Ins/Components and then, after reinstalling it, open the AU Manager in Logic to force it to check the new version.  This got a little frustrating after repeatedly adding and removing versions of the plug-in for testing, especially since it passed auval successfully.  Fortunately, it is all up and running now!

Comb Filter plug-in in Logic 8

Finally, I’ll end this post with some examples of me “monkey-ing” around with the plug-in in Logic 8, using some of the factory presets I built into it.

Comb Filter, metallic ring preset

Comb Filter, light delay preset

Comb Filter, wax comb preset