Category Archives: Audio

Pure Data and libpd: Integrating with Native Code for Interactive Testing

Over the past couple of years, I’ve built up a nice library of DSP code, including effects, oscillators, and utilities. One thing that has always bothered me, however, is how to test this code in an efficient and reliable way. The two main methods I have used in the past have their pros and cons, but ultimately didn’t satisfy me.

One is to render an effect or a generated source into a wave file that I can open in an audio editor to listen to the result and examine the output. This method is okay, but it is tedious and doesn’t allow for real-time adjustment of parameters or any sort of instant feedback.

For effects like filters, I can also generate a text file containing the frequency/phase response data that I can view in a plotting application. This is useful in some ways, but this is audio — I want to hear it!

Lately I’ve gotten to know Pure Data a little more, so I thought about using it for interactive testing of my DSP modules. On its own, Pure Data does not interact with outside code, of course, but that’s where libpd comes in. This is a great library that wraps up much of Pure Data’s functionality so that you can use it right from your own code (it works with C, C++, Objective-C, Java, and others). Here is how I integrated it with my own code to set up a nice, flexible testing framework. This is just one application of using libpd and Pure Data together; the possibilities go far beyond this!

First we start with the Pure Data patches. The receiver patch is opened and maintained in code by libpd, and has two responsibilities: 1) generate a test tone that the effect is applied to, and 2) receive messages from the control patch and dispatch them to C++.

Receiver patch, opened by libpd.

The control patch is opened in Pure Data and acts as the interactive patch. It has controls for setting the frequency and volume of the synthesizer tone that acts as the source, as well as controls for the filter effect that is being tested.

Control patch, opened in Pure Data; it serves as the interactive UI for testing.

As can be seen from the patches above, they communicate with each other via the netsend/netreceive objects by opening a port on the local machine. Since I’m only sending simple data to the receiver patch, I opted for UDP rather than TCP as the network protocol. (Disclaimer: my knowledge of network programming is akin to asking “what is a for loop”.)

Hopefully the purpose of these two patches is clear, so we can now move on to seeing how libpd brings it all together in code. It is worth noting that libpd does not output audio to the hardware; it only processes the data. Pure Data itself, for example, commonly uses PortAudio to send the audio data to the sound card, but I will be using Core Audio instead. Additionally, I’m using the C++ wrapper from libpd.

An instance of PdBase is first created with the desired input/output channels and sample rate, and a struct holds data that we will need to hang on to; its purpose will become clear further on.

struct TestData
{
    AudioUnit outputUnit;
    EffectProc effectProc;

    PdBase* pd;
    Patch pdPatch;
    float* pdBuffer;
    int pdTicks;
    int pdSamplesPerBlock;

    CFRingBuffer<float> ringBuffer;
    int maxFramesPerSlice;
    int framesInReserve;
};

int main(int argc, const char * argv[])
{
    PdBase pd;
    pd.init(0, 2, 48000); // No input needed for tests.

    TestData testData;
    testData.pd = &pd;
    testData.pdPatch = pd.openPatch("receiver.pd", ".");
}

Next, we ask Core Audio for an output Audio Unit that we can use to send audio data to the sound card.

int main(int argc, const char * argv[])
{
    PdBase pd;
    pd.init(0, 2, 48000); // No input needed for tests.

    TestData testData;
    testData.pd = &pd;
    testData.pdPatch = pd.openPatch("receiver.pd", ".");

    {
        AudioComponentDescription outputcd = {0};
        outputcd.componentType = kAudioUnitType_Output;
        outputcd.componentSubType = kAudioUnitSubType_DefaultOutput;
        outputcd.componentManufacturer = kAudioUnitManufacturer_Apple;

        AudioComponent comp = AudioComponentFindNext(NULL, &outputcd);
        if (comp == NULL)
        {
            std::cerr << "Failed to find matching Audio Unit.\n";
            exit(EXIT_FAILURE);
        }

        OSStatus error;
        error = AudioComponentInstanceNew(comp, &testData.outputUnit);
        if (error != noErr)
        {
            std::cerr << "Failed to open component for Audio Unit.\n";
            exit(EXIT_FAILURE);
        }

        Float64 sampleRate = 48000;
        UInt32 dataSize = sizeof(sampleRate);
        error = AudioUnitSetProperty(testData.outputUnit,
                                     kAudioUnitProperty_SampleRate,
                                     kAudioUnitScope_Input,
                                     0, &sampleRate, dataSize);

        AudioUnitInitialize(testData.outputUnit);
    }
}

The next part needs some explanation, because we need to consider how the Pure Data patch interacts with Core Audio’s render callback function that we will provide. This function will be called continuously on a high priority thread with a certain number of frames that we need to fill with audio data. Pure Data, by default, processes 64 samples per channel per block. It’s unlikely that these two numbers (the number of frames that Core Audio wants and the number of frames processed by Pure Data) will always agree. For example, in my initial tests, Core Audio specified its maximum block size to be 512 frames, but it actually asked for 470 & 471 (alternating) when it ran. Rather than trying to force the two to match block sizes, I use a ring buffer as a medium between the two — that is, read sample data from the opened Pure Data patch into the ring buffer, and then read from the ring buffer into the buffers provided by Core Audio.

Fortunately, Core Audio can be queried for the maximum number of frames it will ask for, so this will determine the number of samples we read from the Pure Data patch. We can read a multiple of Pure Data’s 64-sample block by specifying a value for “ticks” in libpd, and this value will just be equal to the maximum frames from Core Audio divided by Pure Data’s block size. The actual number of samples read/processed will of course be multiplied by the number of channels (2 in this case for stereo).

The final point on this is to handle the case where the actual number of frames processed in a block is less than the maximum. Obviously it would only take a few blocks for the ring buffer’s write pointer to catch up with the read pointer and cause horrible audio artifacts. To account for this, I make the ring buffer twice as long as the number of samples required per block to give it some breathing room, and also keep track of the number of frames in reserve currently in the ring buffer at the end of each block. When this number exceeds the number of frames being processed in a block, no processing from the patch occurs, giving the ring buffer a chance to empty out its backlog of frames.

int main(int argc, const char * argv[])
{
    <snip> // As above.

    UInt32 framesPerSlice;
    UInt32 dataSize = sizeof(framesPerSlice);
    AudioUnitGetProperty(testData.outputUnit,
                         kAudioUnitProperty_MaximumFramesPerSlice,
                         kAudioUnitScope_Global,
                         0, &framesPerSlice, &dataSize);
    testData.pdTicks = framesPerSlice / pd.blockSize();
    testData.pdSamplesPerBlock = (pd.blockSize() * 2) * testData.pdTicks; // 2 channels for stereo output.
    testData.maxFramesPerSlice = framesPerSlice;

    AURenderCallbackStruct renderCallback;
    renderCallback.inputProc = AudioRenderProc;
    renderCallback.inputProcRefCon = &testData;
    AudioUnitSetProperty(testData.outputUnit,
                         kAudioUnitProperty_SetRenderCallback,
                         kAudioUnitScope_Input,
                         0, &renderCallback, sizeof(renderCallback));

    testData.pdBuffer = new float[testData.pdSamplesPerBlock];
    testData.ringBuffer.resize(testData.pdSamplesPerBlock * 2); // Twice as long as needed in order to give it some buffer room.
    testData.framesInReserve = 0; // No frames banked in the ring buffer yet.
    testData.effectProc = EffectProcess;
}

With the output Audio Unit and Core Audio now set up, let’s look at the render callback function. If needed, it reads audio data from the Pure Data patch into the ring buffer, which in turn fills the buffer list provided by Core Audio. The buffer list is then passed on to the callback that processes the effect being tested.

OSStatus AudioRenderProc (void *inRefCon,
                          AudioUnitRenderActionFlags *ioActionFlags,
                          const AudioTimeStamp *inTimeStamp,
                          UInt32 inBusNumber,
                          UInt32 inNumberFrames,
                          AudioBufferList *ioData)
    {
        TestData *testData = (TestData *)inRefCon;

        // No input channels are used, but libpd still expects a valid buffer.
        float inBuffer[1] = { 0.0f };

        // Only read from Pd patch if the sample excess is less than the number of frames being processed.
        // This effectively empties the ring buffer when it has enough samples for the current block, preventing the
        // write pointer from catching up to the read pointer.
        if (testData->framesInReserve < inNumberFrames)
        {
            testData->pd->processFloat(testData->pdTicks, inBuffer, testData->pdBuffer);
            for (int i = 0; i < testData->pdSamplesPerBlock; ++i)
            {
                testData->ringBuffer.write(testData->pdBuffer[i]);
            }
            testData->framesInReserve += (testData->maxFramesPerSlice - inNumberFrames);
        }
        else
        {
            testData->framesInReserve -= inNumberFrames;
        }

        // NOTE: Audio data from Pd patch is interleaved, whereas Core Audio buffers are non-interleaved.
        for (UInt32 frame = 0; frame < inNumberFrames; ++frame)
        {
            Float32 *data = (Float32 *)ioData->mBuffers[0].mData;
            data[frame] = testData->ringBuffer.read();
            data = (Float32 *)ioData->mBuffers[1].mData;
            data[frame] = testData->ringBuffer.read();
        }

        if (testData->effectProc != nullptr)
        {
            testData->effectProc(ioData, inNumberFrames);
        }

        return noErr;
    }

Finally, let’s see the callback function that processes the filter. It’s about as simple as it gets — it just processes the filter effect being tested on the audio signal that came from Pure Data.

void EffectProcess(AudioBufferList* audioData, UInt32 numberOfFrames)
{
    for (UInt32 frame = 0; frame < numberOfFrames; ++frame)
    {
        Float32 *data = (Float32 *)audioData->mBuffers[0].mData;
        data[frame] = filter.left.sample(data[frame]);
        data = (Float32 *)audioData->mBuffers[1].mData;
        data[frame] = filter.right.sample(data[frame]);
    }
}

Not quite done yet, though, since we need to subscribe the open libpd instance of Pure Data to the messages we want to receive from the control patch. The messages received will then be dispatched inside the C++ code to handle appropriate behavior.

int main(int argc, const char * argv[])
{
    <snip> // As above.

    pd.subscribe("fromPd_filterfreq");
    pd.subscribe("fromPd_filtergain");
    pd.subscribe("fromPd_filterbw");
    pd.subscribe("fromPd_filtertype");
    pd.subscribe("fromPd_quit");

    // Start audio processing.
    pd.computeAudio(true);
    AudioOutputUnitStart(testData.outputUnit);

    bool running = true;
    while (running)
    {
        while (pd.numMessages() > 0)
        {
            Message msg = pd.nextMessage();
            switch (msg.type)
            {
                case pd::PRINT:
                    std::cout << "PRINT: " << msg.symbol << "\n";
                    break;

                case pd::BANG:
                    std::cout << "BANG: " << msg.dest << "\n";
                    if (msg.dest == "fromPd_quit")
                    {
                        running = false;
                    }
                    break;

                case pd::FLOAT:
                    std::cout << "FLOAT: " << msg.num << "\n";
                    if (msg.dest == "fromPd_filterfreq")
                    {
                        filter.left.setFrequency(msg.num);
                        filter.right.setFrequency(msg.num);
                    }
                    else if (msg.dest == "fromPd_filtertype")
                    {
                        // (filterType is just an array containing the available filter types.)
                        filter.left.setState(filterType[(unsigned int)msg.num]);
                        filter.right.setState(filterType[(unsigned int)msg.num]);
                    }
                    else if (msg.dest == "fromPd_filtergain")
                    {
                        filter.left.setGain(msg.num);
                        filter.right.setGain(msg.num);
                    }
                    else if (msg.dest == "fromPd_filterbw")
                    {
                        filter.left.setBandwidth(msg.num);
                        filter.right.setBandwidth(msg.num);
                    }
                    break;

                default:
                    std::cout << "Unknown Pd message.\n";
                    std::cout << "Type: " << msg.type << ", " << msg.dest << "\n";
                    break;
            }
        }
    }
}

Once the test has ended by banging the stop_test button on the control patch, cleanup is as follows:

int main(int argc, const char * argv[])
{
    <snip> // As above.

    // Stop the output unit first so the render callback can't touch a closed patch.
    AudioOutputUnitStop(testData.outputUnit);

    pd.unsubscribeAll();
    pd.computeAudio(false);
    pd.closePatch(testData.pdPatch);

    AudioUnitUninitialize(testData.outputUnit);
    AudioComponentInstanceDispose(testData.outputUnit);

    delete[] testData.pdBuffer;

    return 0;
}

The raw synth tone in the receiver patch used as the test signal is actually built with the PolyBLEP oscillator I made and discussed in a previous post. So it’s also possible (and very easy) to compile custom Pure Data externals into libpd, and that’s pretty awesome! Here is a demonstration of what I’ve been talking about — testing a state-variable filter on a raw synth tone:

Pure Data & libpd Interactive Demo from Christian on Vimeo.

Custom Pure Data External: PolyBLEP Sawtooth Oscillator

I have recently gotten back into Pure Data in the interest of familiarizing myself more with it, and perhaps to integrate it with Unity for a future project. For anyone who may not be familiar with it, more info can be found here. It’s a great environment for building synthesizers and experimenting with a wide variety of audio processing techniques in a modular, graphically based way.

Oscillators are the foundation of synthesizers of course, and Pure Data comes with two modules that serve this purpose: osc~ for sine waves, and phasor~ for a sawtooth (it’s actually just a ramp in the range 0 – 1, so it needs a couple of minor additions to turn it into a proper sawtooth wave). However, the phasor~ module (not being band-limited) suffers from aliasing noise. This can be avoided either by passing it through an anti-aliasing filter, or by using the sinesum message to construct a sawtooth wave by summing together sine waves as per the Fourier theorem (i.e. creating a wavetable). Both of these methods have some drawbacks, though.

Sawtooth wave using Pure Data’s phasor~ module.

Constructing a sawtooth from a wavetable of 12 harmonics (source: http://en.flossmanuals.net/pure-data/).

A robust anti-aliasing filter is often computationally expensive, whereas wavetables sacrifice the quality of the waveform and are not truly band-limited across the oscillator’s entire range (unless the wavetable is divided into sections constructed with a decreasing number of harmonics as the frequency increases). A wavetable sawtooth typically lacks richness and depth due to the lack of harmonics in its lower range. These issues can be fixed by constructing the sawtooth using band-limited steps (BLEPs), which are based on band-limited impulse trains (BLITs), at the expense of some increased complexity in the algorithm. Fortunately, Pure Data allows for custom modules (externals), written in C, that can then be used in any patch just like a normal module.

The process behind this method is to construct a naive sawtooth wave using the equation

sample = (2 * (f / SR * t)) – 1

where f is the frequency (Hz), SR is the sample rate (Hz), and t is the running time index (in this case, the sample count). At the waveform’s discontinuities, we insert a PolyBLEP to round off the sharp corners, which is calculated by a low-order polynomial equation (hence “Poly”nomial Band-Limited Step). The polynomial equation is of the form

x² + 2x + 1

The equation for the PolyBLEP is based on the discussion of the topic in this KVR forum thread. The consensus in the thread is that the PolyBLEP is far superior to using wavetables, but sounds slightly duller than using minBLEPs (which are far more complicated still, and require precalculating the BLEP using an FFT before integrating it with the naive waveform). PolyBLEP oscillators strike a good balance between high quality, minimal aliasing noise, and reasonable complexity.
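
To make the idea more concrete, here is a rough sketch of a PolyBLEP sawtooth written in C# (the external itself is written in C, and names like PolyBlepSaw are just for illustration). It assumes a normalized phase in the range [0, 1) and subtracts the polynomial residual around the wrap point:

// Illustrative PolyBLEP sawtooth sketch; not the external's actual code.
public class PolyBlepSaw
{
    private double phase = 0.0;

    // Polynomial residual that rounds off the discontinuity.
    // t is the normalized phase, dt is the normalized frequency (f / SR).
    private static double PolyBlep (double t, double dt)
    {
        if (t < dt) {                  // just after the discontinuity
            t /= dt;
            return t + t - t * t - 1.0;
        } else if (t > 1.0 - dt) {     // just before the discontinuity
            t = (t - 1.0) / dt;
            return t * t + t + t + 1.0;
        }
        return 0.0;
    }

    public double NextSample (double frequency, double sampleRate)
    {
        double dt = frequency / sampleRate;
        double naive = 2.0 * phase - 1.0;            // naive sawtooth: 2 * (f / SR * t) - 1
        double sample = naive - PolyBlep(phase, dt); // smooth the step at the wrap point

        phase += dt;
        if (phase >= 1.0) phase -= 1.0;

        return sample;
    }
}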

Pure Data patch using the polyblep~ module for a sawtooth wave.

Here is a quick sample of the PolyBLEP sawtooth recorded from Pure Data. Of course, PolyBLEPs can be used with other waveforms as well, including triangle and square waves, but the sawtooth is a popular choice for synths due to its rich sound.

The GitHub page can be found here, with projects for Mac (Xcode 5) and Windows (Visual Studio 2010).

Dynamics Processing: Compressor/Limiter, part 3

In part 1 of this series of posts, I went over creating an envelope detector that detects both peak amplitude and RMS values. In part 2, I used it to create a compressor/limiter. There were two common features missing from that compressor plug-in, however, that I will go over in this final part: soft knee and lookahead. Also, as I have stated in the previous parts, this effect is being created with Unity in mind, but the theory and code are easily adaptable to other uses.

Let’s start with lookahead since it is very straightforward to implement. Lookahead is common in limiters and compressors because any non-zero attack/release times will cause the envelope to lag behind the audio due to the filtering, and as a result, the compressor won’t attenuate the part of the signal that the envelope actually corresponds to. This can be fixed by delaying the output of the audio so that it lines up with the signal’s envelope. The amount we delay the audio by is the lookahead time, so an extra field is needed in the compressor:

public class Compressor : MonoBehaviour
{
    [AudioSlider("Threshold (dB)", -60f, 0f)]
    public float threshold = 0f;		// in dB
    [AudioSlider("Ratio (x:1)", 1f, 20f)]
    public float ratio = 1f;
    [AudioSlider("Knee", 0f, 1f)]
    public float knee = 0.2f;
    [AudioSlider("Pre-gain (dB)", -12f, 24f)]
    public float preGain = 0f;			// in dB, amplifies the audio signal prior to envelope detection.
    [AudioSlider("Post-gain (dB)", -12f, 24f)]
    public float postGain = 0f;			// in dB, amplifies the audio signal after compression.
    [AudioSlider("Attack time (ms)", 0f, 200f)]
    public float attackTime = 10f;		// in ms
    [AudioSlider("Release time (ms)", 10f, 3000f)]
    public float releaseTime = 50f;		// in ms
    [AudioSlider("Lookahead time (ms)", 0, 200f)]
    public float lookaheadTime = 0f;	// in ms

    public ProcessType processType = ProcessType.Compressor;
    public DetectionMode detectMode = DetectionMode.Peak;

    private EnvelopeDetector[] m_EnvelopeDetector;
    private Delay m_LookaheadDelay;

    private delegate float SlopeCalculation (float ratio);
    private SlopeCalculation m_SlopeFunc;

    // Continued...

I won’t actually go over implementing the delay itself since it is very straightforward (it’s just a simple circular buffer delay line; a sketch follows below). The one thing I will say is that if you want the lookahead time to be modifiable in real time, the circular buffer needs to be initialized to the maximum length allowed by the lookahead time (in my case 200 ms), and then you need to keep track of the actual time/position in the buffer that will move based on the current delay time.
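
For reference, a minimal sketch of such a delay line might look like the following (this is not the exact Delay class from the project; the buffer is sized for the maximum delay and the current delay time can be changed on the fly):

// Minimal circular-buffer delay line sketch.
// Works on Unity's interleaved buffers by keeping the delay offset a multiple of the channel count.
public class Delay
{
    private float[] m_Buffer;
    private int m_NumChannels;
    private int m_WritePos;
    private int m_DelaySlots; // current delay in interleaved buffer slots

    public Delay (float maxDelayMs, float sampleRate, int numChannels)
    {
        m_NumChannels = numChannels;
        m_Buffer = new float[((int)(maxDelayMs / 1000f * sampleRate) + 1) * numChannels];
    }

    public void SetDelayTime (float delayMs, float sampleRate)
    {
        // Clamp so a runtime change can never exceed the allocated maximum.
        int slots = (int)(delayMs / 1000f * sampleRate) * m_NumChannels;
        m_DelaySlots = Mathf.Min(slots, m_Buffer.Length - m_NumChannels);
    }

    public void Process (float[] data, int numChannels)
    {
        for (int i = 0; i < data.Length; ++i) {
            int readPos = m_WritePos - m_DelaySlots;
            if (readPos < 0) readPos += m_Buffer.Length;

            m_Buffer[m_WritePos] = data[i]; // store the incoming sample
            data[i] = m_Buffer[readPos];    // replace it with the delayed one
            m_WritePos = (m_WritePos + 1) % m_Buffer.Length;
        }
    }
}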

The delay comes after the envelope is extracted from the audio signal and before the compressor gain is applied:

void OnAudioFilterRead (float[] data, int numChannels)
{
    // Calculate pre-gain & extract envelope
    // ...

    // Delay the incoming signal for lookahead.
    if (lookaheadTime > 0f) {
        m_LookaheadDelay.SetDelayTime(lookaheadTime, sampleRate);
        m_LookaheadDelay.Process(data, numChannels);
    }

    // Apply compressor gain
    // ...
}

That’s all there is to the lookahead, so now we turn our attention to the last feature. The knee of the compressor is the area around the threshold where compression kicks in. This can either be a hard knee (the compressor kicks in abruptly as soon as the threshold is reached) or a soft knee (compression is applied gradually over a region around the threshold, known as the knee width). Comparing the two in a plot illustrates the difference clearly.

Hard knee in black and soft knee in light blue (threshold is -24 dB).

There are two common ways of specifying the knee width. One is an absolute value in dB, and the other is as a factor of the threshold, as a value between 0 and 1. The latter is the one I’ve found to be most common, so it is what I’ll use. In the diagram above, for example, the threshold is -24 dB, so a knee value of 1.0 results in a knee width of 24 dB around the threshold. Like the lookahead feature, a new field will be required:

public class Compressor : MonoBehaviour
{
    [AudioSlider("Threshold (dB)", -60f, 0f)]
    public float threshold = 0f;		// in dB
    [AudioSlider("Ratio (x:1)", 1f, 20f)]
    public float ratio = 1f;
    [AudioSlider("Knee", 0f, 1f)]
    public float knee = 0.2f;
    [AudioSlider("Pre-gain (dB)", -12f, 24f)]
    public float preGain = 0f;			// in dB, amplifies the audio signal prior to envelope detection.
    [AudioSlider("Post-gain (dB)", -12f, 24f)]
    public float postGain = 0f;			// in dB, amplifies the audio signal after compression.
    [AudioSlider("Attack time (ms)", 0f, 200f)]
    public float attackTime = 10f;		// in ms
    [AudioSlider("Release time (ms)", 10f, 3000f)]
    public float releaseTime = 50f;		// in ms
    [AudioSlider("Lookahead time (ms)", 0, 200f)]
    public float lookaheadTime = 0f;	// in ms

    public ProcessType processType = ProcessType.Compressor;
    public DetectionMode detectMode = DetectionMode.Peak;

    private EnvelopeDetector[] m_EnvelopeDetector;
    private Delay m_LookaheadDelay;

    private delegate float SlopeCalculation (float ratio);
    private SlopeCalculation m_SlopeFunc;

    // Continued...

At the start of our process block (OnAudioFilterRead()), we set up for a possible soft knee compression:

float kneeWidth = threshold * knee * -1f; // Threshold is in dB and will always be either 0 or negative, so * by -1 to make positive.
float lowerKneeBound = threshold - (kneeWidth / 2f);
float upperKneeBound = threshold + (kneeWidth / 2f);

Still in the processing block, we calculate the compressor slope as normal according to the equation from part 2:

slope = 1 – (1 / ratio), for compression

slope = 1, for limiting

To calculate the actual soft knee, I will use linear interpolation. First I check if the knee width is > 0 for a soft knee. If it is, the slope value is scaled by the linear interpolation factor if the envelope value is within the knee bounds:

slope *= ((envValue – lowerKneeBound) / kneeWidth) * 0.5

The compressor gain is then determined using the same equation as before, except instead of calculating in relation to the threshold, we use the lower knee bound:

gain = slope * (lowerKneeBound – envValue)

The rest of the calculation is the same:

for (int i = 0, j = 0; i < data.Length; i+=numChannels, ++j) {
    envValue = AudioUtil.Amp2dB(envelopeData[0][j]);
    slope = m_SlopeFunc(ratio);

    if (kneeWidth > 0f && envValue > lowerKneeBound && envValue < upperKneeBound) { // Soft knee
        // Lerp the compressor slope value.
        // Slope is multiplied by 0.5 since the gain is calculated in relation to the lower knee bound for soft knee.
        // Otherwise, the interpolation's peak will be reached at the threshold instead of at the upper knee bound.
        slope *= ( ((envValue - lowerKneeBound) / kneeWidth) * 0.5f );
        gain = slope * (lowerKneeBound - envValue);
    } else { // Hard knee
        gain = slope * (threshold - envValue);
        gain = Mathf.Min(0f, gain);
    }

    gain = AudioUtil.dB2Amp(gain);

    for (int chan = 0; chan < numChannels; ++chan) {
        data[i+chan] *= (gain * postGainAmp);
    }
}

In order to verify that the soft knee is calculated correctly, it is best to plot the results. To do this I just created a helper method that calculates the compressor values for a range of input values from -90 dB to 0 dB (a sketch of such a helper is shown after the plots). Here is the plot of a compressor with a threshold of -12.5 dB, a 4:1 ratio, and a knee of 0.4:

Compressor with a threshold of -12.5 dB, 4:1 ratio, and knee of 0.4.

Of course this also works when the compressor is in limiter mode, which will result in a gentler application of the limiting effect.

Compressor in limiter mode with a threshold of -18 dB, and knee of 0.6.
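
The helper behind these plots can be as simple as sweeping a range of input levels through the same gain computation used in the audio callback; something along these lines (a sketch, not the exact method from the project, and PlotCompressorCurve is just an illustrative name):

// Sweeps input levels through the gain computer and writes "input dB, output dB"
// pairs to a CSV file that can be opened in a plotting application.
private void PlotCompressorCurve ()
{
    float kneeWidth = threshold * knee * -1f;
    float lowerKneeBound = threshold - (kneeWidth / 2f);
    float upperKneeBound = threshold + (kneeWidth / 2f);

    System.Text.StringBuilder sb = new System.Text.StringBuilder();

    for (float inputDb = -90f; inputDb <= 0f; inputDb += 0.5f) {
        float slope = m_SlopeFunc(ratio);
        float gain;

        if (kneeWidth > 0f && inputDb > lowerKneeBound && inputDb < upperKneeBound) { // Soft knee
            slope *= ((inputDb - lowerKneeBound) / kneeWidth) * 0.5f;
            gain = slope * (lowerKneeBound - inputDb);
        } else { // Hard knee
            gain = Mathf.Min(0f, slope * (threshold - inputDb));
        }

        sb.AppendLine(inputDb + "," + (inputDb + gain));
    }

    System.IO.File.WriteAllText("compressor_curve.csv", sb.ToString());
}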

That concludes this series on building a compressor/limiter component.

Dynamics Processing: Compressor/Limiter, part 2

In part 1 I detailed how I built the envelope detector that I will now use in my Unity compressor/limiter. To reiterate, the envelope detector extracts the amplitude contour of the audio that will be used by the compressor to determine when to compress the signal’s gain. The response of the compressor is determined by the attack time and the release time of the envelope, with higher values resulting in a smoother envelope, and hence, a gentler response in the compressor.

The compressor script is a MonoBehaviour component that can be attached to any GameObject. Here are the fields and corresponding inspector GUI:

public class Compressor : MonoBehaviour
{
    [AudioSlider("Threshold (dB)", -60f, 0f)]
    public float threshold = 0f;		// in dB
    [AudioSlider("Ratio (x:1)", 1f, 20f)]
    public float ratio = 1f;
    [AudioSlider("Knee", 0f, 1f)]
    public float knee = 0.2f;
    [AudioSlider("Pre-gain (dB)", -12f, 24f)]
    public float preGain = 0f;			// in dB, amplifies the audio signal prior to envelope detection.
    [AudioSlider("Post-gain (dB)", -12f, 24f)]
    public float postGain = 0f;			// in dB, amplifies the audio signal after compression.
    [AudioSlider("Attack time (ms)", 0f, 200f)]
    public float attackTime = 10f;		// in ms
    [AudioSlider("Release time (ms)", 10f, 3000f)]
    public float releaseTime = 50f;		// in ms
    [AudioSlider("Lookahead time (ms)", 0, 200f)]
    public float lookaheadTime = 0f;	// in ms

    public ProcessType processType = ProcessType.Compressor;
    public DetectionMode detectMode = DetectionMode.Peak;

    private EnvelopeDetector[] m_EnvelopeDetector;
    private Delay m_LookaheadDelay;

    private delegate float SlopeCalculation (float ratio);
    private SlopeCalculation m_SlopeFunc;
    
    // Continued...

Compressor/Limiter Unity inspector GUI.

The two most important parameters for a compressor are the threshold and the ratio values. When a signal exceeds the threshold, the compressor reduces the level of the signal by the given ratio. For example, if the threshold is -2 dB with a ratio of 4:1 and the compressor encounters a signal peak of +2 dB, the gain reduction will be 3 dB, resulting in a new signal level of -1 dB. The ratio effectively acts as a percentage: a 4:1 ratio means that the amount by which the signal exceeds the threshold will be reduced by 75% (1 – 1/4 = 0.75). The difference between the threshold and the signal peak (which is 4 dB in this example) is scaled by the ratio to arrive at the 3 dB reduction (4 * 0.75 = 3). When the ratio is ∞:1, the compressor becomes a limiter. The compressor’s output can be visualized by a plot of amplitude in vs. amplitude out:

Plot of amplitude in vs. amplitude out of a compressor with 4:1 ratio.

When the ratio is ∞:1, the resulting amplitude after the threshold would be a straight horizontal line in the above plot, effectively preventing any levels from exceeding the threshold. It can easily be seen how this then would exhibit the behavior of a limiter. From these observations, we can derive the equations we need for the compressor.

compressor gain = slope * (threshold – envelope value) if envelope value >= threshold, otherwise 0

slope = 1 – (1 / ratio), or for limiting, slope = 1

All amplitude values are in dB for these equations. We saw both of these equations earlier in the example I gave, and both are pretty straightforward. These elements can now be combined to make up the compressor/limiter. The Awake method is called as soon as the component is initialized in the scene.

 

void Awake ()
{
    if (processType == ProcessType.Compressor) {
        m_SlopeFunc = CompressorSlope;
    } else if (processType == ProcessType.Limiter) {
        m_SlopeFunc = LimiterSlope;
    }

    // Convert from ms to s.
    attackTime /= 1000f;
    releaseTime /= 1000f;

    // Handle stereo max number of channels for now.
    m_EnvelopeDetector = new EnvelopeDetector[2];
    m_EnvelopeDetector[0] = new EnvelopeDetector(attackTime, releaseTime, detectMode, sampleRate);
    m_EnvelopeDetector[1] = new EnvelopeDetector(attackTime, releaseTime, detectMode, sampleRate);
}

Here is the full compressor/limiter code in Unity’s audio callback method. When the component is placed on the same GameObject as the audio listener, the data array will contain the audio signal just prior to being sent to the system’s output.

void OnAudioFilterRead (float[] data, int numChannels)
{
    float postGainAmp = AudioUtil.dB2Amp(postGain);

    if (preGain != 0f) {
        float preGainAmp = AudioUtil.dB2Amp(preGain);
        for (int k = 0; k < data.Length; ++k) {
            data[k] *= preGainAmp;
        }
    }

    float[][] envelopeData = new float[numChannels][];

    if (numChannels == 2) {
        float[][] channels;
        AudioUtil.DeinterleaveBuffer(data, out channels, numChannels);
        m_EnvelopeDetector[0].GetEnvelope(channels[0], out envelopeData[0]);
        m_EnvelopeDetector[1].GetEnvelope(channels[1], out envelopeData[1]);
        for (int n = 0; n < envelopeData[0].Length; ++n) {
            envelopeData[0][n] = Mathf.Max(envelopeData[0][n], envelopeData[1][n]);
        }
    } else if (numChannels == 1) {
        m_EnvelopeDetector[0].GetEnvelope(data, out envelopeData[0]);
    } else {
        // Error...
    }

    m_Slope = m_SlopeFunc(ratio);

    for (int i = 0, j = 0; i < data.Length; i+=numChannels, ++j) {
        m_Gain = m_Slope * (threshold - AudioUtil.Amp2dB(envelopeData[0][j]));
        m_Gain = Mathf.Min(0f, m_Gain);
        m_Gain = AudioUtil.dB2Amp(m_Gain);
        for (int chan = 0; chan < numChannels; ++chan) {
            data[i+chan] *= (m_Gain * postGainAmp);
        }
    }
}

And quickly, here is the helper method for deinterleaving a multichannel buffer:

public static void DeinterleaveBuffer (float[] source, out float[][] output, int sourceChannels)
{
    int channelLength = source.Length / sourceChannels;

    output = new float[sourceChannels][];

    for (int i = 0; i < sourceChannels; ++i) {
        output[i] = new float[channelLength];

        for (int j = 0; j < channelLength; ++j) {
            output[i][j] = source[j*sourceChannels+i];
        }
    }
}

First off, there are a few utility functions included in the component that convert between linear amplitude and dB values, which can be seen in the method above. Pre-gain is applied to the audio signal prior to extracting the envelope. For multichannel audio, Unity unfortunately gives us an interleaved buffer, so this needs to be deinterleaved before sending it to the envelope detector (recall that the detector uses a recursive filter and thus has state variables; this could of course be handled differently in the envelope detector, but it’s simpler to work on single continuous data buffers).
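
The conversion helpers themselves aren’t shown in the post, but they boil down to the standard 20 * log10(amplitude) relationship between amplitude and decibels; something like this (a sketch of their likely shape, not the actual code):

// Likely shape of the amplitude/decibel conversion helpers referenced above.
public static class AudioUtil
{
    public static float dB2Amp (float dB)
    {
        return Mathf.Pow(10f, dB / 20f);
    }

    public static float Amp2dB (float amp)
    {
        // Guard against log(0) on silent samples.
        return 20f * Mathf.Log10(Mathf.Max(amp, 1e-9f));
    }
}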

When working with multichannel audio, each channel will have a unique envelope. These could of course be processed separately, but that would result in the relative levels between the channels being disturbed. Instead, I take the maximum envelope value of the two channels and use that for the compressor. Another option would be to take the average of the two.

I then calculate the slope value based on whether the component is set to compressor or limiter mode (via a function delegate). The following loop is just realizing the equations posted earlier, and converting the dB gain value to linear amplitude before applying it to the audio signal along with post-gain.

This completes the compressor/limiter component. However, there are two important elements missing: soft knee processing, and lookahead. From the plot earlier in the post, we see that once the signal reaches the threshold, the compressor kicks in rather abruptly. This point is called the knee of the compressor, and if we want this transition to happen more gently, we can interpolate within a zone around the threshold.

It’s common, especially in limiters, to have a lookahead feature that compensates for the obvious lag of the envelope detector. In other words, when the attack and release times are non-zero, the resulting envelope lags behind the audio signal as a result of the filtering. The compressor/limiter will actually miss attenuating the peaks in the signal that it needs to because of this lag. That’s where lookahead comes in. In truth, it’s a bit of a misnomer because we obviously can’t see into the future of an audio signal, but we can delay the audio to achieve the same effect. This means that we extract the envelope as normal, but delay the audio output so that the compressor gain value lines up with the audio peaks it is meant to attenuate.

I will be implementing these two remaining features in a future post.

Dynamics processing: Compressor/Limiter, part 1

Lately I’ve been busy developing an audio-focused game in Unity, whose built-in audio engine is notorious for being extremely basic and lacking in features. (As of this writing, Unity 5, which overhauls the entire built-in audio engine, has not yet been released.) For this project I have created all the DSP effects myself as script components, whose behavior is driven by Unity’s coroutines. In order to have slightly more control over the final mix of these elements, it became clear that I needed a compressor/limiter. This particular post is written with Unity/C# in mind, but the theory and code are easy enough to adapt to other uses. In this first part we’ll be looking at writing the envelope detector, which is needed by the compressor to do its job.

An envelope detector (also called a follower) extracts the amplitude envelope from an audio signal based on three parameters: an attack time, release time, and detection mode. The attack/release times are fairly straightforward, simply defining how quickly the detection responds to rising and falling amplitudes. There are typically two modes of calculating the envelope of a signal: by its peak value or its root mean square value. A signal’s peak value is just the instantaneous sample value while the root mean square is measured over a series of samples, and gives a more accurate account of the signal’s power. The root mean square is calculated as:

rms = sqrt( (1/n) * (x₁² + x₂² + … + xₙ²) ),

where n is the number of data values. In other words, we sum together the squares of all the sample values in the buffer, find the average by dividing by n, and then take the square root. In audio processing, however, we normally bound the sample size (n) to some fixed number (called windowing). This effectively means that we calculate the RMS value over the past n samples.

(As an aside, multiplying by 1/n effectively assigns equal weights to all the terms, making it a rectangular window. Other window equations can be used instead which would favor terms in the middle of the window. This results in even greater accuracy of the RMS value since brand new samples (or old ones at the end of the window) have less influence over the signal’s power.)
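
As a quick illustration of the basic (rectangular-window) form of that equation, before the optimized running-total version that appears later in this post, a direct windowed RMS might look like this (an illustrative sketch, not code from the project):

// Direct windowed RMS over the last windowSize samples ending at endIndex (rectangular window).
public static float WindowedRms (float[] buffer, int endIndex, int windowSize)
{
    float sum = 0f;
    int count = Mathf.Min(windowSize, endIndex + 1);

    for (int i = endIndex - count + 1; i <= endIndex; ++i) {
        sum += buffer[i] * buffer[i];
    }

    return Mathf.Sqrt(sum / count);
}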

Now that we’ve seen the two modes of detecting a signal’s envelope, we can move on to look at the role of the attack/release times. These values are used in calculating coefficients for a first-order recursive filter (also called a leaky integrator) that processes the values we get from the audio buffer (through one of the two detection methods). Simply stated, we get the sample values from the audio signal and pass them through a low-pass filter to smooth out the envelope.

We calculate the coefficients using the time-constant equation:

g = e ^ ( -1 / (time * sample rate) ),

where time is in seconds, and sample rate in Hz. Once we have our gain coefficients for attack/release, we put them into our leaky integrator equation:

out = in + g * (out – in),

where in is the input sample we detected from the incoming audio, g is either the attack or release gain, and out is the envelope sample value. Here it is in code:

public void GetEnvelope (float[] audioData, out float[] envelope)
{
    envelope = new float[audioData.Length];

    m_Detector.Buffer = audioData;

    for (int i = 0; i < audioData.Length; ++i) {
        float envIn = m_Detector[i];

        if (m_EnvelopeSample < envIn) {
            m_EnvelopeSample = envIn + m_AttackGain * (m_EnvelopeSample - envIn);
        } else {
            m_EnvelopeSample = envIn + m_ReleaseGain * (m_EnvelopeSample - envIn);
        }

        envelope[i] = m_EnvelopeSample;
    }
}

(Source: code is based on “Envelope detector” from http://www.musicdsp.org/archive.php?classid=2#97, with detection modes added by me.)
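
The attack and release gains used in the loop above come straight from the time-constant equation; a small sketch of how they might be set (the method name here is just for illustration):

// Derive the attack/release coefficients from g = e^(-1 / (time * sampleRate)).
public void SetAttackReleaseTimes (float attackSeconds, float releaseSeconds, float sampleRate)
{
    // A time of zero means the envelope should respond instantly (coefficient of 0).
    m_AttackGain = (attackSeconds > 0f) ? Mathf.Exp(-1f / (attackSeconds * sampleRate)) : 0f;
    m_ReleaseGain = (releaseSeconds > 0f) ? Mathf.Exp(-1f / (releaseSeconds * sampleRate)) : 0f;
}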

The envelope sample is calculated based on whether the current audio sample is rising or falling, with the envIn sample resulting from one of the two detection modes. This is implemented similarly to what is known as a functor in C++. I prefer this method to having another branching structure inside the loop because among other things, it’s more extensible and results in cleaner code (as well as being modular). It could be implemented using delegates/function pointers, but the advantage of a functor is that it retains its own state, which is useful for the RMS calculation as we will see. Here is how the interface and classes are declared for the detection modes:

public interface IEnvelopeDetection
{
    float[] Buffer { set; get; }
    float this [int index] { get; }

    void Reset ();
}

We then have two classes that implement this interface, one for each mode:


public class DetectPeak : IEnvelopeDetection
{
    private float[] m_Buffer;

    /// <summary>
    /// Sets the buffer to extract envelope data from. The original buffer data is held by reference (not copied).
    /// </summary>
    public float[] Buffer
    {
        set { m_Buffer = value; }
        get { return m_Buffer; }
    }

    /// <summary>
    /// Returns the envelope data at the specified position in the buffer.
    /// </summary>
    public float this [int index]
    {
        get { return Mathf.Abs(m_Buffer[index]); }
    }

    public DetectPeak () {}
    public void Reset () {}
}

This particular class involves a rather trivial operation of just returning the absolute value of a signal’s sample. The RMS detection class is more involved.

/// <summary>
/// Calculates and returns the root mean square value of the buffer. A circular buffer is used to simplify the calculation, which avoids
/// the need to sum up all the terms in the window each time.
/// </summary>
public float this [int index]
{
    get {
        float sampleSquared = m_Buffer[index] * m_Buffer[index];
        float total = 0f;
        float rmsValue;

        if (m_Iter < m_RmsWindow.Length-1) {
            total = m_LastTotal + sampleSquared;
            rmsValue = Mathf.Sqrt((1f / (index+1)) * total);
        } else {
            total = m_LastTotal + sampleSquared - m_RmsWindow.Read();
            rmsValue = Mathf.Sqrt((1f / m_RmsWindow.Length) * total);
        }

        m_RmsWindow.Write(sampleSquared);
        m_LastTotal = total;
        m_Iter++;

        return rmsValue;
    }
}

public DetectRms ()
{
    m_Iter = 0;
    m_LastTotal = 0f;
    // Set a window length to an arbitrary 128 for now.
    m_RmsWindow = new RingBuffer<float>(128);
}

public void Reset ()
{
    m_Iter = 0;
    m_LastTotal = 0f;
    m_RmsWindow.Clear(0f);
}

The RMS calculation in this class is an optimization of the general equation I stated earlier. Instead of continually summing together all the values in the window for each new sample, a ring buffer is used to save each new term. Since there is only ever one new term to include in the calculation, we can store the squared sample values in the ring buffer and subtract the oldest one from our running total. We are just left with a multiply and square root, instead of having to redundantly add together 128 terms (or however big n is). An iterator variable ensures that the state of the detector remains consistent across successive audio blocks.
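
The RingBuffer<float> used by the RMS detector isn’t shown in the post; a minimal fixed-size circular buffer providing the Read, Write, and Clear operations used above could look like this (a sketch, not the project’s actual class):

// Minimal fixed-size circular buffer with the operations the RMS detector relies on.
public class RingBuffer<T>
{
    private T[] m_Buffer;
    private int m_ReadPos;
    private int m_WritePos;

    public RingBuffer (int length)
    {
        m_Buffer = new T[length];
    }

    public int Length { get { return m_Buffer.Length; } }

    public void Write (T value)
    {
        m_Buffer[m_WritePos] = value;
        m_WritePos = (m_WritePos + 1) % m_Buffer.Length;
    }

    public T Read ()
    {
        T value = m_Buffer[m_ReadPos];
        m_ReadPos = (m_ReadPos + 1) % m_Buffer.Length;
        return value;
    }

    public void Clear (T value)
    {
        for (int i = 0; i < m_Buffer.Length; ++i) {
            m_Buffer[i] = value;
        }
        m_ReadPos = 0;
        m_WritePos = 0;
    }
}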

In the envelope detector class, the detection mode is selected by assigning the corresponding class to the ivar:

public class EnvelopeDetector
{
    protected float m_AttackTime;
    protected float m_ReleaseTime;
    protected float m_AttackGain;
    protected float m_ReleaseGain;
    protected float m_SampleRate;
    protected float m_EnvelopeSample;

    protected DetectionMode m_DetectMode;
    protected IEnvelopeDetection m_Detector;

    // Continued...
public DetectionMode DetectMode
{
    get { return m_DetectMode; }
    set {
        m_DetectMode = value;
        switch (m_DetectMode) {
            case DetectionMode.Peak:
                m_Detector = new DetectPeak();
                break;

            case DetectionMode.Rms:
                m_Detector = new DetectRms();
                break;
        }
    }
}

Now that we’ve looked at extracting the envelope from an audio signal, we will look at using it to create a compressor/limiter component to be used in Unity. That will be upcoming in part 2.

Beat Synchronization in Unity

Update

Due to some valuable advice (courtesy of Tazman-audio), I’ve made a few small changes that ensure that synchronization stays independent of framerate. My original strategy for handling this issue was to grab the current sample of the audio source’s playback and compare that to the next expected beat’s sample value (discussed in more detail below). Although this was working fine, Unity’s documentation makes little mention of the accuracy of this value, aside from it being preferable to using Time.time.  Furthermore, the initial synch between the start of audio playback and the BeatCheck function would suffer from some, albeit very small, discrepancy.

Here is the change to the Start method in the “BeatSynchronizer” script that enforces synching with the start of the audio:

public float bpm = 120f; // Tempo in beats per minute of the audio clip.
public float startDelay = 1f; // Number of seconds to delay the start of audio playback.
public delegate void AudioStartAction(double syncTime);
public static event AudioStartAction OnAudioStart;

void Start ()
{
    double initTime = AudioSettings.dspTime;
    audio.PlayScheduled(initTime + startDelay);
    if (OnAudioStart != null) {
        OnAudioStart(initTime + startDelay);
    }
}

The PlayScheduled method starts the audio clip’s playback at the absolute time (on the audio system’s dsp timeline) given in the function argument. The correct start time is then this initial value plus the given delay. This same value is then broadcast to all the beat counters that have subscribed to the AudioStartAction event, which ensures their alignment with the audio.

This necessitated a small change to the BeatCheck method as well, as can be seen below.  The current sample is now calculated using the audio system’s dsp time instead of the clip’s sample position, which also alleviated the need for wrapping the current sample position when the audio clip loops.

IEnumerator BeatCheck ()
{
    while (audioSource.isPlaying) {
        currentSample = (float)AudioSettings.dspTime * audioSource.clip.frequency;

        if (currentSample >= (nextBeatSample + sampleOffset)) {
            foreach (GameObject obj in observers) {
                obj.GetComponent<BeatObserver>().BeatNotify(beatType);
            }
            nextBeatSample += samplePeriod;
        }

        yield return new WaitForSeconds(loopTime / 1000f);
    }
}

Lastly, I decided to add a nice feature to the beat synchronizer that allows you to scale up the beat values by an integer constant. This is very useful for cases where you might want to synch to beats that transcend one measure. For example, you could synchronize to the downbeat of the second measure of a four-measure group by selecting the following values in the inspector:

Scaling up the beat values by a factor of 4 treats each beat as a measure instead of a single beat (assuming 4/4 time).

This same feature exists for the pattern counter as well, allowing a great deal of flexibility and control over what you can synchronize to.  There is a new example scene in the project demonstrating this.

Github project here.

I did, however, come across a possible bug in the PlayScheduled function: a short burst of noise can be heard occasionally when running a scene. I’ve encountered this both in the Unity editor (version 4.3.3) and in the build product. This does not happen when starting the audio using Play or checking “Play On Awake”.

Original Post

Lately I’ve been experimenting and brainstorming different ways in which audio can be tied in with gameplay, or even drive gameplay to some extent. This is quite challenging because audio/music is so abstract, but rhythm is one element that has been successfully incorporated into gameplay for some time.  To experiment with this in Unity, I wrote a set of scripts that handle beat synchronization to an audio clip.  The github project can be found here.

The way I set this up to work is by comparing the current sample of the audio data to the sample of the next expected beat to occur.  Another approach would be to compare the time values, but this is less accurate and less flexible.  Sample accuracy ensures that the game logic follows the actual audio data, and avoids the issues of framerate drops that can affect the time values.

The following script handles the synchronization of all the beat counters to the start of audio playback:

public float bpm = 120f; // Tempo in beats per minute of the audio clip.
public float startDelay = 1f; // Number of seconds to delay the start of audio playback.
public delegate void AudioStartAction();
public static event AudioStartAction OnAudioStart;

void Start ()
{
    StartCoroutine(StartAudio());
}

IEnumerator StartAudio ()
{
    yield return new WaitForSeconds(startDelay);

    audio.Play();

    if (OnAudioStart != null) {
        OnAudioStart();
    }
}

To accomplish this, each beat counter instance adds itself to the event OnAudioStart, seen here in the “BeatCounter” script:

void OnEnable ()
{
    BeatSynchronizer.OnAudioStart += () => { StartCoroutine(BeatCheck()); };
}

When OnAudioStart is called above, all beat counters that have subscribed to this event are invoked, and in this case, starts the coroutine BeatCheck that contains most of the logic and processing of determining when beats occur. (The () => {} statement is C#’s lambda syntax).

The BeatCheck coroutine runs at a specific frequency given by loopTime, instead of running each frame in the game loop. For example, if a high degree of accuracy isn’t required, this can save on the CPU load by setting the coroutine to run every 40 or 50 milliseconds instead of the 10 – 15 milliseconds that it may take for each frame to execute in the game loop.  However, since the coroutine yields to WaitForSeconds (see below), setting the loop time to 0 will effectively cause the coroutine to run as frequently as the game loop since execution of the coroutine in this case happens right after Unity’s Update method.

IEnumerator BeatCheck ()
{
    while (audioSource.isPlaying) {
        currentSample = audioSource.timeSamples;

        // Reset next beat sample when audio clip wraps.
        if (currentSample < previousSample) {
            nextBeatSample = 0f;
        }

        if (currentSample >= (nextBeatSample + sampleOffset)) {
            foreach (GameObject obj in observers) {
                obj.GetComponent<BeatObserver>().BeatNotify(beatType);
            }
            nextBeatSample += samplePeriod;
        }
        
        previousSample = currentSample;

        yield return new WaitForSeconds(loopTime / 1000f);
    }
}

Furthermore, the fields that count the sample positions and next sample positions are declared as floats, which may seem wrong at first since there is no possibility of fractional samples.  However, the sample period (the number of samples between each beat in the audio) is calculated from the BPM of the audio and the note value of the beat to check, so it is likely to result in a floating point value. In other words:

samplePeriod = (60 / (bpm * beatValue)) * sampleRate

where beatValue is a constant that defines the ratio of the beat to a quarter note.  For instance, for an eighth beat, beatValue = 2 since there are two eighths in a quarter.  For a dotted quarter beat, beatValue = 1 / 1.5; the ratio of one quarter to a dotted quarter.

If samplePeriod is truncated to an int, drift would occur due to loss of precision when comparing the sample values, especially for longer clips of music.
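
As a concrete example of that calculation (just an illustration of the arithmetic, not code from the project):

// Sample periods for a few beat values at 120 BPM and a 44100 Hz clip.
float bpm = 120f;
float sampleRate = 44100f;

float quarterPeriod = (60f / (bpm * 1f)) * sampleRate;                // quarter note: 22050 samples
float eighthPeriod = (60f / (bpm * 2f)) * sampleRate;                 // eighth note: 11025 samples
float dottedQuarterPeriod = (60f / (bpm * (1f / 1.5f))) * sampleRate; // dotted quarter: 33075 samples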

When it is determined that a beat has occurred in the audio, the script notifies its observers along with the type of beat that triggered the event (the beat type is a user-defined value that allows different action to be taken depending on the beat type).  The observers (any Unity object) are easily added through the script’s inspector panel:

The beat counter’s inspector panel.

Each object observing a beat counter also contains a beat observer script that serves two functions: it allows control over the tolerance/sensitivity of the beat event, and it sets the corresponding bit in a bit mask for the beat that just occurred, which the user can poll in the object’s script to take appropriate action.

public void BeatNotify (BeatType beatType)
{
    beatMask |= beatType;
    StartCoroutine(WaitOnBeat(beatType));
}

IEnumerator WaitOnBeat (BeatType beatType)
{
    yield return new WaitForSeconds(beatWindow / 1000f);
    beatMask ^= beatType;
}

To illustrate how a game object might respond to and take action when a beat occurs, the following script activates an animation trigger on the down-beat and rotates the object during an up-beat by 45 degrees:

void Update ()
{
    if ((beatObserver.beatMask & BeatType.DownBeat) == BeatType.DownBeat) {
        anim.SetTrigger("DownBeatTrigger");
    }
    if ((beatObserver.beatMask & BeatType.UpBeat) == BeatType.UpBeat) {
        transform.Rotate(Vector3.forward, 45f);
    }
}

Finally, here is a short video demonstrating the example scene set up in the project:

Beat Synchronization in Unity Demo from Christian on Vimeo.

AdVerb: Building a Reverb Plug-In Using Modulating Comb Filters

Some time ago, I began exploring the early reverb algorithms of Schroeder and Moorer, whose work dates back all the way to the 1960s and 70s respectively.  Still, their designs and theories inform the making of algorithmic reverbs today.  Recently I took it upon myself to continue experimenting with the Moorer design I left off with in an earlier post.  This resulted in the complete reverb plug-in “AdVerb”, which is available for free in downloads.  Let me share what went into designing and implementing this effect.

One of the foremost challenges in basing a reverb design on Schroeder or Moorer is that it tends to sound a little metallic, because with the number of comb filters suggested, the echo density doesn’t build up quickly or densely enough.  The all-pass filters in series that come after the comb filter section help to diffuse the reverb tail, but I found that the delaying all-pass filters added a little metallic sound of their own.  One obvious way of overcoming this is to add more comb filters (today’s computers can certainly handle it).  More importantly, however, the delay times of the comb filters need to be mutually prime so that their frequency responses don’t overlap, which would result in increased beating in the reverb tail.

To arrive at my values for the 8 comb filters I’m using, I wrote a simple little script that calculated the greatest common divisor between the delay times I chose and made sure that the results were all 1.  This required a little bit of tweaking of the numbers; as you can imagine, finding 8 mutually prime values is not as easy as it sounds, especially when trying to keep the range between them minimal.  It’s not as important for the two all-pass filters to be mutually prime because they are in series, not in parallel like the comb filters.
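
The check itself is simple: make sure every pair of comb delay lengths has a greatest common divisor of 1. A sketch of the idea (written in C# here for illustration; this is not the original script):

// Returns true if every pair of delay lengths (in samples) is mutually prime.
public static bool AllMutuallyPrime (int[] delayLengths)
{
    for (int i = 0; i < delayLengths.Length; ++i) {
        for (int j = i + 1; j < delayLengths.Length; ++j) {
            if (Gcd(delayLengths[i], delayLengths[j]) != 1) {
                return false;
            }
        }
    }
    return true;
}

// Standard Euclidean algorithm for the greatest common divisor.
public static int Gcd (int a, int b)
{
    while (b != 0) {
        int t = b;
        b = a % b;
        a = t;
    }
    return a;
}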

I also discovered, after a number of tests, that the tap delay used to generate the early reflections (based on Moorer’s design) was causing some problems in my sound.  I’m still a bit unsure as to why; it could have been poorly chosen tap delay times or something to do with mixing, but it was enough that I decided to discard the tap delay network and just focus on comb filters and all-pass filters.  It was then that I took an idea from Dattorro and Frenette, who both showed how the use of modulated comb/all-pass filters can help smear the echo density and add warmth to the reverb.  Dattorro is responsible for the well-known plate reverbs that use modulating all-pass filters in series.

The idea behind a modulated delay line is that some oscillator (usually a low-frequency sine wave) modulates the delay value according to a frequency rate and amplitude.  This is actually the basis for chorusing and flanging effects.  In a reverb, however, the values need to be kept very small so that the chorusing effect will not be evident.

I had fun experimenting with these modulated delay lines, and so I eventually decided to modulate one of the all-pass filters as well and give control of it to the user, which offers a great deal more fun and crazy ways to use this plug-in.  Let’s take a look at the modulated all-pass filter (the modulated comb filter is very similar).  We already know what an all-pass filter looks like, so here’s just the modulated delay line:

Modulated all-pass filter.

The oscillator modulates the value currently in the delay line that we then use to interpolate, resulting in the actual value.  In code it looks like this:

double offset, read_offset, fraction, next;
size_t read_pos;

offset = (delay_length / 2.) * (1. + sin(phase) * depth);
phase += phase_incr;
if (phase > TWO_PI) phase -= TWO_PI;
if (offset > delay_length) offset = delay_length;

read_offset = ((size_t)delay_buffer->p - (size_t)delay_buffer->p_head) / sizeof(double) - offset;
if (read_offset < 0) {
    read_offset = read_offset + delay_length;
} else if (read_offset > delay_length) {
    read_offset = read_offset - delay_length;
}

read_pos = (size_t)read_offset;
fraction = read_offset - read_pos;
if (read_pos != delay_length - 1) {
    next = *(delay_buffer->p_head + read_pos + 1);
} else {
    next = *delay_buffer->p_head;
}

return *(delay_buffer->p_head + read_pos) + fraction * (next - *(delay_buffer->p_head + read_pos));

In case that looks a little daunting, we’ll step through the C code (apologies for the pointer arithmetic!).  At the top we calculate the offset using the delay length in samples as our base point.  The following lines are easily seen as incrementing and wrapping the phase of the oscillator as well as capping the offset to the delay length.

The next line calculates the current position in the buffer from the current position pointer, p, and the buffer head, p_head.  This is accomplished by casting the pointer addresses to integral values and dividing by the size of the data type of each buffer element.  The read_offset position will determine where in the delay buffer we read from, so it needs to be clamped to the buffer’s length as well.

The rest is simply linear interpolation (albeit with some pointer arithmetic: delay_buffer->p_head + read_pos + 1 is equivalent to delay_buffer[read_pos + 1]).  Once we have our modulated delay value, we can finish processing the all-pass filter:

delay_val = get_modulated_delay_value(allpass_filter);

// don't write the modulated delay_val into the buffer, only use it for the output sample
*delay_buffer->p = sample_in + (*delay_buffer->p * allpass_filter->g);
sample_out = delay_val - (allpass_filter->g * sample_in);

The final topology of the reverb is given below:

Topology of the AdVerb plug-in.

The pre-delay is implemented by a simple delay line, and the low-pass filters are of the one-pole IIR variety.  Putting the LPFs inside the comb filters’ feedback loops simulates the absorption of energy that sound undergoes as it comes in contact with surfaces and travels through air.  This factor can be controlled with a damping parameter in the plug-in.
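
To illustrate how the damping works, here is a sketch of a feedback comb filter with a one-pole low-pass sitting inside its feedback path (written in C# for illustration; the plug-in itself is in C, and this is not its actual code):

// Feedback comb filter with a one-pole low-pass in the feedback loop.
public class DampedCombFilter
{
    private float[] m_DelayLine;
    private int m_WritePos;
    private float m_LpState;

    public DampedCombFilter (int delayLengthSamples)
    {
        m_DelayLine = new float[delayLengthSamples];
    }

    public float Process (float input, float feedback, float damping)
    {
        float delayed = m_DelayLine[m_WritePos];

        // The low-pass inside the loop absorbs a bit of high-frequency energy on every pass,
        // mimicking the losses to air and surfaces that the damping parameter models.
        m_LpState = delayed + damping * (m_LpState - delayed);

        m_DelayLine[m_WritePos] = input + feedback * m_LpState;
        m_WritePos = (m_WritePos + 1) % m_DelayLine.Length;

        return delayed;
    }
}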

The moving-average filter is there for an extra bit of high-frequency roll-off, and I chose it because this particular filter is an FIR type and has linear phase, so it won’t add further disturbance to the modulated samples entering it.  The last (normal) all-pass filter in the series serves to add extra diffusion to the reverb tail.

Here are some short sound samples using a selection of presets included in the plug-in:

Piano, “Medium Room” preset

The preceding sample demonstrates a normal reverb setting.  Following are a few samples that demonstrate a couple of subtle and not-so-subtle effects:

Piano, “Make it Vintage” preset

Piano, “Bad Grammar” preset

Flute, “Shimmering Tail” preset

Feel free to get in touch regarding any questions or comments on “AdVerb“.