Tag Archives: low-pass

Dynamics processing: Compressor/Limiter, part 1

Lately I’ve been busy developing an audio-focused game in Unity, whose built-in audio engine is notorious for being extremely basic and lacking in features. (As of this writing, Unity 5, in which the entire built-in audio engine is being overhauled, has not yet been released.) For this project I have created all the DSP effects myself as script components, whose behavior is driven by Unity’s coroutines. In order to have slightly more control over the final mix of these elements, it became clear that I needed a compressor/limiter. This particular post is written with Unity/C# in mind, but the theory and code are easy enough to adapt to other uses. In this first part we’ll be looking at writing the envelope detector, which is needed by the compressor to do its job.

An envelope detector (also called a follower) extracts the amplitude envelope from an audio signal based on three parameters: an attack time, release time, and detection mode. The attack/release times are fairly straightforward, simply defining how quickly the detection responds to rising and falling amplitudes. There are typically two modes of calculating the envelope of a signal: by its peak value or its root mean square value. A signal’s peak value is just the instantaneous sample value while the root mean square is measured over a series of samples, and gives a more accurate account of the signal’s power. The root mean square is calculated as:

$$\mathrm{rms} = \sqrt{\frac{1}{n}\left(x_1^2 + x_2^2 + \dots + x_n^2\right)},$$

where n is the number of data values. In other words, we sum together the squares of all the sample values in the buffer, find the average by dividing by n, and then take the square root. In audio processing, however, we normally bound the sample size (n) to some fixed number (called windowing). This effectively means that we calculate the RMS value over the past n samples.

(As an aside, multiplying by 1/n effectively assigns equal weights to all the terms, making it a rectangular window. Other window equations can be used instead which would favor terms in the middle of the window. This results in even greater accuracy of the RMS value since brand new samples (or old ones at the end of the window) have less influence over the signal’s power.)

Now that we’ve seen the two modes of detecting a signal’s envelope, we can move on to look at the role of the attack/release times. These values are used in calculating coefficients for a first-order recursive filter (also called a leaky integrator) that processes the values we get from the audio buffer (through one of the two detection methods). Simply stated, we get the sample values from the audio signal and pass them through a low-pass filter to smooth out the envelope.

We calculate the coefficients using the time-constant equation:

g = e ^ ( -1 / (time * sample rate) ),

where time is in seconds, and sample rate in Hz. Once we have our gain coefficients for attack/release, we put them into our leaky integrator equation:

out = in + g * (out – in),

where in is the input sample we detected from the incoming audio, g is either the attack or release gain, and out is the envelope sample value. Here it is in code:

public void GetEnvelope (float[] audioData, out float[] envelope)
{
    envelope = new float[audioData.Length];

    m_Detector.Buffer = audioData;

    for (int i = 0; i < audioData.Length; ++i) {
        float envIn = m_Detector[i];

        if (m_EnvelopeSample < envIn) {
            m_EnvelopeSample = envIn + m_AttackGain * (m_EnvelopeSample - envIn);
        } else {
            m_EnvelopeSample = envIn + m_ReleaseGain * (m_EnvelopeSample - envIn);
        }

        envelope[i] = m_EnvelopeSample;
    }
}

(Source: code is based on “Envelope detector” from http://www.musicdsp.org/archive.php?classid=2#97, with detection modes added by me.)
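As a rough cross-check of the two equations above, the time-constant formula and the leaky integrator can be sketched outside Unity. The snippet below is in C++ rather than C#, uses peak detection only, and the names `timeToGain` and `followEnvelope` are illustrative assumptions rather than part of the actual component.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// g = e^(-1 / (time * sampleRate)), with time in seconds and sampleRate in Hz.
double timeToGain(double timeSeconds, double sampleRate) {
    return std::exp(-1.0 / (timeSeconds * sampleRate));
}

// Peak-mode envelope follower: out = in + g * (out - in),
// using the attack gain while the signal rises and the release gain otherwise.
std::vector<double> followEnvelope(const std::vector<double>& audio,
                                   double attackGain, double releaseGain) {
    std::vector<double> envelope(audio.size());
    double env = 0.0;  // envelope state carried from sample to sample
    for (std::size_t i = 0; i < audio.size(); ++i) {
        const double in = std::fabs(audio[i]);
        const double g = (env < in) ? attackGain : releaseGain;
        env = in + g * (env - in);
        envelope[i] = env;
    }
    return envelope;
}
```

With a 10 ms attack at 44.1 kHz the time constant spans 441 samples, so the envelope of a unit step should have risen to roughly 63% (1 - 1/e) after that many samples.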

The envelope sample is calculated based on whether the current audio sample is rising or falling, with the envIn sample resulting from one of the two detection modes. This is implemented similarly to what is known as a functor in C++. I prefer this method to having another branching structure inside the loop because among other things, it’s more extensible and results in cleaner code (as well as being modular). It could be implemented using delegates/function pointers, but the advantage of a functor is that it retains its own state, which is useful for the RMS calculation as we will see. Here is how the interface and classes are declared for the detection modes:

public interface IEnvelopeDetection
{
    float[] Buffer { set; get; }
    float this [int index] { get; }

    void Reset ();
}

We then have two classes that implement this interface, one for each mode:

public class DetectPeak : IEnvelopeDetection
{
    private float[] m_Buffer;

    /// <summary>
    /// Sets the buffer to extract envelope data from. The original buffer data is held by reference (not copied).
    /// </summary>
    public float[] Buffer
    {
        set { m_Buffer = value; }
        get { return m_Buffer; }
    }

    /// <summary>
    /// Returns the envelope data at the specified position in the buffer.
    /// </summary>
    public float this [int index]
    {
        get { return Mathf.Abs(m_Buffer[index]); }
    }

    public DetectPeak () {}
    public void Reset () {}
}

This particular class involves a rather trivial operation of just returning the absolute value of a signal’s sample. The RMS detection class is more involved.

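public class DetectRms : IEnvelopeDetection
{
    private float[] m_Buffer;
    private RingBuffer<float> m_RmsWindow;
    private float m_LastTotal;
    private int m_Iter;

    // The Buffer property is the same as in DetectPeak and is omitted here.

    /// <summary>
    /// Calculates and returns the root mean square value of the buffer. A circular buffer is used to simplify the calculation, which avoids
    /// the need to sum up all the terms in the window each time.
    /// </summary>
    public float this [int index]
    {
        get {
            float sampleSquared = m_Buffer[index] * m_Buffer[index];
            float total = 0f;
            float rmsValue;

            if (m_Iter < m_RmsWindow.Length-1) {
                // Window not yet full: average over the samples seen so far.
                total = m_LastTotal + sampleSquared;
                rmsValue = Mathf.Sqrt((1f / (m_Iter+1)) * total);
            } else {
                // Window full: drop the oldest squared sample from the running total.
                total = m_LastTotal + sampleSquared - m_RmsWindow.Read();
                rmsValue = Mathf.Sqrt((1f / m_RmsWindow.Length) * total);
            }

            // Save the new term and advance the iterator so the state carries across blocks.
            // (Write is an assumed method name; the RingBuffer class is not shown in this post.)
            m_RmsWindow.Write(sampleSquared);
            m_LastTotal = total;
            m_Iter++;

            return rmsValue;
        }
    }

    public DetectRms ()
    {
        m_Iter = 0;
        m_LastTotal = 0f;
        // Set a window length to an arbitrary 128 for now.
        m_RmsWindow = new RingBuffer<float>(128);
    }

    public void Reset ()
    {
        m_Iter = 0;
        m_LastTotal = 0f;
    }
}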

The RMS calculation in this class is an optimization of the general equation I stated earlier. Instead of continually summing together all the values in the window for each new sample, a ring buffer is used to save each new term. Since there is only ever 1 new term to include in the calculation, it can be simplified by storing all the squared sample values in the ring buffer and subtracting the oldest one from our previous total. We are just left with a multiply and square root, instead of having to redundantly add together 128 terms (or however big n is). An iterator variable ensures that the state of the detector remains consistent across successive audio blocks.
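Because the ring buffer class itself is not listed in this post, here is a minimal self-contained sketch of the same running-sum idea in C++. The class name `SlidingRms` and its layout are assumptions for illustration (not the Unity component), and it assumes a window size greater than zero.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Running RMS over the last `windowSize` samples. A sum of squares is kept
// so each new sample costs one add, one subtract, and one square root.
class SlidingRms {
public:
    explicit SlidingRms(std::size_t windowSize)
        : squares_(windowSize, 0.0), writePos_(0), count_(0), total_(0.0) {}

    double process(double sample) {
        const double squared = sample * sample;
        // Remove the term about to be overwritten (zero until the window is full).
        total_ += squared - squares_[writePos_];
        squares_[writePos_] = squared;
        writePos_ = (writePos_ + 1) % squares_.size();
        if (count_ < squares_.size()) ++count_;
        // Guard against a tiny negative total caused by floating-point rounding.
        if (total_ < 0.0) total_ = 0.0;
        return std::sqrt(total_ / static_cast<double>(count_));
    }

    void reset() {
        squares_.assign(squares_.size(), 0.0);
        writePos_ = 0;
        count_ = 0;
        total_ = 0.0;
    }

private:
    std::vector<double> squares_;  // squared samples currently inside the window
    std::size_t writePos_;         // slot holding the oldest term
    std::size_t count_;            // samples seen so far, capped at the window size
    double total_;                 // running sum of the squares in the window
};
```

Since the state lives inside the object, calling `process` across successive audio blocks gives the same result as one long block, which is the property the iterator variable provides in the C# version.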

In the envelope detector class, the detection mode is selected by assigning an instance of the corresponding class to the member variable:

public class EnvelopeDetector
{
    protected float m_AttackTime;
    protected float m_ReleaseTime;
    protected float m_AttackGain;
    protected float m_ReleaseGain;
    protected float m_SampleRate;
    protected float m_EnvelopeSample;

    protected DetectionMode m_DetectMode;
    protected IEnvelopeDetection m_Detector;

    // Continued...
    public DetectionMode DetectMode
    {
        get { return m_DetectMode; }
        set {
            // Store the new mode first so the switch acts on it rather than on the old value.
            m_DetectMode = value;
            switch(m_DetectMode) {
                case DetectionMode.Peak:
                    m_Detector = new DetectPeak();
                    break;

                case DetectionMode.Rms:
                    m_Detector = new DetectRms();
                    break;
            }
        }
    }
}

Now that we’ve looked at extracting the envelope from an audio signal, we will look at using it to create a compressor/limiter component to be used in Unity. That will be upcoming in part 2.

AdVerb: Building a Reverb Plug-In Using Modulating Comb Filters

Some time ago, I began exploring the early reverb algorithms of Schroeder and Moorer, whose work dates back all the way to the 1960s and 70s respectively.  Still their designs and theories inform the making of algorithmic reverbs today.  Recently I took it upon myself to continue experimenting with the Moorer design I left off with in an earlier post.  This resulted in the complete reverb plug-in “AdVerb”, which is available for free in downloads.  Let me share what went into designing and implementing this effect.

One of the foremost challenges in basing a reverb design on Schroeder or Moorer is that it tends to sound a little metallic because, with the number of comb filters suggested, the echo density doesn’t build up fast or dense enough.  The all-pass filters in series that come after the comb filter section help to diffuse the reverb tail, but I found that the delaying all-pass filters added a little metallic sound of their own.  One obvious way of overcoming this is to add more comb filters (today’s computers can certainly handle it).  More importantly, however, the delay times of the comb filters need to be mutually prime so that their frequency responses don’t overlap, which would result in increased beating in the reverb tail.

To arrive at my values for the 8 comb filters I’m using, I wrote a simple little script that calculated the greatest common divisor between all the delay times I chose and made sure that the results were 1.  This required a little bit of tweaking in the numbers; as you can imagine, finding 8 coprimes is not as easy as it sounds, especially when trying to keep the range minimal between them.  It’s not as important for the two all-pass filters to be mutually prime because they are in series, not in parallel like the comb filters.
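The script itself isn’t included here, but a check of this kind could look roughly like the following C++ sketch.  The function names are my own for this illustration, and the delay values mentioned afterwards are made up, not the ones used in AdVerb.

```cpp
#include <cstddef>
#include <vector>

// Euclid's algorithm for the greatest common divisor.
unsigned greatestCommonDivisor(unsigned a, unsigned b) {
    while (b != 0) {
        const unsigned r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// True when every pair of delay lengths (in samples) has a GCD of 1.
bool pairwiseCoprime(const std::vector<unsigned>& delays) {
    for (std::size_t i = 0; i < delays.size(); ++i) {
        for (std::size_t j = i + 1; j < delays.size(); ++j) {
            if (greatestCommonDivisor(delays[i], delays[j]) != 1) {
                return false;
            }
        }
    }
    return true;
}
```

For example, the hypothetical set {1009, 1013, 1019} passes the check, while {1000, 1013, 1250} fails because 1000 and 1250 share the factor 250.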

I also discovered, after a number of tests, that the tap delay used to generate the early reflections (based on Moorer’s design) was causing some problems in my sound.  I’m still a bit unsure as to why, though it could be poorly chosen tap delay times or something to do with mixing, but it was enough so that I decided to discard the tap delay network and just focus on comb filters and all-pass filters.  It was then that I took an idea from Dattorro and Frenette who both showed how the use of modulated comb/all-pass filters can help smear the echo density and add warmth to the reverb.  Dattorro is responsible for the well-known plate reverbs that use modulating all-pass filters in series.

The idea behind a modulated delay line is that some oscillator (usually a low-frequency sine wave) modulates the delay value according to a frequency rate and amplitude.  This is actually the basis for chorusing and flanging effects.  In a reverb, however, the values need to be kept very small so that the chorusing effect will not be evident.

I had fun experimenting with these modulated delay lines, and so I eventually decided to modulate one of the all-pass filters as well and give control of it to the user, which offers a great deal more fun and crazy ways to use this plug-in.  Let’s take a look at the modulated all-pass filter (the modulated comb filter is very similar).  We already know what an all-pass filter looks like, so here’s just the modulated delay line:

Modulated all-pass filter.

The oscillator modulates the value currently in the delay line that we then use to interpolate, resulting in the actual value.  In code it looks like this:

double offset, read_offset, fraction, next;
size_t read_pos;

offset = (delay_length / 2.) * (1. + sin(phase) * depth);
phase += phase_incr;
if (phase > TWO_PI) phase -= TWO_PI;
if (offset > delay_length) offset = delay_length;

read_offset = ((size_t)delay_buffer->p - (size_t)delay_buffer->p_head) / sizeof(double) - offset;
if (read_offset < 0) {
    read_offset = read_offset + delay_length;
} else if (read_offset > delay_length) {
    read_offset = read_offset - delay_length;
}

read_pos = (size_t)read_offset;
fraction = read_offset - read_pos;
if (read_pos != delay_length - 1) {
    next = *(delay_buffer->p_head + read_pos + 1);
} else {
    next = *delay_buffer->p_head;
}

return *(delay_buffer->p_head + read_pos) + fraction * (next - *(delay_buffer->p_head + read_pos));

In case that looks a little daunting, we’ll step through the C code (apologies for the pointer arithmetic!).  At the top we calculate the offset using the delay length in samples as our base point.  The following lines are easily seen as incrementing and wrapping the phase of the oscillator as well as capping the offset to the delay length.

The next line calculates the current position in the buffer from the current position pointer, p, and the buffer head, p_head.  This is accomplished by casting the pointer addresses to integral values and dividing their difference by the size of the data type of each buffer element.  The read_offset position will determine where in the delay buffer we read from, so it needs to be wrapped to stay within the buffer’s length as well.

The rest is simply linear interpolation (albeit with some pointer arithmetic: *(delay_buffer->p_head + read_pos + 1) is equivalent to delay_buffer->p_head[read_pos + 1]).  Once we have our modulated delay value, we can finish processing the all-pass filter:

delay_val = get_modulated_delay_value(allpass_filter);

// don't write the modulated delay_val into the buffer, only use it for the output sample
*delay_buffer->p = sample_in + (*delay_buffer->p * allpass_filter->g);
sample_out = delay_val - (allpass_filter->g * sample_in);

The final topology of the reverb is given below:

Topology of the AdVerb plug-in.

The pre-delay is implemented by a simple delay line, and the low-pass filters are of the one-pole IIR variety.  Putting the LPFs inside the comb filters’ feedback loops simulates the absorption of energy that sound undergoes as it comes in contact with surfaces and travels through air.  This factor can be controlled with a damping parameter in the plug-in.

The moving-average filter is there for an extra bit of high frequency roll-off, and I chose it because this particular filter is an FIR type and has linear phase so it won’t add further disturbance to the modulated samples entering it.  The last (normal) all-pass filter in the series serves to add extra diffusion to the reverb tail.

Here are some short sound samples using a selection of presets included in the plug-in:

Piano, “Medium Room” preset

The preceding sample demonstrates a normal reverb setting.  Following are a few samples that demonstrate a couple of subtle and not-so-subtle effects:

Piano, “Make it Vintage” preset

Piano, “Bad Grammar” preset

Flute, “Shimmering Tail” preset

Feel free to get in touch regarding any questions or comments on “AdVerb”.

Algorithmic Reverbs: The Moorer Design

And we’re back to talk about reverberation.  Previously I introduced the Schroeder reverb design that used four comb filters in parallel that then fed two all-pass filters in series.  This signal would then be mixed with the original dry audio to produce the output.  This design was one of the very first in the digital domain, yet still provides the foundation for many of the algorithmic reverbs used today.  James A. Moorer was one of the first to expand and improve upon Schroeder’s design in the late seventies and was able to implement some of the suggestions and theories put forth by Schroeder that would enhance digital reverb.

One of these was the use of a tapped delay line to simulate early reflections, which are of crucial importance in the perception of acoustic space, more so than the late reflections. This tapped delay line that forms the basis of the early reflections can contain delay times and a gain structure that could be modelled on a measured acoustic space, like a concert hall for instance.  In fact, Moorer did just that, and in his article “About This Reverberation Business” in the Computer Music Journal, he offers up a 19-tap delay line that was taken from a geometric simulation of the Boston Symphony Hall.  Here are those values put into an array for my implementation (I omit the first tap because it has a delay time of 0 with a gain of 1, which is just the original signal):

Tap delay time and gain values along with values for the comb filters and the LP filters
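To make the structure concrete, here is a minimal C++ sketch of how a tapped delay line like this could be processed.  The `Tap` and `TappedDelayLine` names are assumptions for this illustration, no tap values are filled in (they would come from the array described above), and every tap delay is assumed to be no longer than `maxDelaySamples`.

```cpp
#include <cstddef>
#include <vector>

// One early reflection: a delay in samples and the gain applied to it.
struct Tap {
    std::size_t delaySamples;
    double gain;
};

class TappedDelayLine {
public:
    TappedDelayLine(std::size_t maxDelaySamples, const std::vector<Tap>& taps)
        : buffer_(maxDelaySamples + 1, 0.0), taps_(taps), writePos_(0) {}

    // Writes the input sample, then sums the direct signal with each delayed tap.
    double process(double in) {
        buffer_[writePos_] = in;
        double out = in;  // the omitted first tap: delay 0, gain 1
        for (std::size_t i = 0; i < taps_.size(); ++i) {
            const std::size_t readPos =
                (writePos_ + buffer_.size() - taps_[i].delaySamples) % buffer_.size();
            out += taps_[i].gain * buffer_[readPos];
        }
        writePos_ = (writePos_ + 1) % buffer_.size();
        return out;
    }

private:
    std::vector<double> buffer_;  // circular buffer of past input samples
    std::vector<Tap> taps_;
    std::size_t writePos_;
};
```

Feeding a single impulse through it returns the impulse itself followed by one scaled copy per tap, which is exactly the early reflection pattern the taps describe.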

Another improvement Moorer made to his design was to include a simple first-order low-pass filter in the feedback loop of the six comb filters to simulate the absorption effects of air.  He goes on to discuss how the intensity of sound depends on the atmospheric conditions it travels through, such as humidity and temperature, as well as on the frequency of the sound and the distance from the source.  The values I came up with for the low-pass filters are experimental, though so far they seem to work well.  I’m not yet sure exactly how to approximate the cutoff frequencies of these filters based on the data Moorer presented about the loss of energy that happens with sound as it travels, so more research will be needed in this area.  However, I’m also fine with deriving my own values and adjusting them to fit my needs of an acceptable sound.

We may recall the simple algorithm that implements a comb filter from the previous post; now, with a low-pass filter in the loop, it looks like this:

Comb Filter with a first-order IIR low-pass filter in the feedback loop
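For readers without the original listing, a comb filter of this kind might be sketched in C++ as follows.  This is an illustration under my own assumptions rather than the code pictured above: the `LowpassComb` name is hypothetical, the output is taken from the delayed signal, and the one-pole low-pass is normalized to unity gain at DC (so the loop stays stable for a feedback gain below 1), which is not necessarily the exact form Moorer gives.

```cpp
#include <cstddef>
#include <vector>

// Feedback comb filter with a one-pole low-pass filter inside the loop.
// The low-pass is applied after the feedback gain.
class LowpassComb {
public:
    LowpassComb(std::size_t delaySamples, double feedbackGain, double lpCoeff)
        : buffer_(delaySamples, 0.0), pos_(0),
          g_(feedbackGain), a_(lpCoeff), lpState_(0.0) {}

    double process(double in) {
        const double delayed = buffer_[pos_];
        // One-pole low-pass on the scaled feedback:
        // y[n] = (1 - a) * x[n] + a * y[n-1], with 0 <= a < 1.
        lpState_ = (1.0 - a_) * (g_ * delayed) + a_ * lpState_;
        buffer_[pos_] = in + lpState_;
        pos_ = (pos_ + 1) % buffer_.size();
        return delayed;
    }

private:
    std::vector<double> buffer_;  // circular delay line
    std::size_t pos_;             // current read/write position
    double g_;                    // feedback gain
    double a_;                    // low-pass coefficient (0 disables the filtering)
    double lpState_;              // previous low-pass output
};
```

With the low-pass coefficient set to 0 this reduces to the plain comb filter, and raising it makes each successive echo darker as well as quieter.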

A little more experimentation can be done here too in placing the low-pass filter in an optimal position in the loop.  Here I am calculating the LP filter after the feedback gain is applied, though I’ve seen it being applied to the original signal prior to it entering the feedback loop as well.  Placing the LP filter in a good spot could potentially open up the possibility of controlling the brightness of the late reflections of the reverb in a meaningful way.

We now have a fairly complete picture of the Moorer design, illustrated below.

The Moorer Reverb Design

The last little detail has to do with the delay line in the late reflections network.  This ensures that the late reflections arrive at the output just a little after the early reflections. With a multitude of values, from delay lengths and gains, to how to mix all these elements together, it’s clear that reverb design is a combination of both science and art, and why it remains as one of the foremost challenges in DSP.

Now it follows that we do some listening, so here are some audio samples of the Moorer Reverberator.  The values used are for the most part Moorer’s own, but as was discussed earlier, the frequency cutoffs of the LP filters are my own, as is the delay time of the delay line in the late reflections network.  As an extension of this I have been tweaking the values proposed by Moorer as well as looking into other ways to modify this design to perhaps come up with my own reverb unit, but I’m sticking pretty close to Moorer’s design for this little show-and-tell.

Guitar strum with 1.4 second delay time at 27% wet mix

Guitar strum with 2.4 second delay time at 40% wet mix

Original guitar strum recording

The effect of the LP filter is quite noticeable in comparison to the Schroeder reverberation applied to the same audio file in that particular blog posting.  The overall effect on this soundfile is fairly subtle; however, this is not necessarily a bad thing as it adds just a little sense of acoustic space to the sound.  The good thing about using this soundfile to test on is the long decay.  It is often here that we can hear the faults in a digital reverberator because the decay is otherwise masked in the more dense and active sections of audio.  We need to be careful to avoid “pumping” sounds or “puffing” in the decay tail of a reverb, and this is sometimes the fault of the all-pass filter as noted by Moorer.  The benefit of using an all-pass filter in the late reverberation network is to diffuse the late echoes, but its effect on the phase of the signal can be disruptive if the values for delay time and gain are not carefully chosen.  Moorer suggests a value of 6ms for delay time at a gain value of around 0.7.

Piano riff with 1.6 second delay time at 24% wet mix

Piano riff with 3.6 second delay time at 50% wet mix

Original piano riff recording

With a more percussive sound like the piano or drums, we have to be careful to avoid creating a discernible echo in the early reflections as this won’t sound natural.  At a lower mix setting and relatively short delay time, this doesn’t seem to be too much of a problem in the above examples, but in the more extreme case of the 3.6 second delay, the reverb doesn’t hold up.  The decay feels unnatural and there is coloration on the sound.  There are few reverbs, however, that adhere to the one-size-fits-all model, and perhaps the Moorer design is a little more applicable to shorter reverb lengths.  But there is more experimentation to be done.  More tweaking.  Moorer did propose that additional filters could be inserted to further help shape the reverb decay and account for high frequency absorption and distance, and in experimenting around with all the numbers in the equation, perhaps some really interesting things will happen.

Digital Reverberation

In continuing to explore the many areas of digital signal processing, reverb has cropped up many times as an area of great interest, so I’ve decided to dedicate a series of future posts to this topic.  I’m going to start at the beginning, looking at Schroeder’s design, the first digital reverberator solution, and proceed forward looking at how its design was improved upon by Moorer, leading eventually to Feedback Delay Networks (FDN) and other types of artificial reverbs.  All of these stages will include actual implementation, with code/algorithms, and possibly some plug-ins as a result.  However, my goal is not to develop any kind of high-end, competitive product at this point, as some commercial reverb algorithms are closely guarded secrets.  Moreover, digital reverb remains one of the foremost challenges in DSP.  This process will, however, provide greater understanding of digital audio in addition to honing my skills in DSP coding and design.

Reverberation is of course just a dense series of echoes.  There is also a loss of energy in particular frequency ranges that depends on the material the sound bounces off of.  When all the complexities of natural reverb are accounted for, calculations to simulate this reach into the hundreds of billions or more per second!  Human ears cannot perceive the full complexity of natural reverb, however, so this makes the calculations required much more manageable for many reverb designs (convolution is still very computationally expensive, though).

One of the fundamental building blocks of digital reverb is the comb filter, which Schroeder used in his design.  It circulates a signal through a delay line, adding the delayed version, scaled with a constant, g, to the original.

Comb filter design

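The constant g is given by the formula:

$$g = 0.001^{\tau / RVT},$$

where τ is the delay time, or loop time, of the comb filter and RVT is the reverb time desired, which is defined as the time it takes for the delayed signal to reach -60dB (considered silence).  A level of -60dB corresponds to an amplitude factor of 0.001, which is where the constant in the formula comes from.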
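As a small worked example: in RVT seconds the signal goes around the loop RVT/τ times, and a drop of 60dB is an amplitude factor of 0.001, so g raised to the power RVT/τ must equal 0.001.  Solving for g gives 0.001 raised to the power τ/RVT.  The C++ helper below simply evaluates that expression; the name `combGain` is my own for this sketch.

```cpp
#include <cmath>

// Feedback gain for a comb filter with loop time tau (seconds) so that the
// signal has decayed by 60 dB (a factor of 0.001) after RVT seconds:
// g^(RVT / tau) = 0.001  =>  g = 0.001^(tau / RVT).
double combGain(double loopTimeSeconds, double reverbTimeSeconds) {
    return std::pow(0.001, loopTimeSeconds / reverbTimeSeconds);
}
```

For a loop time of 29.7 msec and a reverb time of 1 second this gives a gain of roughly 0.81; a longer reverb time pushes the gain closer to 1.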

When analyzing the impulse response of natural reverberation, however, we see many dense series of echoes that are unequally spaced and have apparently random amplitudes.  Additionally, the echoes become more diffuse as the amplitudes decrease as the delayed signals build up in the space.  This leads to one of the most important properties of good reverb design, which is the diffusion of the delayed signal’s echoes; in other words, it would be unnatural to hear individual pulses as the signal becomes reverberated.  Schroeder proposed the use of four comb filters (in parallel) as one of his solutions to this problem, each with its own distinct loop time.  To further ensure the diffusion of echoes, the four loop times should be relatively prime, otherwise the delayed signals would match up too frequently in phase to create a pumping or puffing sound, especially noticeable in the decay.

Another important property of reverb is for the decay to be exponential.  This is satisfied by the comb filter, as can be seen in the above diagram, whereby the impulse response will start out at 1 (assuming an impulse at amplitude 1) and is then successively scaled by g, then g^2, g^3, etc.

To further thicken up the sound of his reverberator, Schroeder fed the summed signals from the four comb filters through two all-pass filters in series.  These filters allow all frequencies to pass, but alter the phase of varying frequencies.  Their design is very much like a comb filter but with a feed-forward section, as can be seen below.

All-pass filter design

The two all-pass filters Schroeder uses also have their own unique loop times just as the comb filters do.  Unlike the comb filters, however, the reverb time specified for the all-pass filters is different because their purpose is to thicken and diffuse the echoes of the signal, not to apply additional reverberation.

Schroeder accompanied his design with suggested values to simulate a concert hall.  These values are given below (source: Dodge & Jerse, “Computer Music”, pg. 301):

Values for Schroeder’s Reverberator, simulating a concert hall

The RVT value of the comb filters is variable and can be specified by the user, but is normally on the order of 1.0 second.

The actual implementation of these two filters is fairly straightforward in C++.  The code is given below:

Code implementing a comb filter

Code implementing an all-pass filter
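Since the two listings above were images, here is a minimal C++ sketch of what such filters can look like.  It is an illustration under my own assumptions, not a transcription of the original code: the class names are hypothetical, delays are given in samples, the comb filter’s output is taken from the delayed signal, and the all-pass uses the standard Schroeder form.

```cpp
#include <cstddef>
#include <vector>

// Feedback comb filter: y[n] = x[n - D] + g * y[n - D].
class CombFilter {
public:
    CombFilter(std::size_t delaySamples, double g)
        : buffer_(delaySamples, 0.0), pos_(0), g_(g) {}

    double process(double in) {
        const double delayed = buffer_[pos_];
        buffer_[pos_] = in + g_ * delayed;  // feed the scaled echo back in
        pos_ = (pos_ + 1) % buffer_.size();
        return delayed;
    }

private:
    std::vector<double> buffer_;
    std::size_t pos_;
    double g_;
};

// Schroeder all-pass filter: y[n] = -g * x[n] + x[n - D] + g * y[n - D].
class AllPassFilter {
public:
    AllPassFilter(std::size_t delaySamples, double g)
        : buffer_(delaySamples, 0.0), pos_(0), g_(g) {}

    double process(double in) {
        const double delayed = buffer_[pos_];  // w[n - D]
        const double w = in + g_ * delayed;    // w[n] = x[n] + g * w[n - D]
        buffer_[pos_] = w;
        pos_ = (pos_ + 1) % buffer_.size();
        return delayed - g_ * w;               // feed-forward section
    }

private:
    std::vector<double> buffer_;
    std::size_t pos_;
    double g_;
};
```

An impulse through the comb filter comes back every D samples scaled by 1, g, g^2 and so on, which is the exponential decay described earlier; the all-pass produces a similar train of echoes while leaving the magnitude of every frequency unchanged.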

Now let’s look at some audio samples to hear how this all sounds.  All the code was written by me, including implementation of the comb filters and all-pass filters as well as the mix.  Furthermore, I implemented a wet/dry option into the mixing stage as well as an output level due to the fact that the processed audio can increase in levels quite a bit depending on the source audio.  As far as mixing goes, at its most basic it is just adding signals together, but when mixing several audio buffers (as in the four parallel comb filters) it is a good idea to scale each sample by a factor of 1/N, where N = number of audio buffers being mixed (1/sqrt(N) can also be used in some cases).
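The mixing stage described above might be sketched in C++ like this.  The function names are hypothetical, the buffers are assumed to be non-empty and of equal length, and the linear wet/dry crossfade is just one simple way to implement such an option, not necessarily the one used for the samples below.

```cpp
#include <cstddef>
#include <vector>

// Sums N parallel buffers sample by sample and scales the result by 1/N.
std::vector<double> mixParallel(const std::vector<std::vector<double> >& buffers) {
    const std::size_t length = buffers[0].size();
    const double scale = 1.0 / static_cast<double>(buffers.size());
    std::vector<double> mixed(length, 0.0);
    for (std::size_t b = 0; b < buffers.size(); ++b) {
        for (std::size_t i = 0; i < length; ++i) {
            mixed[i] += scale * buffers[b][i];
        }
    }
    return mixed;
}

// Linear wet/dry mix followed by an output level control.
// wetAmount runs from 0.0 (dry only) to 1.0 (100% wet).
double wetDryMix(double dry, double wet, double wetAmount, double outputLevel) {
    return outputLevel * ((1.0 - wetAmount) * dry + wetAmount * wet);
}
```

A 30% wet mix, for instance, corresponds to a `wetAmount` of 0.3, with the output level lowered if the processed signal comes out too hot.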

Guitar strum, original audio

Guitar strum, single comb filter

In the above example with the single comb filter applied (with a loop time of 29.7 msec) we can hear the distinct echoes/delays of the signal at the beginning.  As the audio decays we can also hear some unnatural pulsation happening (some pulsation is present in the original audio, but the comb filter augments it).

Guitar strum, 4 comb filters & 2 all-pass filters, 100% wet

Adding in all the comb filters and the 2 all-pass networks as per Schroeder’s design diffuses the echoes noticeably and the tail sounds a little more natural as well.  But for a more realistic sound we of course need the dry signal in the mix as well.

Guitar strum, 30% wet mix

It’s worth listening to a more percussive sound to hear the reverb’s effect on it.  Here is a short piano riff and a single comb filter applied to it, and the echo effect is very noticeable and quite disturbing.

Piano riff, original audio

Piano riff, single comb filter

Now applying the reverb in its entirety onto the piano riff with a 30% wet mix results in a more natural reverb.

Piano riff, 30% wet mix

It is, however, not perfect by any means.  We can still hear a slight echo after each attack, and the reverb sound is a little bright and metallic sounding.  As stated at the beginning, the echoes from reverberation lose energy as well as amplitude as they reflect off surfaces and travel through air, and this has not been accounted for in this design.  One solution that was later used to improve on this is to add a simple low-pass filter inside the comb filters.  This will be one of the things I’ll be looking at going forward as well as more elaborate reverb designs that attempt to more realistically simulate natural reverberation.