The Different “Sides” of Convolution

We all know how much convolution is used in DSP, and what a significant part it has in making many effects and analysis techniques a reality.  Often we hear about FFT, or fast convolution, but in actuality these algorithms are only faster than straight convolution with a kernel length (of the impulse response) greater than around 60 – 64 or so.  Anything shorter than that is best handled with normal convolution, which can be implemented using either an input side or output side algorithm.

The input side algorithm is normally the one that we learn first (at least it was for me, and judging from many texts and books on it, it seems to be common).  However, the output side algorithm is perhaps slightly easier to code, and we’ll see why later on.  For this post I’m going to evaluate the effects each algorithm has on the resulting audio signal because each one could have an advantage in some situations over the other.  I’ll also share a little trick that will speed up the convolution process for certain types of filters.

The output side algorithm performs the convolution, as one might expect, from the viewpoint of the output (or the result) of the operation, while the input side algorithm implements convolution from the viewpoint of the input signal.  To look at it another way, recall that convolving an input signal with an impulse response results in an output signal that is the length of the input + the length of the impulse response – 1.  With the input side algorithm, this “extra” bit of length is appended to the resulting signal, and is probably the easiest one to theoretically grasp.  The output side algorithm looks at what points from the input we need in order to get our resulting signal.

The reason why the output side method feels more natural to code is that we often write in terms of the output or result: y(n) = ____ — the output is equal to some kind of expression, for example.  Approaching convolution this way requires input samples on the negative side (x[-2], x[-1], etc.), which of course don’t exist.  To get around that little problem, one common way is to implement a delay line that contains all zeroes at first, then shifting in the input signal as we go, doing the calculating with the impulse response and the delay line instead of direclty with the input.

I’ve been busy redesigning and overhauling my DSP filter class, which has entailed some experimentation with these two convolution methods on FIR filters.  Here we can see visually the difference between them.  First up is the output side convolution of a windowed-sinc FIR filter using a Kaiser window of order 162 on a simple triangle wave (using test oscillators is a good way to visually see what’s happening).

Output side convolution on a triangle wave.

I purposely made the filter kernel (impulse response) fairly long so it would illustrate the delay and shift that’s happening at the start of the signal due to the output side algorithm.  Compare that with the image of the input side algorithm using exactly the same filter and order.

Input side convolution on a triangle wave.

Huge difference.  It’s clearly audible as well in this test signal (samples below).  With the output side algorithm there is a clear popping sound as the audio abruptly starts in contrast to the much smoother beginning of the test signal filtered with the input side algorithm.

Here you can hear the difference (to hear it clearly you may have to download the files and listen in an audio editor to avoid the glitchy start of web browser plug-ins):

Triangle wave filtered with output side

Triangle wave filtered with input side

Now to return to the point about the output side method being perhaps a little more natural to code.  The little wrinkle we face in implementing the input side algorithm is that with block processing (which is so common in DSP), how do we account for the longer output as a result of the convolution?  In many cases we don’t have control over the size of the audio buffers we’re given or have to work with.  The solution is to use the overlap-add method.

It’s really not more difficult than the output side algorithm.  All we have to do is calculate the convolution fully (into our own internal buffer if we have to), save the extra length at the end (which will be equal to the impulse response length), then add that to the start of the next block.  Rinse and repeat.  Here is C++ code that implements both of these convolution methods.

Output side convolution process.

Input side convolution process.

Don’t worry, I didn’t forget to share the trick I mentioned at the beginning.  Many FIR filters have linear phase response, which means that their kernels are symmetrical.  That provides us a great opportunity to eliminate extra calculations that aren’t needed.  So notice in the above code each ‘j’ loop (the kernel or impulse response loop) only traverses half the kernel length, as the value of input[i] * mKernel[j] is the same as input[i] * mKernel[mKernelLength-j-1].  At the end of the kernel’s loop we calculate the midpoint value.

Again, the calculations involving symmetry is perhaps easier to see in the output side algorithm, because we naturally gravitate towards expressions that result in a single output value.  If you work a small convolution problem out on paper, however, it will help you see the input side algorithm and why it works.

Digital Reverberation

In continuing to explore the many areas of digital signal processing, reverb has cropped up many times as an area of great interest, so I’ve decided to dedicate a series of future posts on this topic.  I’m going to start at the beginning, looking at Schroeder’s design, the first digital reverberator solution, and proceed forward looking at how it’s design was improved upon by Moorer, leading eventually to Feedback Delay Networks (FDN) and other types of artificial reverbs.  All of these stages will include actual implementation, with code/algorithms, and possibly some plug-ins as a result.  However, my goal is not to develop any kind of high-end, competetive product at this point, as some commercial reverb algorithms are closely guarded secrets.  Moreover, digital reverb remains as one of the foremost challenges in DSP.  This process will, however, provide greater understanding of digital audio in addition to honing my skills in DSP coding and design.

Reverberation is of course just a dense series of echoes.  There is also a loss of energy in particular frequency ranges that depend on the material the sound bounces off of.  When all the complexities of natural reverb are accounted for, calculations to simulate this reach into the hundreds of billions or more per second!  Human ears cannot fully perceive the full compelxity of natural reverb, however, so this makes the calculations required much more manageable for many reverb designs (convolution is still very computationally expensive, though).

One of the fundamental building blocks of digital reverb is the comb filter, which Schroeder used in his design.  It circulates a signal through a delay line, adding the delayed version, scaled with a constant, g, to the original.

Comb filter design

The constant g is given by the formula:

where tau (t) is the delay time, or loop time, of the comb filter and RVT is the reverb time desired, which is defined as the time it takes for the delayed signal to reach -60dB (considered silence).

When analyzing the impulse response of natural reverberation, however, we see many dense series of echoes that are not equally spaced out with apparently random amplitudes.  Additionally, the echoes become more diffuse as the amplitudes decrease as the delayed signals build up in the space.  This leads to one of the most important properties of good reverb design, which is the diffusion of the delayed signal’s echoes — in other words it would be unnatural to hear individual pulses as the signal becomes reverberated.  Schroeder proposed the use of four comb filters (in parallel) as one of his solutions to this problem, each with it’s own distinct loop time.  To further ensure the diffusion of echoes, the four loop times should be relatively prime, otherwise the delayed signals would match up too frequently in phase to create a pumping or puffing sound, especially noticeable in the decay.

Another important property of reverb is for the decay to be exponential.  This is satisfied by the comb filter, as can be seen in the above diagram, whereby the impulse response will start out at 1 (assuming an impulse at amplitude 1) and then subsequently being scaled by g, then g2, g3, etc.

To further thicken up the sound of his reverberator, Schroeder fed the summed signals from the four comb filters through two all-pass filters in series.  These filters allow all frequencies to pass, but alter the phase of varying frequencies.  Their design is very much like a comb filter but with a feed-forward section, as can be seen below.

All-pass filter design

The two all-pass filters Schroeder uses also have their own unique loop times just as the comb filters. Unlike the comb filters, however, the reverb time specified for the all-pass filters are different because their purpose is to thicken and diffuse the echoes of the signal, not to apply additional reverberation.

Schroeder accompanied his design with suggested values to simulate a concert hall.  These values are given below (source: Dodge & Jerse, “Computer Music”, pg. 301):

Values for Schroeder’s Reverberator, simulating a concert hall

The RVT value of the comb filters is variable and can be specified by the user, but is normally around the order of 1.o second.

The actual implementation of these two filters is fairly straightforward in C++.  The code is given below:

Code implementing a comb filter

Code implementing an all-pass filter

Now let’s look at some audio samples to hear how this all sounds.  All the code was written by me, including implementation of the comb filters and all-pass filters as well as the mix.  Furthermore, I implemented a wet/dry option into the mixing stage as well as an output level due to the fact that the processed audio can increase in levels quite a bit depending on the source audio.  As far as mixing goes, at its most basic it is just adding signals together, but when mixing several audio buffers (as in the four parallel comb filters) it is a good idea to scale each sample by a factor of 1/N, where N = number of audio buffers being mixed ( 1/(sqrt(N) can also be used in some cases).

Guitar strum, original audio

Guitar strum, single comb filter

In the above example with the single comb filter applied (with a loop time of 29.7 msec) we can hear the distinct echoes/delays of the signal at the beginning.  As the audio decays we can also hear some unnatural pulsation happening (some pulsation is present in the original audio, but the comb filter augments it).

Guitar strum, 4 comb filters & 2 all-pass filters, 100% wet

Adding in all the comb filters and the 2 all-pass networks as per Schroeder’s design diffuses the echoes noticeably and the tail sounds a little more natural as well.  But for a more realistic sound we of course need the dry signal in the mix as well.

Guitar strum, 30% wet mix

It’s worth listening to a more percussive sound to hear the reverb’s effect on it.  Here is a short piano riff and a single comb filter applied to it, and the echo effect is very noticeable and quite disturbing.

Piano riff, original audio

Piano riff, single comb filter

Now applying the reverb in its entirety onto the piano riff with a 30% wet mix results in a more natural reverb.

Piano riff, 30% wet mix

It is, however, not perfect by any means.  We can still hear a slight echo after each attack, and the reverb sound is a little bright and metallic sounding.  As stated at the beginning, the echoes from reverberation lose energy as well as amplitude as they reflect off surfaces and travel through air, and this has not been accounted for in this design.  To improve on this, adding in a simple low-pass filter in the comb filters was used as a solution.  This will be one of the things I’ll be looking at going forward as well as more elaborate reverb designs that attempt to more realistically simulate natural reverberation.