Let’s start it off with some music:
Vibrato has always been an essential technique in making music feel more alive, rich, and full of expression. Whether it is string, wind, or brass players in an orchestra, a singer, or a synthesized waveform in an old 8-bit NES game, vibrato ensures those long notes and phrases connect with us in a more meaningful way by giving them character and shape.
Unlike tremolo (which was the subject of the previous blog entry), vibrato uses an LFO to modulate pitch, not amplitude. When generating a waveform through synthesis this is trivial, since we have direct access to the frequency component. With prerecorded audio, however, the vibrato effect is achieved through a modulated variable delay. To better understand this, let’s start by looking at a basic delay effect implemented in C++.
A simple delay works by creating a buffer whose length equals the delay time, initialized to all zeroes. As we process the audio buffer, we copy each incoming sample into the delay buffer while pulling the oldest value out of the delay buffer and mixing it with the original audio. Because the delay buffer starts out full of zeroes, the first pass through it does nothing to the original audio, but once that pass is complete the delay buffer holds earlier samples of the audio, and these are then mixed in, creating the delay. With a delay time of 0.5 seconds (which requires the delay buffer to hold 22050 samples at a sample rate of 44.1 kHz) and a ‘depth’ of 45% or so, the following code would generate a single half-second slap-back delay, or echo, at 45% of the original amplitude:
Adapting this code to create a vibrato effect isn’t too complex, but it does require a few steps that might seem a bit hard to grasp at first. We need to create a variable delay and this requires two pointers to our delay buffer — a writing pointer that will proceed sample by sample as in the basic delay above, and a reading pointer that will be calculated in relation to the writing pointer and modulated by the LFO. The reading position will almost always fall between buffer positions, so interpolation is required to achieve more accurate output. With these points considered, the variable delay code becomes:
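The interpolated read is the part that tends to trip people up, so here is a small sketch of just that step. It assumes a circular delay buffer and plain linear interpolation (other interpolation schemes exist); the function name `readDelayed` and its parameters are hypothetical:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Reads the circular delay buffer `offsetSamples` behind the writing
// position. The reading position usually falls between two slots, so the
// two neighbouring samples are blended by linear interpolation.
float readDelayed(const std::vector<float>& delayLine,
                  std::size_t writePos,
                  double offsetSamples) {
    const double size = static_cast<double>(delayLine.size());
    double readPos = static_cast<double>(writePos) - offsetSamples;
    readPos -= size * std::floor(readPos / size);  // wrap into [0, size)
    if (readPos >= size) readPos = 0.0;            // guard against rounding at the edge

    const std::size_t i0 = static_cast<std::size_t>(readPos); // slot below
    const std::size_t i1 = (i0 + 1) % delayLine.size();       // slot above
    const float frac = static_cast<float>(readPos - static_cast<double>(i0));
    return delayLine[i0] + frac * (delayLine[i1] - delayLine[i0]);
}
```

For a buffer holding `{0, 10, 20, 30}` with the writing position at index 3, an offset of 1.5 samples lands halfway between indices 1 and 2 and returns 15. The offset itself is what the LFO modulates from sample to sample.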
It was here that I first encountered a big roadblock in writing my vibrato effect. Upon testing it on a number of soundfiles, I was getting a moderate amount of distortion, or sample noise, in my output. Having already learned from similar challenges in writing the tremolo effect previously, I was fairly certain this was a new issue I had to tackle. The test that led me to the source of the problem was using a constant delay time in the code above (no modulation by the sine wave) and that produced a clean output. From here, I knew the problem had to lie in how I was calculating the offset using the sine wave modulator. Originally I calculated it like this:
offset = (delay time * sine wave(phase)) * sample rate,
where the phase of the sine wave increments on every sample by 2 * pi * freq / SR (freq being the LFO rate and SR the sample rate). After doing some research (and hard thinking on the matter), it became clear that this was the wrong mathematical operation: multiplying the modulator by the delay time only scales it, whereas we want to move “around” it (i.e. vibrato fluctuates pitch by a small amount around a central pitch). That eventually led me to the following base equation:
offset = (delay time + sine wave(phase) * delay time) * sample rate.
This equation needs a couple more modifications since it isn’t modulating “around” the delay time yet, just adding to it. A depth modifier needs to be included in here as well so that we can change the intensity of the vibrato effect (by modifying the magnitude of the sine wave). The final equation then becomes:
offset = (delay time/2 + (sine wave(phase) * depth) * delay time/2) * sample rate,
which simplifies to:
offset = (delay time/2 * (1 + sine wave(phase) * depth)) * sample rate.
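Written out as code, the simplified equation and the phase increment might look like the sketch below. The function names are my own placeholders, and the delay time is assumed to be in seconds with the result in samples:

```cpp
#include <cmath>

// offset = (delay time/2 * (1 + sine wave(phase) * depth)) * sample rate
// Returns the read offset in samples.
double vibratoOffset(double delayTime, double depth,
                     double phase, double sampleRate) {
    return (delayTime / 2.0) * (1.0 + std::sin(phase) * depth) * sampleRate;
}

// Amount the LFO phase advances on every sample: 2 * pi * freq / SR.
double phaseIncrement(double rateHz, double sampleRate) {
    const double twoPi = 2.0 * std::acos(-1.0);
    return twoPi * rateHz / sampleRate;
}
```

With a depth of 1 the offset sweeps from zero up to the full delay time and back, centered on half the delay time; smaller depths narrow that sweep around the same center. At a depth of zero the offset is constant and no pitch change occurs.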
This finally created the expected output I was after! It’s such a great feeling to solve logical programming challenges! Here is an example of the output with a vibrato rate of 8.6Hz at 32% depth:
One other important element to discuss is the actual delay time used to generate the vibrato effect. I experimented with many values before settling on a delay time of 0.004 seconds, which is the value that we “delay around” using the sine wave. I found that as the values got smaller than 0.004 seconds the sound of the effect degraded, and it actually produced some sample noise because the delay buffer became so small (nearing as few as 30 samples). As the delay time increases, the pitch of the audio begins to vary so much that we lose almost all sense of pitch in the original audio.
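To put rough numbers on this, here is a small sketch based on the general property of a modulated delay: the pitch is scaled by one minus the rate of change of the delay. Differentiating the offset equation above (in seconds) gives a peak deviation of pi * rate * delay time * depth. The function names are hypothetical, and this is an estimate rather than a measurement of my output files:

```cpp
#include <cmath>

// Peak fractional pitch deviation for
//   delay(t) = delayTime/2 * (1 + depth * sin(2 * pi * rateHz * t)).
// The derivative of delay(t) peaks at pi * rateHz * delayTime * depth.
double peakPitchDeviation(double rateHz, double delayTime, double depth) {
    const double pi = std::acos(-1.0);
    return pi * rateHz * delayTime * depth;
}

// Converts a frequency ratio (e.g. 1.035) into cents.
double ratioToCents(double ratio) {
    return 1200.0 * std::log2(ratio);
}
```

For a rate of 8.6 Hz at 32% depth, a delay time of 0.004 seconds works out to a swing of roughly 3.5% either side of the original pitch, which is a little over half a semitone. Keeping the same rate and depth but raising the delay time to 0.03 seconds pushes that to roughly 26%, which is why longer delay times smear the pitch so heavily.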
This is not necessarily a bad thing. This opens up vibrato to be used as a sound effect rather than purely a musical expression tool. By setting the delay time to 0.03 seconds for example, the vibrato effect generates an output not unlike a record-scratch or something resembling flanging (which is actually also achieved through the use of variable delay). See if you can recognize the source music in this sample:
Of course a more subtle effect is often desired for musical purposes and this is controlled by the depth modifier. Here is a sample of a more subtle vibrato effect (back to the delay time of 0.004 seconds):
One final thing to mention about applying the vibrato effect to prerecorded audio is that it can distort the sound somewhat when the audio is a fully realized composition. The vibrato is of course being applied to the entire file (i.e. every instrument, every sound). A more practical application would be to use vibrato on a single instrument source; a flute for example (please excuse my horrible flute playing):
Last, but not least, it is important to consider the implementation and design of the code that applies the effect. I have continued to code these effects as C++ classes using object-oriented design as it makes the implementation of them very easy and efficient. For example, calling the effect in the main loop of the program is as trivial as:
Here we can see that first we read sample data in from the soundfile and store it in ‘buffer’. Then the ‘buffer’ is passed, along with the LFO modulator, into the process that applies the variable delay (vibrato in this case), and this is then written to the output soundfile. The LFO modulator used for the vibrato is just a new instance of the oscillator class I developed for the tremolo effect previously. I just initialize a new instance of it for use in the vibrato effect, and done!
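The flow just described can be sketched as follows. Everything here is illustrative: `Oscillator`, `Vibrato` and `applyVibrato` are stand-in names, the soundfile reading and writing is replaced by in-memory vectors so the sketch stays self-contained, and the output is assumed to be the delayed signal only (no dry signal mixed in):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

const double kTwoPi = 2.0 * std::acos(-1.0);

// Stand-in sine LFO; one call to next() per audio sample.
class Oscillator {
public:
    Oscillator(double freqHz, double sampleRate)
        : inc_(kTwoPi * freqHz / sampleRate) {}
    double next() {
        const double value = std::sin(phase_);
        phase_ += inc_;
        if (phase_ >= kTwoPi) phase_ -= kTwoPi;
        return value;
    }
private:
    double inc_;
    double phase_ = 0.0;
};

// Variable delay whose read offset follows
// offset = (delay time/2 * (1 + sine * depth)) * sample rate.
class Vibrato {
public:
    Vibrato(double delayTime, double depth, double sampleRate)
        : line_(static_cast<std::size_t>(delayTime * sampleRate) + 2, 0.0f),
          halfDelay_(0.5 * delayTime * sampleRate), depth_(depth) {}

    void process(std::vector<float>& buffer, Oscillator& lfo) {
        const std::size_t n = line_.size();
        for (float& s : buffer) {
            line_[write_] = s;                                   // writing pointer
            const double offset = halfDelay_ * (1.0 + lfo.next() * depth_);
            double pos = static_cast<double>(write_) - offset;   // reading pointer
            if (pos < 0.0) pos += static_cast<double>(n);
            const std::size_t i0 = static_cast<std::size_t>(pos) % n;
            const std::size_t i1 = (i0 + 1) % n;
            const float frac = static_cast<float>(pos - std::floor(pos));
            s = line_[i0] + frac * (line_[i1] - line_[i0]);      // linear interpolation
            write_ = (write_ + 1) % n;
        }
    }
private:
    std::vector<float> line_;
    double halfDelay_;
    double depth_;
    std::size_t write_ = 0;
};

// Block-by-block loop: "read" a buffer, apply the effect, "write" it out.
std::vector<float> applyVibrato(const std::vector<float>& input, double sampleRate,
                                double rateHz, double delayTime, double depth,
                                std::size_t blockSize = 1024) {
    Oscillator lfo(rateHz, sampleRate);
    Vibrato vibrato(delayTime, depth, sampleRate);
    std::vector<float> output;
    output.reserve(input.size());
    for (std::size_t start = 0; start < input.size(); start += blockSize) {
        const std::size_t end = std::min(start + blockSize, input.size());
        std::vector<float> buffer(input.begin() + static_cast<std::ptrdiff_t>(start),
                                  input.begin() + static_cast<std::ptrdiff_t>(end));
        vibrato.process(buffer, lfo);
        output.insert(output.end(), buffer.begin(), buffer.end());
    }
    return output;
}
```

Since the LFO is passed in by reference, the same oscillator class can drive any other effect without the effect needing to know how the modulator is generated.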
This is an example of the benefits of object-oriented design and how adaptable it is. We’ll be seeing much more of this to come as well. For example, it would require a few trivial code changes to set up multi-tap delays, each with their own depth, and even to incorporate filters into the delays once I get into developing them. And finally, allowing the use of envelopes to further shape these effects will be an important step to be taken in the future. With so many tantalizing possibilities, there’s no stopping now!