Tag Archives: VST

AdVerb: Building a Reverb Plug-In Using Modulating Comb Filters

Some time ago, I began exploring the early reverb algorithms of Schroeder and Moorer, whose work dates back to the 1960s and 70s respectively.  Their designs and theories still inform the making of algorithmic reverbs today.  Recently I took it upon myself to continue experimenting with the Moorer design I left off with in an earlier post.  This resulted in the complete reverb plug-in “AdVerb”, which is available for free in Downloads.  Let me share what went into designing and implementing this effect.

One of the foremost challenges in basing a reverb design on Schroeder or Moorer is that it tends to sound a little metallic: with the number of comb filters they suggest, the echo density doesn’t build up quickly or densely enough.  The series all-pass filters that come after the comb filter section help to diffuse the reverb tail, but I found that the delaying all-pass filters added a little metallic sound of their own.  One obvious way of overcoming this is to add more comb filters (today’s computers can certainly handle it).  More importantly, however, the delay times of the comb filters need to be mutually prime so that their frequency responses don’t overlap, which would result in increased beating in the reverb tail.

To arrive at the values for the 8 comb filters I’m using, I wrote a simple little script that calculated the greatest common divisor between every pair of the delay times I chose and made sure the results were all 1.  This required a bit of tweaking of the numbers; as you can imagine, finding 8 mutually prime values is not as easy as it sounds, especially when trying to keep the range between them small.  It’s not as important for the two all-pass filters to be mutually prime because they are in series, not in parallel like the comb filters.
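For the curious, such a check takes only a few lines.  Here is a minimal sketch of the kind of script I mean (the delay values below are placeholders, not the ones AdVerb actually uses):

#include <numeric>
#include <cstdio>

int main() {
    // Candidate comb filter delay lengths in samples (placeholders).
    int delays[8] = { 1051, 1093, 1117, 1181, 1223, 1259, 1301, 1361 };

    // Mutually prime means every pair has a greatest common divisor of 1.
    for (int i = 0; i < 8; ++i)
        for (int j = i + 1; j < 8; ++j)
            if (std::gcd(delays[i], delays[j]) != 1)
                printf("%d and %d share a common factor\n", delays[i], delays[j]);
    return 0;
}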

I also discovered, after a number of tests, that the tap delay used to generate the early reflections (based on Moorer’s design) was causing some problems in my sound.  I’m still a bit unsure as to why (it could be poorly chosen tap delay times or something to do with mixing), but it was enough that I decided to discard the tap delay network and focus on comb filters and all-pass filters.  It was then that I took an idea from Dattorro and Frenette, who both showed how modulated comb/all-pass filters can help smear the echo density and add warmth to the reverb.  Dattorro is responsible for the well-known plate reverbs that use modulating all-pass filters in series.

The idea behind a modulated delay line is that an oscillator (usually a low-frequency sine wave) modulates the delay time at a given rate and depth.  This is actually the basis for chorusing and flanging effects.  In a reverb, however, the values need to be kept very small so that the chorusing effect does not become audible.

I had fun experimenting with these modulated delay lines, and eventually decided to modulate one of the all-pass filters as well and give control of it to the user, which opens up more fun and crazy ways to use this plug-in.  Let’s take a look at the modulated all-pass filter (the modulated comb filter is very similar).  We already know what an all-pass filter looks like, so here’s just the modulated delay line:

Modulated all-pass filter.

The oscillator modulates the read position within the delay line, and we interpolate between neighboring samples to arrive at the actual delay value.  In code it looks like this:

double offset, read_offset, fraction, next;
size_t read_pos;

// The oscillator modulates the offset around half the delay length.
offset = (delay_length / 2.) * (1. + sin(phase) * depth);
phase += phase_incr;
if (phase > TWO_PI) phase -= TWO_PI;
if (offset > delay_length) offset = delay_length;

// Current write position (in samples) minus the modulated offset gives
// the fractional read position, wrapped to the buffer's length.
read_offset = ((size_t)delay_buffer->p - (size_t)delay_buffer->p_head) / sizeof(double) - offset;
if (read_offset < 0) {
    read_offset = read_offset + delay_length;
} else if (read_offset >= delay_length) {  // >= keeps read_pos inside the buffer
    read_offset = read_offset - delay_length;
}

// Split the read position into an integer index and a fractional part.
read_pos = (size_t)read_offset;
fraction = read_offset - read_pos;
if (read_pos != delay_length - 1) {
    next = *(delay_buffer->p_head + read_pos + 1);
} else {
    next = *delay_buffer->p_head;  // wrap around to the head of the buffer
}

// Linearly interpolate between the two neighboring samples.
return *(delay_buffer->p_head + read_pos) + fraction * (next - *(delay_buffer->p_head + read_pos));

In case that looks a little daunting, we’ll step through the C code (apologies for the pointer arithmetic!).  At the top we calculate the offset using the delay length in samples as our base point.  The following lines are easily seen as incrementing and wrapping the phase of the oscillator as well as capping the offset to the delay length.

The next line calculates the current position in the buffer from the current position pointer, p, and the buffer head, p_head.  This is accomplished by casting the pointer addresses to integral values and dividing by the size of the data type of each buffer element.  The read_offset position determines where in the delay buffer we read from, so it needs to be wrapped within the buffer’s length as well.

The rest is simply linear interpolation (albeit with some pointer arithmetic: delay_buffer->p_head + read_pos + 1 is equivalent to delay_buffer[read_pos + 1]).  Once we have our modulated delay value, we can finish processing the all-pass filter:

delay_val = get_modulated_delay_value(allpass_filter);

// don't write the modulated delay_val into the buffer, only use it for the output sample
*delay_buffer->p = sample_in + (*delay_buffer->p * allpass_filter->g);
sample_out = delay_val - (allpass_filter->g * sample_in);

The final topology of the reverb is given below:

Topology of the AdVerb plug-in.

The pre-delay is implemented by a simple delay line, and the low-pass filters are of the one-pole IIR variety.  Putting the LPFs inside the comb filters’ feedback loops simulates the absorption of energy that sound undergoes as it comes in contact with surfaces and travels through air.  This factor can be controlled with a damping parameter in the plug-in.
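As a rough sketch of what one of these damped comb filters can look like per sample (my own variable names, not necessarily AdVerb’s internals):

// One sample of a feedback comb filter with a one-pole low-pass in the loop.
// 'damping' in [0, 1): higher values absorb more high frequencies.
double comb_process(double in, double *buf, size_t length, size_t *pos,
                    double feedback, double damping, double *lpf_state)
{
    double out = buf[*pos];

    // One-pole low-pass filters the delayed signal before it is fed back.
    *lpf_state = out * (1. - damping) + *lpf_state * damping;

    buf[*pos] = in + *lpf_state * feedback;
    *pos = (*pos + 1) % length;
    return out;
}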

The moving-average filter is there for an extra bit of high-frequency roll-off, and I chose it because it is an FIR filter with linear phase, so it won’t add further disturbance to the modulated samples entering it.  The last (normal) all-pass filter in the series serves to add extra diffusion to the reverb tail.

Here are some short sound samples using a selection of presets included in the plug-in:

Piano, “Medium Room” preset

The preceding sample demonstrates a normal reverb setting.  Following are a few samples that demonstrate a couple of subtle and not-so-subtle effects:

Piano, “Make it Vintage” preset

Piano, “Bad Grammar” preset

Flute, “Shimmering Tail” preset

Feel free to get in touch regarding any questions or comments on “AdVerb”.


New Plug-In: Dyner (ring modulator)

Finally I have gotten the chance to polish up and release a new plug-in.  This one is a ring modulator called Dyner (ring modulation is basically a type of heterodyning).  I’ve spoken about this effect in a previous blog post, but just to quickly summarize, ring modulation is accomplished by multiplying a modulator signal (usually an oscillator of some kind) with the carrier signal (the audio).  By itself it’s a very easy and straightforward operation, but unless you’re only modulating with a sine wave, aliasing becomes a big issue to deal with.  Many free ring modulators that I’ve tried do not tackle this problem in any way, so I was determined to do so in Dyner.

This meant writing my own resampler to handle the oversampling of the operation.  As I outlined in previous posts on resampling, making the process both accurate and efficient was no mean feat.  But oversampling alone, without addressing the oscillator itself, would require a very strong anti-aliasing filter, severely impacting the CPU load of the effect.  The next task was therefore to create fully band-limited oscillators.  In this plug-in I’m using the wavetable approach, with each oscillator drawing on 10 different tables, each calculated with the number of harmonics appropriate for its octave range so that it doesn’t alias.  With the oscillators themselves obeying the sampling theorem, oversampling by 2X with a light filter is so far giving good results.
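To sketch the table-selection idea (the layout and names here are illustrative, not Dyner’s actual code): each table covers roughly one octave, and the number of harmonics baked into a table is chosen so that its highest harmonic stays below the Nyquist frequency for that range.

#include <cmath>

#define NUM_TABLES 10
#define TABLE_SIZE 2048

// tables[i] covers one octave; higher octaves contain fewer harmonics.
extern double tables[NUM_TABLES][TABLE_SIZE];
extern double base_freq;  // frequency at which table 0 starts

// Pick the wavetable whose harmonic content will not alias at this frequency.
const double *select_table(double freq)
{
    int octave = (int)log2(freq / base_freq);
    if (octave < 0) octave = 0;
    if (octave >= NUM_TABLES) octave = NUM_TABLES - 1;
    return tables[octave];
}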

The plug-in (VST 2.4, for both Mac and Windows, 32-bit & 64-bit) can be downloaded for free in Downloads.

Dyner GUI

Features:

  • Five oscillators (sine, sawtooth up, sawtooth down, triangle, square)
  • Oscillators are fully band-limited and FFT generated for optimum performance
  • 2X oversampling to avoid aliasing effects
  • Smooth knob controls (mouse/mouse wheel, type in value manually) capable of fine granularity for parameter settings

Requires:

  • VST 2.4 capable host
  • SSE3 processor support
  • Mac: OS X 10.6.8+
  • Windows: Only tested on Windows 7 so far, but should be compatible with Vista/XP at least

Building a Comb Filter in Audio Units

Now that I am looking into and learning more about digital reverberation, including its implementation and theory, I decided to build a simple comb filter plug-in using Audio Units.  Previously all the plug-in work I’ve done has been in VST, but I was eager to learn another side of plug-in development, hence Apple’s Audio Units.  It is, truth be told, very similar to VST development in that you derive your plug-in as a subclass of Audio Units’ AUEffectBase class, inheriting and overriding functions according to the needs of your effect.  There are some notable differences, however, that are worth pointing out.  In addition, I’ve made the plug-in available for download on the Downloads page.

The structure of an Audio Unit differs from VST in that within the main interface of the plug-in, a kernel object that is derived from AUKernelBase handles the actual DSP processing.  The outer interface as subclassed from AUEffectBase handles the view, parameters, and communication with the host.  What’s interesting about this method is that the Audio Unit automatically handles multichannel audio streams by initializing new kernels.  This means that the code you write within the Process() function of the kernel object is written as if to handle mono audio data.  When the plug-in detects stereo data it simply initializes another kernel to process the additional channel.  For n-to-n channel effects, this works well.  Naturally options are available for effects or instruments that require n-to-m channel output.

Another benefit of this structure is the generally fast load times of Audio Unit plug-ins.  The plug-in’s constructor, invoked during instantiation, should not contain any code that requires heavy lifting.  Instead, such code should be placed within the kernel’s constructor so that any heavy processing only occurs when the user is ready for it.  Acquiring the delay buffer in the comb filter happens in the kernel’s constructor, as indicated below, while the plug-in’s constructor only sets up the initial parameter values and presets.

Comb Filter kernel constructor

Comb Filter base constructor
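A minimal sketch of the kernel constructor idea looks something like this (the member names and the maximum delay constant are my own, and it assumes the Core Audio utility classes):

// Kernel constructor: the heavy lifting (buffer allocation) happens here,
// not in the plug-in's constructor.
CombFilterKernel::CombFilterKernel(AUEffectBase *inAudioUnit)
    : AUKernelBase(inAudioUnit), mPos(0)
{
    // Allocate the delay buffer at its maximum possible size up front.
    mMaxBufSize = (UInt32)(kMaxDelayTimeSecs * GetSampleRate());
    mDelayBuf = new Float32[mMaxBufSize];
    memset(mDelayBuf, 0, mMaxBufSize * sizeof(Float32));
}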

Parameters in Audio Units also differ from VST in that they are not forced to be floating point values that the programmer is responsible for mapping for display in the UI.  Audio Units come with built-in categories for parameters, letting you declare minimum and maximum values, along with a default value used when the plug-in is instantiated.

Declaring parameters in GetParameterInfo()
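A sketch of what declaring the delay time parameter can look like (the parameter IDs, ranges, and name string are illustrative):

ComponentResult CombFilter::GetParameterInfo(AudioUnitScope inScope,
                                             AudioUnitParameterID inParameterID,
                                             AudioUnitParameterInfo &outParameterInfo)
{
    if (inScope != kAudioUnitScope_Global) return kAudioUnitErr_InvalidScope;

    outParameterInfo.flags = kAudioUnitParameterFlag_IsWritable
                           | kAudioUnitParameterFlag_IsReadable;

    switch (inParameterID) {
        case kParam_DelayTime:
            AUBase::FillInParameterName(outParameterInfo, kParamName_DelayTime, false);
            outParameterInfo.unit = kAudioUnitParameterUnit_Milliseconds;
            outParameterInfo.minValue = 0.1;
            outParameterInfo.maxValue = 100.;
            outParameterInfo.defaultValue = 25.;
            break;
        default:
            return kAudioUnitErr_InvalidParameter;
    }
    return noErr;
}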

Like VST, Audio Units contains a function called Reset() that is called whenever the user starts or stops playback.  This is where you would clear buffers or reset any variables needed to return the plug-in to an initialized state to avoid any clicks, pops, or artifacts when playback is resumed.

Performing clean-up in Reset()
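In sketch form, the kernel’s Reset() just zeroes the delay buffer and rewinds the cursor (again assuming the member names from above):

// Return the kernel to an initialized state between playback sessions.
void CombFilterKernel::Reset()
{
    memset(mDelayBuf, 0, mMaxBufSize * sizeof(Float32));
    mPos = 0;
}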

Because a comb filter is essentially a form of delay, a circular buffer (mDelayBuf) is used to hold the delayed audio samples.  In real-time processing where the delay time can change, however, this has repercussions on the size of the buffer used, as it would normally be allocated to the exact number of samples needed to hold the data.  Rather than deallocating and reallocating the delay buffer every time the delay time changes (requiring repeated memory allocations), I allocate the buffer to its maximum possible size as given by the maximum value allowed for the delay time.  As the delay time changes, I keep track of the buffer’s effective size with the curBufSize variable, and it is this value that I use to wrap around the buffer’s cursor position (mPos).  This happens within the Process() function.

Comb Filter’s Process() function
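A simplified sketch of that Process() logic (the parameter IDs and the exact filter arithmetic are my own; the real plug-in may differ):

void CombFilterKernel::Process(const Float32 *inSourceP, Float32 *inDestP,
                               UInt32 inFramesToProcess, UInt32 inNumChannels,
                               bool &ioSilence)
{
    // The delay time may have changed since the last block.
    Float32 delayMs = GetParameter(kParam_DelayTime);
    UInt32 curBufSize = (UInt32)(delayMs / 1000. * GetSampleRate());
    if (mPos >= curBufSize) mPos = 0;

    Float32 feedback = GetParameter(kParam_Feedback);

    for (UInt32 i = 0; i < inFramesToProcess; ++i) {
        Float32 delayed = mDelayBuf[mPos];
        inDestP[i] = inSourceP[i] + delayed;                  // feedforward sum
        mDelayBuf[mPos] = inSourceP[i] + delayed * feedback;  // feedback path
        if (++mPos >= curBufSize) mPos = 0;
    }
    ioSilence = false;
}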

Every time Process() is called (which is every time the host sends a new block of samples to the plug-in), it updates the current size of the buffer and checks that mPos does not exceed it.  The unfortunate consequence of varying the delay time of an effect like this is that changing it in real time produces pops and artifacts: samples are skipped over or repeated, and the resulting discontinuities are audible.  This could be remedied by implementing the comb filter as a variable delay, using interpolation to fill in the gaps when the delay time changes in real time.  As it stands, however, the delay time is not practically suited for automation.

Yet another distinction of Audio Units is the requirement that they be validated before they are usable in a host.  Audio Units are managed by OS X’s Component Manager, and this is where hosts check for Audio Unit plug-ins.  To validate an Audio Unit, a tool called “auval” is used.  This method has both pros and cons.  The testing procedure helps ensure that a plug-in behaves well in a host: it shouldn’t cause crashes or leak memory.  While I doubt the method is foolproof, it is definitely useful for making sure your plug-in is robust.

Correction: Audio Units no longer use the Component Manager in OS X 10.7+. Here is a technical note from Apple on adapting to the new AUPlugIn entry point.

The downside is that some hosts, especially Logic, can be really picky about which plug-ins they accept.  I had problems loading the Comb Filter plug-in for the simple reason that version numbers didn’t match (since I was going back and forth between debug and release builds), so it failed Logic’s validation process.  To remedy this, I had to clear the plug-in from its location in /Library/Audio/Plug-Ins/Components and then, after reinstalling it, open the AU Manager in Logic to force it to check the new version.  This got to be a little frustrating after having to add and remove versions of the plug-in for testing, especially since it passed successfully in auval.  Fortunately it is all up and running now, though!

Comb Filter plug-in in Logic 8

Finally, I’ll end this post with some examples of me “monkey-ing” around with the plug-in in Logic 8, using some of the factory presets I built into it.

Comb Filter, metallic ring preset

Comb Filter, light delay preset

Comb Filter, wax comb preset

The Making of a Plug-In: Part 5 (Beta & new features)

In this post I’m going to discuss two new features I have added to the Match Envelope plug-in that I’m pretty excited about.  And with some additional bug fixing, it’s in a good workable state, so I can offer up the plug-in as a Beta version.

The first of the new features is an option to invert the envelope, so that instead of matching the source audio you extract the envelope from, the resulting audio is opposite in shape.  Of course this can be further tweaked with the “match strength” and “gain” parameters.  The actual process is very simple: take the interpolated value (from cubic interpolation) ‘ival’ and subtract it from 1 (digital waveforms in floating point representation all have sample values between -1 and 1).  As simple as this was, it introduced a bug that took a long time to find.

Occasionally I would get artifacts after applying the process when inverting the envelope, and I discovered that the resulting interpolated value was nan in these cases (nan = not a number).  When the source audio consisted of sharp attacks, and thus sharp rises in the waveform, the interpolated value exceeded 1 by a small amount.  Then, when it gets to this point in the formula for calculating match strength:

pow(ival, matchStr),

it will result in an imaginary number when matchStr is around 0.5.  In essence, this part of the formula ends up trying to square-root a negative number.  The easy fix was to take the absolute value of ‘ival’, so that pow(fabs(ival), matchStr) will never result in nan.  This is a better solution than simply flooring ival to 0 when it is negative, because that would actually alter the audio slightly by messing with the interpolation.
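Put together, the inverted-envelope path looks something like this (a sketch; ‘ival’ and ‘matchStr’ mirror the post, the rest is my own):

#include <math.h>

// 'ival' is the cubic-interpolated envelope value, 'matchStr' the match strength.
double envelope_scale(double ival, double matchStr, int invert)
{
    if (invert)
        ival = 1. - ival;  // sharp attacks can push ival slightly above 1,
                           // leaving this negative...

    // ...so take the absolute value: pow() returns nan for a negative
    // base with a fractional exponent.
    return pow(fabs(ival), matchStr);
}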

The second feature I added is a “window offset” parameter, which shifts the envelope left or right.  Oddly enough, this one took less time to implement and required less testing/fixing than the invert feature even though it sounds more complex to code.  In fact, it’s pretty simple as well.  Similar to a circular buffer, instead of shifting the elements around in the buffers that contain the envelope data, I just offset the cursor that points to the location in the buffer.

If I want to shift the envelope to the left (so that the result anticipates the source audio’s envelope), the cursor needs to be offset by a positive number.  If I want to shift the envelope to the right (emulating a delay of the envelope), the cursor needs to be offset by a negative number.  This may seem a bit unintuitive at first, but consider how the placement of the cursor affects how the processing begins: if it is negatively offset, it will be delayed in processing the envelope data, thus shifting the envelope to the right.
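In sketch form (my own names; the envelope buffer holds numWindows values):

// Shift the envelope by offsetting the read cursor, circular-buffer style.
// A positive offset shifts the envelope left (anticipating the source);
// a negative offset shifts it right (delaying it).
long wrap_cursor(long pos, long offset, long numWindows)
{
    long cursor = (pos + offset) % numWindows;
    if (cursor < 0) cursor += numWindows;  // wrap negative offsets around
    return cursor;
}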

Before moving on, here is a new video showcasing these new features, this time with some slightly more exciting audio for demonstration.

Lastly, in terms of bug fixing, I had to deal with an issue that arose in Soundforge because it does not save the state of a plug-in once it’s invoked.  In Audacity, by contrast, an opened plug-in is not “destroyed” until the program exits; it only switches to a suspended state while inactive.  This means that a plug-in’s constructor is called only once, preserving the state of parameters/variables between uses as long as the host isn’t closed.  Soundforge does not do this, and that caused problems with parameter and variable reinitialization.

Fortunately a fix was found, but the episode has further reaffirmed that VST is not really built for offline processing.  While we can certainly coax it into it, I’ve hit several limitations in what I can accomplish within the bounds of the VST SDK, as well as inadequate host support for the offline capabilities it does provide.

As such, this might be one of the last entries in this particular making-of series on the Match Envelope plug-in.  I have learned a ton along the way, and through sticking it out even when it became clear that this effect is probably better suited to a standalone app or command-line program, where I would have had much more control over things.

I look very much forward to developing my next plug-in, however, which will most assuredly be a real-time process of some kind, and I am excited to learn and to tackle the challenges that await!  Without further ado, here are links to the beta of the Match Envelope VST plug-in:

Match Envelope beta v0.9.2.23 — Mac

Match Envelope beta v0.9.2.23 — Win

Cheers!

The Making of a Plug-In: Part 4 (Alpha Version)

After all the work that’s gone into this plug-in, it’s pretty darn cool to see it up and running in a host program!  I’ve been messing around with it in Audacity, trying out different parameter settings and seeing it all come together.  There’s more work to be done to take care of some additional issues that may crop up, but for the most part I have a working alpha version of the Match Envelope plug-in.  I will likely graduate it to beta once I take care of these issues, which isn’t far off, but in the meantime I am providing the alpha version for anyone interested to give it a shot (links at the end of this post).  I also recorded a short video demonstrating the basic functionality of the plug-in (again, these particular audio files were chosen to illustrate the effect visually):

One of the primary issues that I need to address is differing sample rates between the source and destination audio.  This is something I have yet to try out in code, but I expect some sort of scaling of certain parameters will solve the problem rather than forbidding mixed sample rates (which would be a bit of a pain).

If, for example, we choose a window size of 30ms with the source audio at a sample rate of 44.1kHz, it will contain 1323 samples per window.  If the destination audio’s sample rate is at 48kHz, however, it will contain 1440 samples per window.  There can thus quite easily be a mismatch in the number of windows in an envelope profile between the two audio streams as well as window boundaries not matching up, which will cause the audio files to not sync or match properly.

This brings up another related issue, evident in the UI.  The Envelope Extractor contains the parameter for window duration, but what if the user selects a different value for the source envelope and the destination?  Right now it results in an incorrect match.  Should this be forbidden, or perhaps even ignored by forcing the same window duration on both envelopes?  Or perhaps there is a way to turn it into a feature.  I am undecided on this at the moment and need to do some testing and exploring of various options before addressing this issue.

In addition, I will have to decide on whether implementing user-saved programs is of any use in this plug-in.  Currently this functionality is not supported, but I think it would be a good idea to include some obvious presets, and allow users to save their own.

Other than that, I am excited to share the first distributable version of this plug-in!  The Mac OS X version has been tested on 10.6.8 and 10.8 in Audacity, and the Windows version using Windows 7 in both Audacity and Soundforge.  If you’re interested, give it a try, and though you’re certainly not obligated to, feedback is welcome:

christian@christianfloisand.com

Edit: Beta version available in Downloads

The Making of a Plug-In: Part 3 (Solving the UI Issue)

I’m both happy and relieved that progress on the Match Envelope plug-in is proving successful (so far, anyway)!  It’s up and running, albeit in skeleton form, in Audacity (both Mac and Windows) and Soundforge (Windows).  As I was expecting, it hasn’t been without its fair share of challenges, and one of the biggest has been dealing with the UI — how will the user interact with the plug-in efficiently given the inherent limitations of the interface?

The crux of the problem stems from the offline-only capability of the Match Envelope plug-in.  Similar to processes like normalization, where the entire audio buffer needs to be scanned to determine its peak before a second pass applies the gain, I need to scan the entire audio buffer (or at least as much as the user has selected) to get the envelope profile before applying it onto a different audio buffer.

This part of the challenge I foresaw as I began development.  I knew of VST’s offline features, and I planned to explore them to solve some of the interface difficulties I knew I would encounter.  What I didn’t count on was that host programs widely do not support VST’s offline functions; in fact, Steinberg has all but removed the example source code for offline plug-ins from the 2.4 SDK (and I’m not up to speed on VST3 as of yet).  Thus I have been forced to use the normal VST SDK functions to handle my plug-in.

So here is the root cause of perhaps the main challenge I had to deal with: the host program that invokes the plug-in is responsible for sending the audio buffer in blocks to the processing function, which is the only place I have access to the audio stream.  The function prototype looks like this:

void processReplacing (float **inputs, float **outputs, VstInt32 sampleFrames)

‘inputs’ contains the actual audio sample data that the host has sent to the plug-in, ‘outputs’ is where, after processing, the plug-in places the modified audio, and ‘sampleFrames’ is the number of samples (block size) in the audio sample data.  As I mentioned earlier, not only do I need to scan the audio buffer first to acquire the envelope profile, I need to divide the audio data into windows whose size is determined by the user.  It’s pretty obvious that the number of samples in the window size will not equal the number of samples in sampleFrames (at least not 99.99998% of the time), effectively complicating the implementation of this function three-fold.

How should I handle cases where the window size is less than sampleFrames?  More than sampleFrames?  More than double sampleFrames?  Complicating matters further is that different hosts will pass different block sizes in for sampleFrames, and there is no way to tell exactly what it will be until processReplacing() is invoked.  Here is the pseudocode I used to tackle this problem:
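In simplified form (my reconstruction; the actual code handles large windows separately, as described below):

// One call of processReplacing(): 'leftover' counts how many samples of the
// current window were already consumed in previous blocks.
int pos = 0;
while (pos < sampleFrames) {
    // How many samples of the current window fall inside this block?
    int remaining = windowSize - leftover;
    int chunk = (remaining < sampleFrames - pos) ? remaining : sampleFrames - pos;

    accumulate(inputs[0] + pos, chunk);  // running sum (or peak) of the window
    pos += chunk;
    leftover += chunk;

    if (leftover == windowSize) {        // window complete: store its value
        envelope[envIndex++] = finishWindow();  // average or peak, per user setting
        leftover = 0;
    }
}
// A partial window carries over into the next processReplacing() call.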

The code determines how many windows it can process in any given call of processReplacing() given sampleFrames and windowSize, storing leftover samples in a variable that is carried over into the next call.  Once the end of a window is reached, the values copied from the audio buffer (our source envelope) are averaged together, or the peak is found, whichever the user has specified, and that value is stored in the envelope data buffer.  The reasoning behind handling large windows separately from small ones is to avoid a conditional test on every sample processed to determine whether the end of the window has been reached.

Once this part of the plug-in began to take shape, another problem cropped up.  The plug-in requires three steps (one is optional) taken by the user in order to use it:

  1. Acquire source envelope profile from an audio track,
  2. Acquire the destination audio’s envelope profile to use the match % parameter (optional),
  3. Select the audio to apply the envelope profile onto and process.

It became clear that, since I was not using VST’s offline capabilities, the plug-in would need to be opened and reopened 2 – 3 times to make this work.  This isn’t exactly ideal and wasn’t what I had in mind for the interface, but the upside is that it’s been a huge learning experience.  As such, I decided to split the Match Envelope plug-in into two halves: the Envelope Extractor and the Envelope Matcher.

I felt this was a good way to go because it separates two distinct elements of the plug-in and clarifies which parameters belong with which process; the match % and gain parameters, for instance, have no effect on the actual extraction of the envelope profile, only on the processing applied to the destination audio.  I, like many others I assume, enjoy fiddling around with parameters and settings in plug-ins, and it can get very frustrating when they have no apparent effect; that creates confusion and possibly the impression of badly designed software.

To communicate between the two halves, I implemented a system of writing the envelope data extracted to a temporary binary file that is read by the Envelope Matcher half in order to process the envelope, and this has proven to work very well.  In debug mode I am writing a lot of data out to temporary debug files in order to monitor what the plug-in is doing and how all the calculations are being done.
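In sketch form, the hand-off between the two halves might look like this (the file name and layout are illustrative):

#include <stdio.h>

// Envelope Extractor: dump the envelope profile to a temporary binary file.
void write_envelope(const double *env, int numWindows, int windowSize)
{
    FILE *f = fopen("matchenv_profile.tmp", "wb");
    if (!f) return;
    fwrite(&numWindows, sizeof(int), 1, f);
    fwrite(&windowSize, sizeof(int), 1, f);
    fwrite(env, sizeof(double), numWindows, f);
    fclose(f);
}

// Envelope Matcher: read the profile back before processing.
int read_envelope(double *env, int maxWindows, int *windowSize)
{
    FILE *f = fopen("matchenv_profile.tmp", "rb");
    if (!f) return 0;
    int numWindows = 0;
    fread(&numWindows, sizeof(int), 1, f);
    fread(windowSize, sizeof(int), 1, f);
    if (numWindows > maxWindows) numWindows = maxWindows;
    fread(env, sizeof(double), numWindows, f);
    fclose(f);
    return numWindows;
}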

From Envelope Extractor:

From Envelope Matcher:

Some of these non-ideal interface features I plan on tackling with a custom GUI, which offers much more flexibility than the extremely limited default UI.  Regardless, I’m excited that I’ve made it this far and am very close to having a working version of this plug-in up and running on at least two hosts (Adobe Audition also supports VST and, as far as I know, offline processing, but I have not been able to test it as I don’t own it yet).

After this is done, I do plan on exploring other plug-in types to compare and contrast features and flexibility (AU, RTAS, etc.), and I may find a better solution for the interface. Of course, the plug-in could work as a standalone app where I have total control over the UI and functionality, but it would lack the benefit of doing processing right from within the host.

The Making of a Plug-In: Part 2

This entry in my making of a plug-in series will detail what went into finalizing the prototype program for the Match Envelope plug-in.  A prototype of this kind is usually a command-line program wherein much of the code is actually written to implement functionality and features, and then later transferred into a plug-in’s SDK (in my case, the VST SDK).  Plug-ins, by their very nature, are not self-executable programs and need a host to run, so it is more efficient to test the fundamental code structure within a command-line program.

Having now completed my prototype, I want to first share some of the things I improved upon as well as new features I implemented.  One of the main features of the plug-in that I mentioned in part 1 was the match % parameter.  This effectively lets you control how strongly the envelope you’re matching affects the audio, and rather than being a linear effect, it is proportional to the difference between the amplitude of the envelope and the amplitude of the audio.  Originally this was the formula I used (from part 1):

We could see that this mostly gave me the results I was after, but if we look closely at the resulting waveform, there is some asymmetry compared to the original (look at the bottom of the waveforms).  One of the mistakes in this formula was comparing the interpolated value ‘ival’ with the actual sample amplitude of the ‘buffer’.  To remedy this, I now also extract the envelope of the destination audio that we are applying the envelope onto (with the same window width used to extract the source envelope) and use this value to compare the difference with ‘ival’.  This ensures a more consistent and accurate comparison of amplitudes.

The other mistake in the original formula was that ‘a’, the alpha value input by the user in %, was applied linearly to the term that calculates the difference between the amplitudes.  So while the a value did affect the resulting ival, the effect of a itself was not proportional.  The final equation then became:

I use two strategies when deciding on an appropriate mathematical formula for what I need.  One is considering how I want a value to change over time, or over some range of values, and turning to a kind of equation that does that (i.e. should it be a linear change, exponential, logarithmic, cyclical, etc.).  This leads to the second method, and that is to use a graph to visualize the shape of change I am after; this leads to an equation that defines that graph.

Here is a quick graphic and audio example to illustrate these changes, using the same flute source as the envelope and the triangle wave as its destination from part 1:

Flute envelope applied with 80msec window size at 100% match

Shortly, we will be seeing some much more interesting musical examples of the plug-in at work.  But before that, we can see another feature at work above that I implemented since last time: junction smoothing.

In addition to specifying the length of the envelope, the user specifies a value (in msec) to smooth the transition from the envelope match back to the original, unmodified audio.  Longer values make the transition more gradual, while shorter values make it more abrupt.  Implementing this feature turned out to be reasonably simple.  This is the basic equation:
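jval = (1 - jpos) * ival + jpos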

where ‘ival’ is the interpolated value and ‘jpos’ is the current position within the bounds of the junction smoothing specified by the user.  ‘jpos’ starts at 0 and, once the smoothing begins, increments (normalized) until it hits 1 at the end of the smoothing.  The larger ‘jpos’ gets, the less of the actual interpolated value we end up with in our ‘jval’ result, which is used to scale the audio buffer (just as ‘ival’ does outside of junction smoothing).  In other words, when ‘jpos’ hits 1, ‘jval’ will be 1, so we multiply our audio buffer by 1; at that point we have reached the end of our process and the original audio continues on unmodified.

Before we move on, here is a musical example.  The very famous opening of Debussy’s “Prelude to the Afternoon of a Faun” seemed like a good excerpt to test my plug-in on.  This is the original audio (the opening is very soft, as most classical recordings of quiet moments are in order to preserve dynamic range, so I had to amplify it, which is why there is some audible low-level noise):

Opening of “Prelude to the Afternoon of a Faun”

Using this as the source envelope, I used a window size of 250msec at 90% match to apply on to this flute line that I recorded in Logic, doubling the original from the audio above (the flute sample is from Vienna Special Edition).

Unmodified flute doubling of the Prelude opening

The result:

Flute doubling after Match Envelope

And here is the result mixed with the original audio:

Doubled flute line mixed with original audio

Junction smoothing was of course applied during the process to let the flute line fade out as in its original incarnation.  Without smoothing it would have abruptly cut off.  This gives us a seamless transition from the end of the flute solo into the orchestral answer.

It was very important in this example to specify a fairly large window duration, because we don’t want to capture the tremolo of the original flute solo as this would fight against the tremolo of the sampled recording that we are applying the envelope on to.  We can hear a little bit of this in places even with a 250msec window, so this will be something I intend to test further to see how this may be avoided or at least minimized.

This brings me to the other major challenge I faced in developing this plug-in since part 1: stereo handling.  Dealing with stereo files isn’t complicated in itself, but there were a few complexities I encountered along the way specific to how I wanted the plug-in to behave. Instead of only allowing a 1-to-1 correspondence (i.e. only supporting mono to mono, or stereo to stereo), I decided to allow for the two additional situations of mono to stereo and stereo to mono.

The first two cases are easy enough to deal with, but what should happen if the source envelope is mono and the destination audio is stereo, and vice versa?  I decided to allocate 2-dimensional arrays for the envelope buffers to hold the mono/stereo amplitude data and then use bitwise flags to store the states of each envelope:
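Something along these lines (the flag names are illustrative):

// Bitwise flags packing the mono/stereo state of both envelopes into one value.
enum {
    kSrcStereo  = 1 << 0,  // source envelope has two channels
    kDestStereo = 1 << 1   // destination audio has two channels
};

int make_env_flags(int srcChannels, int destChannels)
{
    int envFlags = 0;
    if (srcChannels == 2)  envFlags |= kSrcStereo;   // set during extraction
    if (destChannels == 2) envFlags |= kDestStereo;
    return envFlags;
}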

This saves on having multiple variables representing channel states for each envelope, so I only have to pass around one variable that contains all of this information that is then parsed in the appropriate places to retrieve this information.

As we can see, only one variable (‘envFlags’) is used in the extraction of the source envelope, and we can find out mono/stereo information by using bitwise AND with the corresponding enum definition of the flag we’re after.  Furthermore, in the case of the source envelope being stereo but destination audio being mono, I combine the amplitude data of the two channels into one, according to either average or peak extraction method (also specified by user).

The difficulty in implementing this feature wasn’t so much in how to get it done, but how to get it done more efficiently, without a massive number of parameters passing around and a whole lot of conditional statements within the main processing loop to determine how many channels each envelope has.  We can see some of this at work within the main process:
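A simplified sketch of the channel-indexing trick, continuing the flag names from above (interpolation and window lookup elided):

// stereo_src is 0 when the source envelope is mono, 1 when it is stereo.
int stereo_src = (envFlags & kSrcStereo) ? 1 : 0;

// Interleaved loop over the destination's channels: with a mono source
// envelope, ch * stereo_src stays 0 and never indexes past the mono data.
for (long i = 0; i < numFrames; ++i) {
    for (int ch = 0; ch < destChannels; ++ch) {
        double ival = envelope[ch * stereo_src][i / windowSize];
        buffer[i * destChannels + ch] *= ival;  // scale audio by envelope
    }
}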

I use another variable (‘stereo_src’) that extracts some information from the bitwise flags to take care of the case where the source envelope is mono but destination audio is stereo.  Since the loop covers the channels of the destination audio (in interleaved format), I needed a way to restrict out of bounds indexing of the source envelope.  If the source envelope is mono, ‘stereo_src’ will be 0, so the indexing of it will not exceed its limit.  If both envelopes are stereo, ‘stereo_src’ will be 1, so it will effectively “follow” the same indexing as the destination audio.

For the next, and last, musical example of this entry, we change things up a whole lot: I’m going to show the application of this plug-in to electronic dance music.  The Match Envelope plug-in can emulate, or function as, a kind of side-chain compressor, which is quite commonly heard in EDM.  Here is a simple kick drum pattern and a synth patch that goes on top:

Kick drum pattern

Synth pattern

By applying the match envelope plug-in to the synth pattern using the kick drum pattern above as the source envelope (window size of 100msec and 65% match), we achieve the kind of pumping pattern in the synth so characteristic of this kind of music.  The result, and mix, are as follows:

Match Envelope applied to synth pattern

Modified synth pattern mixed with percussion and bass

This part has covered the preliminary features of what I’m planning to include in the Match Envelope plug-in.  As I stated at the start, the next step is to transfer the code into the VST SDK (that’ll be part 3), but this also comes with its share of considerations and complications, mainly around UI.  How should the plug-in appear to the user?  How do you neatly package it all together to make it easy and efficient to use?  How should the parameters be presented so that they are intuitive?

Most plug-ins/hosts offer a default UI, which is what I’ll be working with initially, but eventually a nice custom GUI will be needed (part 4? 5? 42?).