Tag Archives: Audacity

The Making of a Plug-In: Part 5 (Beta & new features)

In this post I’m going to discuss two new features I have added to the Match Envelope plug-in that I’m pretty excited about.  And with some additional bug fixing, it’s in a good workable state, so I can offer up the plug-in as a Beta version.

The first of the new features I added is an option to invert the envelope, so instead of matching the source audio that you extract the envelope from, the resulting audio is opposite in shape.  Of course this can further be tweaked by use of the “match strength” and “gain” parameters.  The actual process of doing this is very simple: take the interpolated value (from cubic interpolation) ‘ival’ and subtract it from 1 (digital waveforms in floating point representation all have sample values between -1 and 1).  As simple as this was, it did introduce a bug that took a long time to find.

Occasionally I would get artefacting after applying the process when inverting the envelope, and I discovered that the resulting interpolated value was nan in these cases (nan = not a number).  When the source audio consisted of sharp attacks, and thus sharp rises in the waveform, the interpolated value exceeded 1 by a small amount, so subtracting it from 1 left a slightly negative value.  Then, when it gets to this point in the formula for calculating match strength:

pow(ival, matchStr),

the true result would be an imaginary number whenever matchStr is not a whole number (around 0.5, for instance), and pow() returns nan instead.  In essence, this part of the formula ends up trying to square-root a negative number.  The easy fix for this was just to take the absolute value of ‘ival’ such that pow(fabs(ival), matchStr) will then never result in nan.  This is a better solution than just “flooring” ival to 0 when it is negative, because flooring would actually alter the audio slightly by messing with the interpolation.
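As a rough illustration of the failure and the fix (a sketch, not the plug-in's actual code), the snippet below reproduces the nan with an overshooting value.  Only ‘ival’, ‘matchStr’, pow and fabs come from the text above; the helper names and the assumption that match strength is a non-negative exponent are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: invert an interpolated envelope value.
// ival is nominally in [0, 1], but cubic interpolation can overshoot 1.
inline float invertEnvelope(float ival) {
    return 1.0f - ival;
}

// Unguarded version: a negative base with a fractional exponent
// makes std::pow return NaN.
inline float strengthUnsafe(float ival, float matchStr) {
    return std::pow(ival, matchStr);
}

// Guarded version, as described above: take the magnitude first.
inline float strengthSafe(float ival, float matchStr) {
    assert(matchStr >= 0.0f);  // assumption: strength is a non-negative exponent
    return std::pow(std::fabs(ival), matchStr);
}
```

Feeding an inverted overshoot such as invertEnvelope(1.02f) into strengthUnsafe with matchStr = 0.5 yields nan, while strengthSafe returns a small positive number.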

The second feature I added is a “window offset” parameter, which shifts the envelope left or right.  Oddly enough, this one took less time to implement and required less testing/fixing than the invert feature even though it sounds more complex to code.  In fact, it’s pretty simple as well.  Similar to a circular buffer, instead of shifting the elements around in the buffers that contain the envelope data, I just offset the cursor that points to the location in the buffer.

If I want to shift the envelope to the left (the result will in effect anticipate the source audio’s envelope), the cursor needs to be offset by a positive number.  If I want to shift the envelope to the right (emulating a delay of the envelope), the cursor needs to be offset by a negative number.  This may seem a bit unintuitive at first, but we need to consider how the placement of the cursor affects where the processing begins: if it is negatively offset, it will be delayed in processing the envelope data, thus shifting the envelope to the right.
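The cursor idea can be sketched roughly as follows.  This is an illustration only: the function name is hypothetical, the offset is expressed in whole windows, and clamping out-of-range reads to the first or last window is an assumption, since the post does not say how the edges are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lookup: read the envelope at window index `cursor`, shifted by
// `offset` windows, without moving any data inside the buffer.
// Positive offset reads ahead (envelope shifts left, anticipating the source);
// negative offset reads behind (envelope shifts right, like a delay).
// Assumption: reads past either end clamp to the first or last window.
inline float envelopeAt(const std::vector<float>& env,
                        std::ptrdiff_t cursor,
                        std::ptrdiff_t offset) {
    assert(!env.empty());
    std::ptrdiff_t i = cursor + offset;
    const std::ptrdiff_t last = static_cast<std::ptrdiff_t>(env.size()) - 1;
    if (i < 0) i = 0;
    if (i > last) i = last;
    return env[static_cast<std::size_t>(i)];
}
```

Because only the read position changes, the envelope data itself is never copied or rearranged, which keeps the feature cheap.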

Before moving on, here is a new video showcasing these new features, this time with some slightly more exciting audio for demonstration.

Lastly, in terms of bug fixing, I had to deal with an issue that arose in Soundforge due to the fact that it does not save the state of a plug-in once it’s invoked.  By contrast, when a plug-in is opened in Audacity, it is not “destroyed” until the program is exited; it only switches to a suspend state while it’s not active.  This means that a plug-in’s constructor will only be called once, thus saving the state of parameters/variables between uses as long as the host isn’t closed.  Soundforge does not do this, however, and this caused problems with parameter and variable reinitialization.

Fortunately a fix was found, but it has further reaffirmed that VST is not really built for offline processing.  While we can certainly coax VST plug-ins into it, I’ve hit several limitations in terms of what I can accomplish within the bounds of the VST SDK, as well as inadequate host support for the offline capabilities it does provide.

As such, this might be one of the last entries in this particular making-of series on the Match Envelope plug-in.  I have learned a ton through this process, and through sticking it out when it became clear that this particular process is probably better suited to a standalone app or command-line program where I could have had much more control over things.

I very much look forward to developing my next plug-in, however, which will most assuredly be a real-time process of some kind, and I am excited to learn and to tackle the challenges that await!  Without further ado, here are links to the beta of the Match Envelope VST plug-in:

Match Envelope beta v0.9.2.23 — Mac

Match Envelope beta v0.9.2.23 — Win


The Making of a Plug-In: Part 4 (Alpha Version)

After all the work that’s gone into this plug-in, it’s pretty darn cool to see it up and running in a host program!  I’ve been messing around with it a bit in Audacity, just trying out different parameter settings and seeing it all come together.  There’s more work to be done in order to take care of some additional issues that may crop up, but for the most part I have a working alpha version of the Match Envelope plug-in.  I will likely graduate it to beta once I take care of these issues, which isn’t far off, but in the meantime I am providing the alpha version for anyone interested to give it a shot (links at the end of this post).  I also recorded a short little video demonstrating the basic functionality of the plug-in (again, these particular audio files were chosen to illustrate the effect visually):

One of the primary issues that I need to address is differing sample rates between the source and destination audio.  This is something I have yet to try out in code, but I expect some sort of scaling of certain parameters will solve the problem rather than forbidding mixed sample rates (which would be a bit of a pain).

If, for example, we choose a window size of 30ms with the source audio at a sample rate of 44.1kHz, it will contain 1323 samples per window.  If the destination audio’s sample rate is at 48kHz, however, it will contain 1440 samples per window.  There can thus quite easily be a mismatch in the number of windows in an envelope profile between the two audio streams as well as window boundaries not matching up, which will cause the audio files to not sync or match properly.
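The arithmetic above can be checked with a small helper.  This is only a sketch; the function name is hypothetical, and rounding to the nearest whole sample is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: convert a window duration in milliseconds to a
// whole number of samples at the given sample rate (rounded to nearest).
inline long windowSizeInSamples(double windowMs, double sampleRate) {
    assert(windowMs > 0.0 && sampleRate > 0.0);
    return std::lround(windowMs * sampleRate / 1000.0);
}
```

With a 30ms window this gives 1323 samples at 44.1kHz and 1440 samples at 48kHz.  One possible form of the scaling mentioned earlier would be to keep the duration as the shared parameter and recompute the sample count for each stream, though that is untested here.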

This brings up another related issue, evident in the UI.  The Envelope Extractor contains the parameter for window duration, but what if the user selects a different value for the source envelope and the destination?  Right now it results in an incorrect match.  Should this be forbidden, or perhaps even ignored by forcing the same window duration on both envelopes?  Or perhaps there is a way to turn it into a feature.  I am undecided on this at the moment and need to do some testing and exploring of various options before addressing this issue.

In addition, I will have to decide on whether implementing user-saved programs is of any use in this plug-in.  Currently this functionality is not supported, but I think it would be a good idea to include some obvious presets, and allow users to save their own.

Other than that, I am excited to share the first distributable version of this plug-in!  The Mac OS X version has been tested on 10.6.8 and 10.8 in Audacity, and the Windows version using Windows 7 in both Audacity and Soundforge.  If you’re interested, give it a try, and though you’re certainly not obligated to, feedback is welcome:


Edit: Beta version available in Downloads

The Making of a Plug-In: Part 3 (Solving the UI Issue)

I’m both happy and relieved that progress on making the Match Envelope plug-in is proving to be successful (so far, anyway)!  It’s up and running, albeit in skeleton form, on Audacity (both Mac and Windows) and Soundforge (Windows).  As I was expecting, it hasn’t been without its fair share of challenges, and one of the biggest has been dealing with the UI: how will the user interact with the plug-in efficiently given the inherent limitations of the interface?

The crux of the problem stems from the offline-only capability of the Match Envelope plug-in.  Similar to processes like normalization, where the entire audio buffer needs to be scanned to determine its peak before scanning it a second time to apply it, I need to scan the entire audio buffer (or at least as much as the user has selected) in order to get the envelope profile before then applying that onto a different audio buffer.
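For comparison, the two-pass shape of normalization looks roughly like this.  It is a generic sketch with hypothetical function names, not code from the plug-in: the first pass must see every sample before the second pass can change any of them, which is why a block-at-a-time interface is awkward for this kind of process.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Pass 1: scan the whole buffer for its peak magnitude.
inline float findPeak(const std::vector<float>& audio) {
    float peak = 0.0f;
    for (float s : audio) peak = std::max(peak, std::fabs(s));
    return peak;
}

// Pass 2: scale every sample so the peak lands on `target`.
inline void normalize(std::vector<float>& audio, float target = 1.0f) {
    assert(target > 0.0f);
    const float peak = findPeak(audio);
    if (peak == 0.0f) return;  // silence: nothing to scale
    const float gain = target / peak;
    for (float& s : audio) s *= gain;
}
```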

This part of the challenge I foresaw as I began development.  I knew of VST’s offline features, however, and I planned to explore these options, which would solve some of the interface difficulties I knew I would encounter.  What I didn’t count on was that host programs widely do not support VST offline functions, and in fact, Steinberg has all but removed the example source code for offline plug-ins from the 2.4 SDK (I’m not yet up to speed on VST3).  Thus I have been forced to use the normal VST SDK functions to handle my plug-in.

So here is the root cause of perhaps the main challenge I had to deal with: the host program that invokes the plug-in is responsible for sending the audio buffer in blocks to the processing function, which is the only place I have access to the audio stream.  The function prototype looks like this:

void processReplacing (float **inputs, float **outputs, VstInt32 sampleFrames)

‘inputs’ contains the actual audio sample data that the host has sent to the plug-in, ‘outputs’ is where, after processing, the plug-in places the modified audio, and ‘sampleFrames’ is the number of samples (block size) in the audio sample data.  As I mentioned earlier, not only do I need to scan the audio buffer first to acquire the envelope profile, I need to divide the audio data into windows whose size is determined by the user.  It’s pretty obvious that the number of samples in the window size will not equal the number of samples in sampleFrames (at least not 99.99998% of the time), effectively complicating the implementation of this function three-fold.

How should I handle cases where the window size is less than sampleFrames?  More than sampleFrames?  More than double sampleFrames?  Complicating matters further is that different hosts will pass different block sizes in for sampleFrames, and there is no way to tell exactly what it will be until processReplacing() is invoked.  Here is the pseudocode I used to tackle this problem:

The code determines how many windows it can process in any given call to processReplacing(), given sampleFrames and windowSize, and stores leftover samples in a variable that is carried over into the next call.  Once the end of a window is reached, the values copied from the audio buffer (our source envelope) are averaged together, or their peak is found, whichever the user has specified, and that value is then stored in the envelope data buffer.  The reasoning behind handling large windows separately from small ones is to avoid a conditional test with every sample processed to determine if the end of the window is reached.
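One way the carry-over idea could be sketched in C++ is shown below.  This is not the original pseudocode: the class and member names are hypothetical, averaging is done on absolute sample values (an assumption), a trailing partial window is simply left unfinished, and instead of separate branches for large and small windows it consumes samples in chunks, so the end-of-window test runs once per chunk rather than once per sample.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical extractor: its state persists across calls, so a window may
// span any number of host blocks, and a block may hold many windows.
class EnvelopeExtractor {
public:
    EnvelopeExtractor(std::size_t windowSize, bool usePeak)
        : windowSize_(windowSize), usePeak_(usePeak) {
        assert(windowSize > 0);
    }

    // Feed one host block (single channel) of sampleFrames samples.
    void processBlock(const float* in, std::size_t sampleFrames) {
        std::size_t pos = 0;
        while (pos < sampleFrames) {
            // Take as many samples as fit in both the block and the window.
            const std::size_t n =
                std::min(sampleFrames - pos, windowSize_ - filled_);
            for (std::size_t i = 0; i < n; ++i) {
                const float a = std::fabs(in[pos + i]);
                sum_ += a;
                peak_ = std::max(peak_, a);
            }
            pos += n;
            filled_ += n;
            if (filled_ == windowSize_) closeWindow();
        }
    }

    const std::vector<float>& envelope() const { return envelope_; }

private:
    // Store one envelope value (peak or mean) and reset the accumulators.
    void closeWindow() {
        const float mean =
            static_cast<float>(sum_ / static_cast<double>(windowSize_));
        envelope_.push_back(usePeak_ ? peak_ : mean);
        sum_ = 0.0;
        peak_ = 0.0f;
        filled_ = 0;
    }

    std::size_t windowSize_;
    bool usePeak_;
    std::size_t filled_ = 0;  // samples accumulated in the current window
    double sum_ = 0.0;        // running sum of |sample| for the mean
    float peak_ = 0.0f;       // running peak of |sample|
    std::vector<float> envelope_;
};
```

In a real plug-in, processBlock would be called from processReplacing() with one of the input channels; the same code then copes with any block size the host chooses.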

Once this part of the plug-in began to take shape, another problem cropped up.  The plug-in requires three steps (one is optional) taken by the user in order to use it:

  1. Acquire source envelope profile from an audio track,
  2. Acquire the destination audio’s envelope profile to use the match % parameter (optional),
  3. Select the audio to apply the envelope profile onto and process.

It became clear that, since I was not using VST offline capabilities, the plug-in would need to be opened and reopened 2 – 3 times in order to make this work.  This isn’t exactly ideal and wasn’t what I had in mind for the interface, but the upside is that it’s been a huge learning experience.  As such, I decided to split the Match Envelope plug-in into two halves: the Envelope Extractor, and the Envelope Matcher.

I felt this was a good way to go because it separated two distinct elements of the plug-in and clarified which parameters belong with which process.  For example, the match % and gain parameters have no effect on the extraction of the envelope profile; they only matter during processing onto the destination audio.  Like many others, I assume, I like fiddling around with parameters and settings on plug-ins, and it can get very frustrating when they have no apparent effect; this can create confusion and possibly the impression of bad design.

To communicate between the two halves, I implemented a system of writing the envelope data extracted to a temporary binary file that is read by the Envelope Matcher half in order to process the envelope, and this has proven to work very well.  In debug mode I am writing a lot of data out to temporary debug files in order to monitor what the plug-in is doing and how all the calculations are being done.
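The temporary-file hand-off described above might look something like the following.  The file layout (a 32-bit count followed by raw floats) and the function names are assumptions for illustration; the post does not describe the actual format.  The sketch also assumes the writer and the reader run on the same machine, so byte order is not an issue.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical layout: a 32-bit window count, then that many raw floats.
inline bool writeEnvelope(const std::string& path,
                          const std::vector<float>& env) {
    assert(env.size() <= 0xFFFFFFFFu);  // count must fit in 32 bits
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    const std::uint32_t count = static_cast<std::uint32_t>(env.size());
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    out.write(reinterpret_cast<const char*>(env.data()),
              static_cast<std::streamsize>(count * sizeof(float)));
    out.flush();
    return static_cast<bool>(out);
}

// Reads the file back; returns false if it is missing or truncated.
inline bool readEnvelope(const std::string& path, std::vector<float>& env) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::uint32_t count = 0;
    if (!in.read(reinterpret_cast<char*>(&count), sizeof count)) return false;
    env.resize(count);
    in.read(reinterpret_cast<char*>(env.data()),
            static_cast<std::streamsize>(count * sizeof(float)));
    return static_cast<bool>(in);
}
```

The extractor half would call writeEnvelope once its scan is finished, and the matcher half would call readEnvelope before processing, failing gracefully if no envelope has been extracted yet.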

From Envelope Extractor:

From Envelope Matcher:

Some of these non-ideal interface features I plan on tackling with a custom GUI, which offers much more flexibility than the extremely limited default UI.  Regardless, I’m excited that I’ve made it this far and am very close to having a working version of this plug-in up and running on at least two hosts so far (Adobe Audition also supports VST and as far as I know, offline processing, but I have not been able to test it as I don’t own it yet).

After this is done, I do plan on exploring other plug-in types to compare and contrast features and flexibility (AU, RTAS, etc.), and I may find a better solution for the interface. Of course, the plug-in could work as a standalone app where I have total control over the UI and functionality, but it would lack the benefit of doing processing right from within the host.