I’m both happy and relieved that work on the Match Envelope plug-in is going well (so far, anyway)! It’s up and running, albeit in skeleton form, on Audacity (both Mac and Windows) and Soundforge (Windows). As I was expecting, it hasn’t been without its fair share of challenges, and one of the biggest has been the UI: how will the user interact with the plug-in efficiently, given the inherent limitations of the interface?
The crux of the problem stems from the offline-only capability of the Match Envelope plug-in. Similar to processes like normalization, where the entire audio buffer needs to be scanned to determine its peak before scanning it a second time to apply it, I need to scan the entire audio buffer (or at least as much as the user has selected) in order to get the envelope profile before then applying that onto a different audio buffer.
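To make the two-pass idea concrete, here is a minimal sketch of peak normalization on a buffer that is fully available in memory. The function names `findPeak` and `normalize` and the `targetPeak` parameter are hypothetical, chosen for illustration only; they are not part of the plug-in or the VST SDK.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Pass 1: scan the whole buffer to find its absolute peak.
float findPeak(const std::vector<float>& buffer) {
    float peak = 0.0f;
    for (float s : buffer) peak = std::max(peak, std::fabs(s));
    return peak;
}

// Pass 2: scale every sample so the peak lands on targetPeak.
// (targetPeak is an illustrative parameter, not taken from the plug-in.)
void normalize(std::vector<float>& buffer, float targetPeak = 1.0f) {
    assert(targetPeak > 0.0f);
    const float peak = findPeak(buffer);
    if (peak == 0.0f) return;  // pure silence: nothing to scale
    const float gain = targetPeak / peak;
    for (float& s : buffer) s *= gain;
}
```

The gain cannot be computed until the first pass has seen every sample, which is the same constraint the envelope profile imposes.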
This part of the challenge I foresaw as I began development. I knew of VST’s offline features, however, and I planned to explore them, since they would solve some of the interface difficulties I knew I would encounter. What I didn’t count on was that most host programs do not support VST offline functions, and in fact, Steinberg has all but removed the example source code for offline plug-ins from the 2.4 SDK (I’m not yet up to speed on VST3). Thus I have been forced to use the normal VST SDK functions to handle my plug-in.
So here is the root cause of perhaps the main challenge I had to deal with: the host program that invokes the plug-in is responsible for sending the audio buffer in blocks to the processing function, which is the only place I have access to the audio stream. The function prototype looks like this:
void processReplacing (float **inputs, float **outputs, VstInt32 sampleFrames)
‘inputs’ contains the actual audio sample data that the host has sent to the plug-in, ‘outputs’ is where the plug-in places the modified audio after processing, and ‘sampleFrames’ is the number of samples (the block size) in the audio sample data. As I mentioned earlier, not only do I need to scan the audio buffer first to acquire the envelope profile, I also need to divide the audio data into windows whose size is determined by the user. It’s pretty obvious that the number of samples in a window will not equal sampleFrames (at least not 99.99998% of the time), which complicates the implementation of this function three-fold.
How should I handle cases where the window size is less than sampleFrames? More than sampleFrames? More than double sampleFrames? Complicating matters further is that different hosts will pass different block sizes in for sampleFrames, and there is no way to tell exactly what it will be until processReplacing() is invoked. Here is the pseudocode I used to tackle this problem:
The code determines how many windows it can process in any given call to processReplacing(), given sampleFrames and windowSize, and stores leftover samples in a variable that is carried over into the next call. Once the end of a window is reached, the values copied from the audio buffer (our source envelope) are averaged together, or their peak is found, whichever the user has specified, and that value is then stored in the envelope data buffer. The reasoning behind handling large windows separately from small ones is to avoid a conditional test on every sample processed to determine whether the end of the window has been reached.
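As an illustration of the carry-over idea, here is a minimal sketch of how windows can be accumulated across host blocks of arbitrary size. It is not the plug-in's actual code: the `EnvelopeAccumulator` class, the `WindowMode` enum, and all member names are hypothetical. It also makes two assumptions of its own: samples are rectified (absolute value) before averaging, and a single chunked loop is used for every case rather than separate branches for large and small windows. The chunking still avoids an end-of-window test on every sample, because the test happens once per chunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

enum class WindowMode { Average, Peak };  // hypothetical name

class EnvelopeAccumulator {               // hypothetical name
public:
    EnvelopeAccumulator(std::size_t windowSize, WindowMode mode)
        : windowSize_(windowSize), mode_(mode) {
        assert(windowSize_ > 0);
    }

    // Call once per host block, e.g. with inputs[0] and sampleFrames.
    void processBlock(const float* input, std::size_t sampleFrames) {
        std::size_t pos = 0;
        while (pos < sampleFrames) {
            // Consume up to the end of the current window or the end
            // of this block, whichever comes first.
            const std::size_t n =
                std::min(sampleFrames - pos, windowSize_ - filled_);
            for (std::size_t i = 0; i < n; ++i) {
                const float a = std::fabs(input[pos + i]);  // assumption: rectify
                sum_ += a;
                peak_ = std::max(peak_, a);
            }
            pos += n;
            filled_ += n;
            if (filled_ == windowSize_) closeWindow();
        }
        // Any partial window stays in sum_/peak_/filled_ for the next call.
    }

    const std::vector<float>& envelope() const { return envelope_; }
    std::size_t pendingSamples() const { return filled_; }

private:
    void closeWindow() {
        envelope_.push_back(mode_ == WindowMode::Average
                                ? static_cast<float>(sum_ / windowSize_)
                                : peak_);
        sum_ = 0.0;
        peak_ = 0.0f;
        filled_ = 0;
    }

    std::size_t windowSize_;
    WindowMode mode_;
    std::size_t filled_ = 0;  // samples carried over from earlier blocks
    double sum_ = 0.0;
    float peak_ = 0.0f;
    std::vector<float> envelope_;
};
```

With this arrangement a window smaller than the block closes several times inside one call, while a window larger than the block (or larger than several blocks) simply keeps accumulating until enough calls have gone by.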
Once this part of the plug-in began to take shape, another problem cropped up. The plug-in requires three steps (one is optional) taken by the user in order to use it:
- Acquire source envelope profile from an audio track,
- Acquire the destination audio’s envelope profile to use the match % parameter (optional),
- Select the audio to apply the envelope profile onto and process.
It became clear that, since I was not using VST offline capabilities, the plug-in would need to be opened and reopened two or three times in order to make this work. This isn’t exactly ideal and wasn’t what I had in mind for the interface, but the upside is that it’s been a huge learning experience. As such, I decided to split the Match Envelope plug-in into two halves: the Envelope Extractor and the Envelope Matcher.
I felt this was a good way to go because it separates two distinct elements of the plug-in and clarifies which parameters belong to which process. For example, the match % and gain parameters have no effect on the actual extraction of the envelope profile, only on the processing applied to the destination audio. I, like many others I assume, like fiddling around with parameters and settings on plug-ins, and it can get very frustrating when they have no apparent effect; this can create confusion and possibly give the impression of badly designed software.
To communicate between the two halves, I implemented a system of writing the envelope data extracted to a temporary binary file that is read by the Envelope Matcher half in order to process the envelope, and this has proven to work very well. In debug mode I am writing a lot of data out to temporary debug files in order to monitor what the plug-in is doing and how all the calculations are being done.
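For illustration, here is a minimal sketch of passing envelope data between two plug-ins through a binary file. The file layout (a 32-bit count followed by raw floats) and the function names `writeEnvelope` and `readEnvelope` are assumptions made for this example, not the plug-in's actual format; the sketch also assumes both halves run on the same machine, so byte order and float representation match.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Assumed layout: [uint32 count][count x float], in native byte order.
bool writeEnvelope(const std::string& path, const std::vector<float>& env) {
    assert(env.size() <= 0xFFFFFFFFu);  // count must fit in 32 bits
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    const std::uint32_t count = static_cast<std::uint32_t>(env.size());
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    out.write(reinterpret_cast<const char*>(env.data()),
              static_cast<std::streamsize>(count * sizeof(float)));
    return static_cast<bool>(out);
}

bool readEnvelope(const std::string& path, std::vector<float>& env) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;  // e.g. the extractor has not been run yet
    std::uint32_t count = 0;
    if (!in.read(reinterpret_cast<char*>(&count), sizeof count)) return false;
    std::vector<float> tmp(count);
    if (count > 0 &&
        !in.read(reinterpret_cast<char*>(tmp.data()),
                 static_cast<std::streamsize>(count * sizeof(float)))) {
        return false;  // file shorter than its header claims
    }
    env.swap(tmp);  // only replace the caller's data on success
    return true;
}
```

In this arrangement the extractor half would call `writeEnvelope` once its scan is complete, and the matcher half would call `readEnvelope` before processing, treating a `false` return as "no envelope available".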
From Envelope Extractor:
From Envelope Matcher:
Some of these non-ideal interface features I plan on tackling with a custom GUI, which offers much more flexibility than the extremely limited default UI. Regardless, I’m excited that I’ve made it this far and am very close to having a working version of this plug-in up and running on at least two hosts so far (Adobe Audition also supports VST and as far as I know, offline processing, but I have not been able to test it as I don’t own it yet).
After this is done, I do plan on exploring other plug-in types to compare and contrast features and flexibility (AU, RTAS, etc.), and I may find a better solution for the interface. Of course, the plug-in could work as a standalone app where I have total control over the UI and functionality, but it would lack the benefit of doing processing right from within the host.