Audio, Music

De-Noising Audio using Spectral Subtraction in MATLAB and Ableton Live

Last time I wrote about audio restoration using simple digital filtering (in MATLAB and Ableton Live). I’ve since received another old Havering recording from Walt. Again from an old cassette tape, this recording is rather noisy. In this post, I explain how I cleaned it up using a more elaborate technique than previously.

Again I used MATLAB for the algorithm development aspects of the process, in combination with Ableton Live for the audio and mix management.

The noise

Here is a clip of the lead-in to the show. The noise is apparent.

Snippet of the raw (noisy) recording

Figures 1 and 2 show the noise spectrum (over the full bandwidth and zoomed-in to the low-frequency zone, respectively) computed via the MATLAB pspectrum function.

Figure 1: Noise spectrum revealing the broadband nature of the background noise in the recording.
Figure 2: Noise spectrum, zoomed-in on the low-frequency regime, revealing the 60 Hz “power hum” plus a distinct peak around 1150 Hz in both channels and a lesser peak around 1700 Hz in the left channel only.

The noise has similar characteristics to the last time: some low-frequency “power hum” (Figure 2) plus a broad-band “tape hiss” over the extent of the audio/music bandwidth (Figure 1). Interestingly, the low-frequency power hum (Figure 2) comprises only the fundamental mode (at approximately 60 Hz) rather than the multiple harmonics observed last time. Also, there is a distinct peak around 1150 Hz in both channels and a lesser peak around 1700 Hz in the left channel only.

Suppressing the “power hum”

As last time, notch filtering was used to suppress the narrow-band peaks from Figure 2. However, rather than using Ableton Live’s notch filtering as I did last time, I used MATLAB. This allowed me to create a suite of filters which could be separately configured for the left and right channels (since, as observed in Figure 2, the characteristics of the noise peaks vary between the channels). As a starting point, I used the MultiNotchFilter example “plugin” bundled with the MATLAB Audio Toolbox and extended it to have separate controls for each channel (creating what I call the MultiNotchFilterStereo “plugin”). Figure 3 shows a (partial) screenshot of the plugin configured to suppress the peaks identified in the spectrum from Figure 2.

Figure 3: Screenshot of the MultiNotchFilterStereo plugin (adapted from the MultiNotchFilter plugin bundled with MATLAB) loaded into the MATLAB audioTestBench. The plugin has ten notch filters per channel. Only the first seven of the left channel filter controls are visible in the screenshot (there are similar controls for each of the ten filters per channel). Only three of the notches are being used on the left channel (and only two on the right channel), corresponding to the three noise peaks (at 55 Hz, 1136 Hz, and 1702 Hz) in the left channel (and 55 Hz and 1168 Hz for the right channel).
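The per-channel notch idea can be sketched in base MATLAB. This is only an illustration: it assumes a textbook second-order (biquad) notch rather than whatever design the bundled MultiNotchFilter plugin uses internally. The notch frequencies are those listed in the Figure 3 caption; the sample rate, the Q value and the test tones are assumptions made for the sake of a runnable example.

```matlab
% Sketch: independent notch banks for the left and right channels.
fs = 44100;                               % assumed sample rate
notchHz = {[55 1136 1702], [55 1168]};    % left / right notch centres (Figure 3)
Q = 30;                                   % hypothetical notch sharpness

% Stand-in stereo test signal: one unwanted tone per channel.
t = (0:2*fs-1)'/fs;
x = [sin(2*pi*1136*t), sin(2*pi*1168*t)];

y = x;
for ch = 1:2
    for f0 = notchHz{ch}
        w0 = 2*pi*f0/fs;                  % notch centre in rad/sample
        alpha = sin(w0)/(2*Q);            % bandwidth term
        b = [1, -2*cos(w0), 1];           % zeros on the unit circle at +/- w0
        a = [1 + alpha, -2*cos(w0), 1 - alpha];
        y(:,ch) = filter(b/a(1), a/a(1), y(:,ch));
    end
end

% Residual level once the filter transients have died away.
tail = y(end - fs/2 + 1:end, :);
residualRms = sqrt(mean(tail.^2));
fprintf('Residual RMS (left, right): %.2e, %.2e\n', residualRms);
```

A higher Q gives a narrower notch (less damage to nearby music) at the cost of a longer transient, which is the same trade-off explored with the plugin controls.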

Here is the result of applying the notch filtering to the original noisy clip:

Result of applying the notch filtering to the snippet of the raw (noisy) recording in order to suppress the low-frequency noise components. Comparing with the raw clip presented earlier, it is clear that the filters have had an audible effect on suppressing some of the components of the noise.

Suppressing the “tape hiss”

Instead of the simple filtering used last time, I wanted to try something more sophisticated in an attempt to achieve improved broad-band noise suppression with minimal audible artefacts.

The approach adopted was to adapt the SpectralSubtractor “plugin” bundled with the MATLAB Audio Toolbox, again extended to have separate processing for each channel (creating what I call the SpectralSubtractorStereo “plugin”) since the original plugin catered for mono signals only. Figure 4 shows a screenshot of the plugin configured (by trial-and-error listening experiments) to suppress the broadband noise identified in the spectrum from Figure 1.

Figure 4: Screenshot of the SpectralSubtractorStereo plugin loaded into the MATLAB audioTestBench. The plugin (adapted from the SpectralSubtractor plugin bundled with MATLAB) performs noise reduction by spectral subtraction, applied independently to both channels, but with the same user-configurable parameters configured on both channels.

The algorithm works by subtracting a representation of the noise from the noisy signal in the frequency domain. In this case, the representation of the noise is a simple constant amplitude (band-limited) “white noise” model.

The core of the algorithm is encapsulated in the first line of the following two lines of MATLAB code:

mag_X_out = max(0, abs(X_in) - Mag2Subtract);

X_out = mag_X_out .* exp(1i*angle(X_in));

where mag_X_out is the magnitude of the processed spectrum, X_in is the noisy signal spectrum, and Mag2Subtract is the user-selected “noise magnitude” (i.e., configured via the “Noise Estimate” control in Figure 4). In the second line of code, X_out is the processed spectrum created by reuniting the modified magnitude mag_X_out with the original phase of X_in.

Not shown in this code snippet are the Fast Fourier Transform (FFT) and its inverse (which convert between the time and frequency domains) and the machinery for managing the data buffers. I left these out to emphasise the crux of the algorithm rather than the utility code around it, and to demonstrate how compactly MATLAB expresses mathematical operations on complex-valued matrices (such as X_in and X_out).
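For readers who want to see the omitted scaffolding, here is a minimal frame-by-frame sketch in base MATLAB that wraps the two core lines in an FFT, an inverse FFT and overlap-add buffering. It is not the plugin's code: the square-root Hann window, the 1024-sample frame, the 50% overlap, the synthetic test signal and the choice of Mag2Subtract are all assumptions picked to keep the example short and self-contained.

```matlab
% Sketch: mono spectral subtraction with overlap-add (assumed settings).
rng(0);
fs = 44100;
N = 1024;                                   % analysis frame length (assumed)
hop = N/2;                                  % 50% frame overlap (assumed)
win = sqrt(0.5 - 0.5*cos(2*pi*(0:N-1)'/N)); % sqrt-Hann, used for analysis and synthesis

% Stand-in signal: a 440 Hz tone buried in white noise.
t = (0:fs-1)'/fs;
clean = 0.5*sin(2*pi*440*t);
sigma = 0.05;
noisy = clean + sigma*randn(size(t));

% Hypothetical noise estimate: twice the RMS noise magnitude per bin.
Mag2Subtract = 2*sigma*sqrt(N/2);

out = zeros(size(noisy));
for start = 1:hop:(numel(noisy) - N + 1)
    idx = start:start + N - 1;
    X_in = fft(win .* noisy(idx));                   % to the frequency domain
    mag_X_out = max(0, abs(X_in) - Mag2Subtract);    % subtract the flat noise model
    X_out = mag_X_out .* exp(1i*angle(X_in));        % reattach the original phase
    out(idx) = out(idx) + win .* real(ifft(X_out));  % back to time domain, overlap-add
end

% Compare the error against the clean tone, away from the partially covered edges.
core = (N + 1):(numel(noisy) - 2*N);
errBefore = sqrt(mean((noisy(core) - clean(core)).^2));
errAfter = sqrt(mean((out(core) - clean(core)).^2));
fprintf('RMS error before: %.4f, after: %.4f\n', errBefore, errAfter);
```

With a square-root Hann window applied on both sides of the FFT and a hop of half a frame, the overlapping windows sum to one, so a frame that is left unmodified is reconstructed without amplitude ripple.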

A schematic illustrating the spectral subtraction technique is shown in Figure 5.

Figure 5: De-noising via the technique of spectral subtraction. The plots are in the frequency domain (i.e., after the FFT computation). Note that these are not actual signal spectra, merely pictorial representations to aid the explanation. Also, just a single-channel (mono) signal is depicted here (in the actual processor, the same algorithm is applied independently to each channel). The number of frequency bins (and hence the frequency resolution for a given sample-rate) is determined by the length of the analysis frame (i.e., the number of samples, per channel, sent to the FFT in each successive computation, performed frame-by-frame over the entire signal duration), adjusted via the “Analysis Frame” control in Figure 4. The “Noisy signal” (blue) in the upper plot corresponds to abs(X_in). The “Noise model” (red) corresponds to Mag2Subtract. The “De-noised signal” (green) in the lower plot corresponds to mag_X_out. It has the value zero whenever the “Noisy signal” is below the level of the “Noise model”. Elsewhere, it has the value given by (abs(X_in) minus Mag2Subtract).

In a sense, the “0” branch in the expression for mag_X_out in the code snippet can be thought of as a frequency-dependent noise gate, whereby for each frequency bin, if the spectral magnitude is below the user-selected threshold (i.e., the “white noise” magnitude), the signal output is cut completely. For the other branch, if the spectral magnitude is above the assumed model noise threshold, then that constant threshold level (representing the “white noise” magnitude) is subtracted from each bin.
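The two branches can be checked by hand on a tiny made-up "spectrum" of four bins. The bin values and the threshold below are invented purely for illustration.

```matlab
% Sketch: the gate branch and the subtract branch on four example bins.
X_in = [3+4i, 0.5, -2i, 0.1+0.1i];   % magnitudes: 5, 0.5, 2, about 0.14
Mag2Subtract = 1;                    % hypothetical flat noise magnitude

mag_X_out = max(0, abs(X_in) - Mag2Subtract);   % gives [4 0 1 0]
X_out = mag_X_out .* exp(1i*angle(X_in));       % phase of each bin is preserved

disp(mag_X_out);
disp(X_out);
```

Bins 2 and 4 lie below the threshold and are cut to zero (the "gate" branch). Bins 1 and 3 lie above it and have their magnitude reduced by exactly Mag2Subtract, so the first bin becomes approximately 2.4 + 3.2i, pointing in the same direction as 3 + 4i.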

The noise threshold is user-adjusted by trial-and-error. Too low, and the de-noising is not effective. Too high, and audible artefacts appear in the output as a characteristic “tinkling” (often referred to as “musical noise”). This invariably occurs when frequency-domain audio manipulation is pushed too far. Indeed, it can be used as an effect in itself, e.g., in vocoders and robotic voices, or in the (well-established) technique of cranking up autotune to the extreme. But for the present purposes of de-noising, the parameters have been adjusted such that maximal noise suppression is achieved with minimal perceivable adverse effects on the output signal. Note that the “Analysis Window” (i.e., the type of windowing used before performing the FFT), the “Analysis Frame” (i.e., the length of the data chunk sent to the FFT), and the “Frame Overlap” are commonly used parameters in spectral analysis (as described in many references, so not detailed here). Suffice it to say, for present purposes, these parameters were selected by trial-and-error (via subjective listening experiments) to give the best result on the audio file in question.

Here is the result of applying the spectral subtraction to the noisy clip using the settings displayed in Figure 4:

Result of applying the spectral subtraction to the previous clip (i.e., the one with the power hum already removed). Comparing with the original raw clip presented at the start, it is clear that the spectral subtraction algorithm is very effective for suppressing the broad-band noise. There is a little bit of “tinkling” evident in the output, but this is effectively masked by the music (once it starts playing).

“One click” plugin creation

Having built and tested the MultiNotchFilterStereo and the SpectralSubtractorStereo “plugins” entirely within the MATLAB environment, I then converted each of them to VST plugins using the “one click” conversion button provided in the MATLAB Audio Toolbox audioTestBench interface.

Additional tweaks to the mix within Ableton Live

I then loaded the VST plugins into Ableton Live, applied a noise gate in front of them, and some equalisation and dynamic range control downstream, as shown in the screenshot in Figure 6.

Figure 6: End-to-end plugin effects chain implemented in Ableton Live for this de-noising project. The first (“Short Cut” noise gate) and last (“Punchy Dance Master” compressor/limiter/equaliser component) are Ableton built-in plugins used to tweak the mix. The middle two components (“MultiNotchFilterStereo” and “SpectralSubtractorStereo”) are the VST plugins built entirely in MATLAB and are the core of the de-noising solution presented in this article.

This effects chain was applied to the noisy recording of the entire radio show. The resulting cleaned-up audio can be streamed from here.

Conclusions

The spectral subtraction method, using a simple flat “white noise” model, is found to be rather effective in removing broad-band “tape hiss” noise from audio/music recordings. Compared with simple digital filtering (covered in the previous post), the spectral subtraction method is found to be superior (from informal subjective listening trials).

As an enhancement of the technique, it would be interesting to try subtracting a shaped noise spectrum (rather than the simple flat value used here). This could be computed from a noise-only portion of the recording. Likewise, it would be interesting to compare the spectral subtraction approach with alternative techniques such as wavelet-based de-noising, machine-learning/deep-learning based de-noising, and adaptive filtering. All these can be explored via MATLAB.
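As a rough idea of what the shaped-noise variant might look like, the sketch below averages the magnitude spectrum over the frames of a noise-only segment and subtracts that per-bin profile instead of a constant. Nothing here has been tried on the actual recording: the coloured stand-in noise, the frame settings and the over-subtraction factor are all hypothetical.

```matlab
% Sketch: estimate a per-bin noise profile, then subtract it from one frame.
rng(1);
fs = 44100;
N = 1024;
hop = N/2;
win = sqrt(0.5 - 0.5*cos(2*pi*(0:N-1)'/N));   % sqrt-Hann analysis window

% Stand-in for a noise-only lull: low-pass coloured noise, one second long.
noiseOnly = filter(1, [1 -0.9], 0.01*randn(fs, 1));

% Average the magnitude spectrum over all frames of the noise-only segment.
starts = 1:hop:(numel(noiseOnly) - N + 1);
noiseMag = zeros(N, 1);
for s = starts
    noiseMag = noiseMag + abs(fft(win .* noiseOnly(s:s + N - 1)));
end
noiseMag = noiseMag / numel(starts);

% Apply the shaped profile to a frame containing a 1 kHz tone plus noise.
overSubtract = 1.5;                           % hypothetical safety factor
frame = noiseOnly(1:N) + 0.2*sin(2*pi*1000*(0:N-1)'/fs);
X_in = fft(win .* frame);
mag_X_out = max(0, abs(X_in) - overSubtract*noiseMag);
X_out = mag_X_out .* exp(1i*angle(X_in));
```

The only change relative to the flat model is that the scalar Mag2Subtract becomes a vector with one entry per frequency bin, so bins where the noise is weak lose less signal.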

MATLAB is again found to be a very powerful and convenient environment for prototyping the audio processing algorithms. Moreover, the (remarkable) “one click” creation of VST plugins from entirely within MATLAB makes it trivially simple to bring the algorithms into the Digital Audio Workstation (DAW) universe.

Footnote

You may have noticed this logo in the compiled MATLAB VST plugin screenshots above. There is a history to this. Just over twenty years ago, I worked with a very talented programmer, Pepijn Sitter, from The Netherlands, to create an audio effects processing software product called WaveWarp. We distributed it under the trading name Sounds Logical. It was critically acclaimed, winning an Editor’s Choice Award from Electronic Musician Magazine in 2001.

WaveWarp enabled you to build your own audio effects from a library of modular building blocks. In that sense, its architecture resembled Simulink, but it was fundamentally much faster (even compared with the compiled version of Simulink deployed via Real-Time Workshop). This was because the WaveWarp audio engine (and each individual module) was written in highly-optimised C code (making extensive use of pointer arithmetic) such that it could process multi-channel audio in real-time, sample-by-sample, on a typical desktop PC of the age. Moreover, it had full multi-rate functionality (via a library of decimators, interpolators, polyphase filterbanks, etc) allowing for elaborate mixed sample-rate designs. It used the FFTW (Fastest Fourier Transform in the West) library for spectral analysis, just as MATLAB does now. The WaveWarp software worked in standalone mode or as a DirectX plugin, and even had a real-time interface to MATLAB (akin to the audioTestBench available in the MATLAB Audio Toolbox today).

Alas, WaveWarp is now long gone. Moreover, I lost track of the source-code years ago, and I don’t have a running version. Also, it has almost completely faded from the internet. I could find only this review on PCRecording.com.

Anyway, given that I find myself delving into the world of audio processing again, I thought it fitting to revive the logo.

Audio, Music

Basic audio restoration using Ableton Live and MATLAB

Walt, the drummer from The Havering, just sent me an mp3 file of a Havering recording from a Stanford College Radio show in 1989. The mp3 file was created from the original recording on a thirty-year-old cassette tape, so the quality is not fantastic. The aim here is to clean it up and publish it on The Havering song archive.

My Digital Audio Workstation (DAW) of choice when working with audio clips and samples is Ableton Live, which is the main environment I’ll use for this mini-project.

This project also presents a good opportunity to test drive the MATLAB Audio Toolbox.

Restoring the audio involves multiple stages, many of which rely on trial-and-error. Foremost is noise removal.

Noise Removal

Here is the start of the first song (“Trust”). The background noise is rather apparent during the non-music lead-in, continuing into the music:

Snippet of the raw (noisy) recording

Helpfully, because this is a recording of a live radio show, there are lulls in the music where only the noise is present. For example, here is the snippet of noise from the non-music lead-in (amplified for emphasis):

Just the noise lead-in from the previous snippet (amplified)

The first step in removing or suppressing the noise is to try and gain an understanding of it. Since we have the noise-alone snippet, we can analyse it in isolation (this isn’t always the case: often we only have the music-plus-noise available. But we are lucky here). Loading the noise file into MATLAB (via the audioread function) and utilising the pspectrum function to generate the noise spectrum yields the plot displayed in Figure 1:

Figure 1: Noise spectrum revealing the broadband nature of the background noise in the recording.

This is a “textbook” example of broadband noise whereby the power spectrum is effectively uniform over the frequency range of interest (i.e., over the audio range from 20 Hz to 20 kHz, approximately). It does drop off dramatically around 17 kHz or so, but even so, the noise level is effectively constant (and high) over the audio/musical range of interest, and so will be quite tricky to deal with. Listening to the noise, it appears to be classic “tape hiss”, prevalent in analogue recordings such as the cassette tape used in this recording.

It is helpful to zoom-in on the low-frequency portion of the chart and view on a log-scale, as displayed in Figure 2.

Figure 2: Noise spectrum, zoomed-in on the low-frequency regime, revealing the 60 Hz “power hum” and its harmonics

There is a series of distinct peaks. Using the MATLAB findpeaks function reveals these to be at the following frequencies (averaged across both channels): 60 Hz, 120 Hz, 180 Hz, 240 Hz, 300 Hz, 430 Hz, and 680 Hz. The majority of these (60, 120, 180, 240, and 300 Hz) are classic “power hum” (fundamental mode plus four harmonics) from the AC power supply (the recording was made in California, US, where the power-grid AC fundamental frequency is 60 Hz — rather than 50 Hz in the UK).
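The analysis steps above (spectrum, then peak picking) can be sketched as follows. To keep the example self-contained it synthesises a stand-in noise clip instead of calling audioread on the real file, so the hum amplitudes, the frequency limits and the peak-picking thresholds are all assumptions. The pspectrum and findpeaks functions require the Signal Processing Toolbox.

```matlab
% Sketch: locate hum peaks in a (synthetic) noise-only stereo clip.
rng(0);
fs = 44100;
t = (0:5*fs-1)'/fs;

% Stand-in for the noise snippet: white "hiss" plus 60 Hz hum and two harmonics.
hum = 0.05*sin(2*pi*60*t) + 0.02*sin(2*pi*120*t) + 0.01*sin(2*pi*180*t);
x = 0.01*randn(numel(t), 2) + hum;

% Power spectrum of each channel over the low-frequency region, then averaged.
[p, f] = pspectrum(x, fs, 'FrequencyLimits', [20 400]);
pdB = 10*log10(mean(p, 2));

% Keep peaks well above the typical noise floor and at least 20 Hz apart.
[pks, locs] = findpeaks(pdB, f, ...
    'MinPeakHeight', median(pdB) + 20, ...
    'MinPeakDistance', 20);
disp(locs.');
```

On the real recording the thresholds would need adjusting by eye against the plotted spectrum, since the hum harmonics there sit at different levels above the hiss.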

Suppressing the “power hum”

Since the frequencies are well-defined for the low-frequency “power hum” components of the noise, this suggests utilising a bank of notch filters tuned to each mode of the noise (i.e., to “notch out” each noise component). Ableton Live has a built-in 8-band equalizer which can be used for this purpose. See the screenshot in Figure 3 below where the equalizer has been configured as required.

Figure 3: Ableton Live equalizer component configured with multiple notch filters tuned to suppress the “power hum” harmonics from Figure 2.

Below are the “before” and “after” audio clips. The notch filtering is effective at removing the “power hum”. Note: with these compressed mp3 snippets in this blog article, the low frequencies are suppressed by the mp3 encoding algorithm, so you may have to turn the volume up to hear the difference. Even then, it may be difficult to perceive the differences, though they are readily apparent in the uncompressed WAV files in Ableton and MATLAB.

“Before”: snippet of the raw (noisy) recording (from earlier)
“After”: snippet after processing to remove the “power hum”

Suppressing the “tape hiss”

The simplest approach to suppress the remaining tape hiss (now that the hum has been successfully removed) is to implement digital filtering to target the frequencies where the noise is most apparent to human hearing. In future I may experiment with more sophisticated techniques (e.g., STFT-thresholding, wavelet-transform-thresholding, Deep Learning, adaptive filtering, etc).

But for now, my approach is to design a digital filter with the aim of suppressing the noise (as perceived by a human listener) as far as possible without adversely affecting the music to a significant extent. There will inevitably be a trade-off between these competing goals.

I could continue with Ableton’s built-in filters to experiment with filter design, but for demonstration purposes I’ll switch over to MATLAB which has an extensive library of digital filter design algorithms (via the Signal Processing Toolbox and the DSP System Toolbox) which can be brought to bear. Additionally, the Audio Toolbox has real-time audio streaming capabilities which enable the algorithm-under-test to be inserted in a real-time stream to/from audio files or devices or both.

After some trial-and-error, I settled on a high-frequency band-stop filter. Moreover, I selected an algorithm which happens to be provided as one of the out-of-the-box plugin examples (namely, the “Shelving Equalizer”) bundled with the MATLAB Audio Toolbox in order to demonstrate those capabilities.

Figure 4 contains a screenshot of the Shelving Equalizer loaded into a MATLAB audioTestBench which I’ve configured to stream data from a source audio file, through the filters, and out to the audio interface (in this case, a Focusrite Scarlett 2i4 soundcard with ASIO drivers). I manually adjusted the filter parameters by trial-and-error on-the-fly whilst listening to the processed audio in real-time. Note that the low-frequency filter is disabled (by setting its gain to 0 dB).

Figure 4: The audioTestBench utility from the MATLAB Audio Toolbox configured with the Shelving Equalizer with its parameters tuned to suppress the high frequency “tape hiss” (the low-frequency filter is disabled).
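To give a feel for what a high-frequency shelving cut does, here is a sketch of a second-order high-shelf filter in base MATLAB. It uses the widely published "audio EQ cookbook" biquad formulas, which may differ from the design inside the bundled Shelving Equalizer plugin, and the cutoff and gain are hypothetical values rather than the settings shown in Figure 4.

```matlab
% Sketch: high-shelf biquad that attenuates the top of the spectrum.
fs = 44100;
fc = 6000;                        % hypothetical shelf corner frequency (Hz)
gainDb = -12;                     % hypothetical shelf gain (dB)

A = 10^(gainDb/40);
w0 = 2*pi*fc/fs;
cw = cos(w0);
alpha = sin(w0)/2 * sqrt(2);      % shelf slope parameter S = 1
k = 2*sqrt(A)*alpha;

b0 = A*((A + 1) + (A - 1)*cw + k);
b1 = -2*A*((A - 1) + (A + 1)*cw);
b2 = A*((A + 1) + (A - 1)*cw - k);
a0 = (A + 1) - (A - 1)*cw + k;
a1 = 2*((A - 1) - (A + 1)*cw);
a2 = (A + 1) - (A - 1)*cw - k;
b = [b0, b1, b2]/a0;
a = [a0, a1, a2]/a0;

% Gain at DC (z = 1) and at the Nyquist frequency (z = -1).
gainAtDcDb = 20*log10(abs(sum(b)/sum(a)));
gainAtNyquistDb = 20*log10(abs((b(1) - b(2) + b(3))/(a(1) - a(2) + a(3))));
fprintf('Gain at DC: %.2f dB, at Nyquist: %.2f dB\n', gainAtDcDb, gainAtNyquistDb);

% Apply to a stereo signal (filter works down each column independently).
x = randn(fs, 2);
y = filter(b, a, x);
```

Low frequencies pass unchanged while the top of the band is pulled down by the shelf gain, which is why the approach tames hiss but also dulls cymbals and other high-frequency content.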

Below are the “before” and “after” audio clips (in this case, “before” is not the original raw file, but rather the file with the hum removed from the previous step in the process). As can be heard, the filtering is effective at removing the high-frequency “tape hiss” (again, with these mp3 snippets, you may have to turn the volume up to hear the difference). There is nevertheless some noise remaining in the mid-frequency range which I was not able to filter out without adversely affecting the music.

“Before”: snippet with “power hum” removed (from earlier)
“After”: snippet after further processing to remove the high-frequency “tape hiss”

One-click plugin

A very useful feature of the MATLAB Audio Toolbox is the ability to create a VST plugin from an algorithm prototyped in MATLAB, by clicking a single button. For example, I converted the Shelving Equalizer into a VST plugin by clicking the “generate VST Plugin” button located on the audioTestBench graphical-user-interface. By copying the resulting dll into Ableton’s plugin folder, the Shelving Equalizer becomes available from within Ableton Live, as illustrated in the screenshot in Figure 5 below. This allowed me to process the “tape hiss” via the MATLAB filter design, without having to bring the audio tracks out of Ableton. A considerable convenience.

Figure 5: Shelving Filter designed in MATLAB (see Figure 4), then converted to a VST plugin (via one mouse-click in the MATLAB audioTestBench), and imported to Ableton Live.

Noise Gate

Being a recording of a radio show, there are many quiet intervals between songs (e.g., when the band is introducing the next song, or the DJ is chatting, etc). It is during these lulls that the (remaining) noise is most apparent — and distracting. A simple technique to minimise this distraction is to use a Noise Gate to cut-out the audio when the volume falls below a given threshold. Then, when the music volume increases to performance levels, the music effectively masks the noise. This is a handy consequence of psychoacoustics: even though the noise is still there, we don’t perceive it to be at the same distracting level as we do during the lulls in the music.

Rather than simply deploying a noise gate, we can utilise a clever trick as described in this article. The trick is summarised as follows: (i) make a duplicate of the original noisy track, and keep the original aside for the moment; (ii) reverse the phase of each channel in the duplicate (i.e., multiply the amplitude of every sample by -1). Now, when played together, (i)+(ii) results in complete cancellation and total silence. That’s okay; (iii) pass the phase-reversed channel from (ii) through an inverted noise gate with its upper-and-lower thresholds configured such that only the noise passes through when the music volume is low, and nothing passes through when the music volume increases; (iv) play the original noisy track (i) together with the inverse-gated phase-reversed track (iii). The end result is complete silence during the lulls in the music. Away from the lulls, when the music is playing, the noise is still present, but the distracting noise at low music volumes is completely eliminated, giving the overall impression that the noise has been removed throughout (even though it actually hasn’t). This approach is a simplistic implementation of the technique of active noise cancellation (insofar as it utilises destructive interference of the noise waveform, albeit on the noise-only segments of the track, though without a separate noise measurement and adaptive filtering continually correcting the entire track).

Figure 6 contains a screenshot of Ableton’s built-in phase-reverser and inverted noise gate where the respective parameters have been specifically tuned (by trial-and-error) to implement (iii) on the noisy music recording in question.

Figure 6: Ableton Live’s phase reverser and noise gate (with “flip” enabled to invert the gate’s behaviour), with the thresholds tuned to allow only the noise to pass through the gate. When the music level rises, nothing passes through. The phase-reversed gated signal is added to the non-gated original phase signal such that the noise is totally cancelled at low levels e.g., in the lulls between songs.
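The phase-reversal trick, steps (i) to (iv), can be mimicked offline in a few lines of MATLAB. This is a simplified sketch, not a model of Ableton's gate: it uses a hard on/off decision from a 20 ms RMS envelope with no attack, hold or release, and the test signal and threshold are invented for illustration.

```matlab
% Sketch: cancel the noise in the lulls using an inverted, gated copy.
rng(0);
fs = 44100;
t = (0:2*fs-1)'/fs;

% Stand-in track: one second of noise only, then a loud tone plus noise.
original = 0.5*sin(2*pi*440*t).*(t >= 1) + 0.005*randn(size(t));   % (i)
inverted = -original;                                              % (ii)

% Inverted gate: open only while the short-term level is below the threshold.
envelope = sqrt(movmean(original.^2, round(0.02*fs)));
threshold = 0.05;                      % hypothetical gate threshold
gateOpen = envelope < threshold;
cancelTrack = inverted .* gateOpen;                                % (iii)

result = original + cancelTrack;                                   % (iv)

lull = 1:round(0.9*fs);
loud = round(1.1*fs):numel(t);
fprintf('Peak level in lull: %g, samples changed in loud part: %d\n', ...
    max(abs(result(lull))), nnz(result(loud) ~= original(loud)));
```

In the lull the two tracks cancel sample-for-sample, while in the loud section the gate stays shut and the original passes through untouched, noise included.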

Additional tweaks to the mix

Before applying the noise removal process, I reduced the overall dynamic range of the entire track by passing it through a compressor to suppress the peaks. Figure 7 contains a screenshot of the built-in Ableton compressor with appropriate settings for The Havering track (adjusted by trial-and-error).

Figure 7: Ableton Live’s built-in compressor applied to reduce the dynamic range of the original file before application of the de-noising algorithms.

I then applied the aforementioned de-noising processes, after which the resulting track seemed a little “lacking in body” compared with the original. To bring it back to life, I deployed a penultimate stage of filtering (equalisation): specifically, utilising Ableton’s built-in equaliser with its “Dance Master” configuration preset, inserted before the MATLAB-based Shelving Equalizer, as shown in the screenshot below in Figure 8. I also adjusted the overall gain of the final mix to maximise the available volume.

Figure 8: Equalization applied via Ableton Live’s built-in EQ to “revive the body” of the de-noised audio before application of the MATLAB-based Shelving Equalizer.

The final result

Original mp3 noisy recording of the entire radio show (approximately 24 minutes runtime)
Processed mp3 recording of the entire radio show after all stages of restoration have been applied (approximately 21 minutes runtime since silent lulls between songs have been removed)

In my opinion, comparing the noisy track with the cleaned-up track, the restoration has been a success. But it is subjective, so judge for yourself.

Here is the cleaned-up recording on Bandcamp where you can retrieve it in uncompressed FLAC format (better quality than mp3).

Conclusions

Basic digital filtering techniques have been shown to be somewhat effective for removing noise from an mp3 file of a live music recording transcribed from an old cassette tape, with minimal perceptible distortion of the underlying music signal.

The use of a digital audio workstation (e.g., Ableton Live) plus MATLAB is found to be a powerful combination in terms of extensive algorithmic capabilities and ease-of-workflow.

The ability to effortlessly create a VST Plugin from within the MATLAB Audio Toolbox is remarkable and very useful.

All of the mp3 audio snippets presented in this post were created using the MATLAB audiowrite function which supports such export. Another considerable convenience. By contrast, Ableton Live (at least Version 9 which I’m using) does not support mp3 export (!)

It would be interesting to compare the simple approach presented here with more advanced noise-processing techniques (as alluded to earlier), and with commercially-available 3rd party de-noising plugins (such as the much-acclaimed RX 7).

Footnote

After the concert, the organisers (Amnesty International) sent us a letter thanking us. Here is the letter. It was nice to receive it. The closing sentence makes mention of the very cassette tape used in this restoration project.

All audio content presented in this post is copyright The Havering 1989–2020, all rights reserved.

Audio, Music

Drummer Found

Last night I re-connected via WhatsApp with Walt Fulde, the drummer from The Havering. Walt happened across The Havering page on Bandcamp, as per my previous post.

Fortunately, Walt has previous form when it comes to responding to bulletin boards. Below are the original posters we displayed across the Stanford Campus when seeking a drummer almost thirty years ago. Walt responded, completing our four-piece. Zooming forward three decades, we will now embark on some transatlantic musical collaborations. Stay tuned.

Audio, Music

The Havering

I came across some old recordings of The Havering, a band I played in whilst at Stanford University in California. We were active from 1989 to 1991. Perhaps it could be claimed that the songs have aged reasonably well, or then again, maybe they haven’t? Judge for yourself, if you so wish. All songs can be streamed from here (via Bandcamp, a well-organised resource for musicians; it took me only a couple of hours to post the songs up there today).
