Real-time Onset Detection using Spectral Difference and Linear Prediction.

Real-time Onset Detection in Unreal Engine 4!
My algorithm uses fast Fourier transforms (FFTs) to analyse the frequency spectrum of the input audio signal. It detects so-called ‘onsets’, which are somewhat similar to the ‘beat’ you clap or dance to when you hear music.
The first part of the video shows a simple kick drum test, the second part a fairly complex track:

Orange Graph: Onsets -> x-axis: audio position, y-axis: onset amplitude;
Green Graph: Threshold -> x-axis: audio position, y-axis: threshold amplitude.

The algorithm calculates the spectral difference between the frequency-bin spectra of successive audio buffer frames of the incoming real-valued signal, applies linear prediction to estimate the next, not-yet-buffered value and so increase onset precision, and finally uses a weighted threshold to decide whether there was an onset in the frame currently being processed.
The computations are fairly expensive for real-time use, but it runs pretty much in sync with the playing audio.
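
To make the three steps a bit more concrete, here is a minimal CPU-only sketch in C++. It is not the project's actual code: the function names, the half-wave rectification, the fixed two-tap predictor (a straight-line extrapolation standing in for properly fitted prediction coefficients) and the median/mean weighting of the threshold are all assumptions chosen for illustration. It expects magnitude spectra that were already computed by an FFT.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Spectral difference between two magnitude spectra: sum of the squared,
// half-wave rectified per-bin increases (only rising energy counts).
double SpectralDifference(const std::vector<double>& prev,
                          const std::vector<double>& cur) {
    assert(prev.size() == cur.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < cur.size(); ++k) {
        const double d = cur[k] - prev[k];
        if (d > 0.0) sum += d * d;
    }
    return sum;
}

// Assumed stand-in for linear prediction: a fixed two-tap predictor that
// extrapolates the next value on a straight line through the last two values.
double PredictNext(const std::vector<double>& history) {
    assert(history.size() >= 2);
    const std::size_t n = history.size();
    return 2.0 * history[n - 1] - history[n - 2];
}

// Assumed weighting: a weighted sum of the median and the mean of recent values.
// The vector is taken by value because nth_element reorders it.
double WeightedThreshold(std::vector<double> recent,
                         double medianWeight, double meanWeight) {
    assert(!recent.empty());
    const double mean = std::accumulate(recent.begin(), recent.end(), 0.0) /
                        static_cast<double>(recent.size());
    auto mid = recent.begin() + static_cast<std::ptrdiff_t>(recent.size() / 2);
    std::nth_element(recent.begin(), mid, recent.end());
    return medianWeight * (*mid) + meanWeight * mean;
}

// Flags an onset when the new spectral difference value deviates from the
// predicted value by more than the weighted threshold of the recent history.
bool IsOnset(const std::vector<double>& history, double current,
             double medianWeight, double meanWeight) {
    const double deviation = std::abs(current - PredictNext(history));
    return deviation > WeightedThreshold(history, medianWeight, meanWeight);
}
```

Per frame you would call `SpectralDifference` on the previous and current spectrum, pass the result to `IsOnset` together with the last few values, and then append it to the history. The two weights are tuning parameters and would need adjusting per material.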

Next step is to use those onsets to drive materials and particle systems!

Music (if interested) was created by myself:
LauchBeats - Anomaly

(Stream LauchBeats music | Listen to songs, albums, playlists for free on SoundCloud)

Hey people,

here is an update of my real-time Onset Detection project.

It is now GPU accelerated! I’m using Nvidia’s CUDA and its cuFFT library to calculate my STFTs (short-time Fourier transforms).

I tested it with a window size of 4096 samples and a hop size to the next window of 512 samples. So the windows overlap by 87.5%, which might be too much for good results, but this was for performance testing :smiley:

Since the sample rate of the audio signal was 44100 Hz or S/s, the algorithm processes

44100 / 512 * 4096 = 352800 S/s (samples per second) !
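
For anyone who wants to play with other window and hop sizes, the same arithmetic can be written as two small C++ helpers. The function names are made up for this sketch; the numbers in the usage note are the ones from this post.

```cpp
#include <cassert>

// Fraction of each window that is shared with the next window.
double OverlapFraction(int windowSize, int hopSize) {
    assert(windowSize > 0 && hopSize > 0 && hopSize <= windowSize);
    return 1.0 - static_cast<double>(hopSize) / windowSize;
}

// Samples pushed through the FFT per second of audio: one full window
// is transformed for every hop.
double ProcessedSamplesPerSecond(double sampleRate, int windowSize, int hopSize) {
    assert(sampleRate > 0.0 && windowSize > 0 && hopSize > 0);
    return sampleRate / hopSize * windowSize;
}
```

With `OverlapFraction(4096, 512)` you get 0.875 (87.5%), and `ProcessedSamplesPerSecond(44100.0, 4096, 512)` gives 352800, i.e. eight times the raw sample rate because each sample ends up in eight overlapping windows.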

All this at a frame rate of ~95 frames per second!

And I added some debug features.

#OnsetDetection #Unreal #Nvidia #CUDA #GPGPU #FFT #hashtag

Here it is, a visualization example of my real-time onset detection/sound analysis project.
I brought some visual feedback and brutality to my project
:smiley:
I’m using spectral difference and linear prediction to find so-called onsets in the audio input.
FFTs to calculate the frequency spectra are GPU accelerated using Nvidia’s CUDA.
Frequencies are mapped to a color range and react to onsets, different frequency bands and amplitudes.
All procedural, all math and all real-time.
Animations are material driven, using world position offset etc.
This is only a small visualization, considering all the stuff that happens in the background (code-wise)…
You might see some small delays, those are due to video encoding and not the application.
Music:
UKF Dubstep Tutorial

I’m working on something like this also and would be interested to see your implementation!

Interesting, how does your implementation differ from kwstasg/WAC’s solution?

Windows Audio Capture

Pretty sure he uses the same plugin that an Epic employee developed and that eXi made usable. However, both of these don’t work right and have a lot of flaws. I started with them as a foundation, rewrote it and fixed all the issues. In addition, as stated above, I use linear prediction and spectral difference to find onsets in the song. So this is not only converting audio from a time signal to a frequency signal but also applying recent detection algorithms. Lastly, I use CUDA to accelerate my FFT calculations.

Hey can you give any tips on how you fixed those other plugins, specifically using the live audio capture input? The Windows Live Audio Capture creates noise/errors randomly. I think I’ve narrowed it down to the sample pointer not always being aligned.

Specifically this part:



// Take the next Sample - Halfed Causes SamplePtr Exception
if (SamplePtr != NULL && SamplePtr != nullptr && (SampleIndex + FirstSample < SampleCount /2))

As you can see, the comment left there indicates an access violation crash if you try to use all of the sample pointer/audio chunks.
But using the plugin as is, the output randomly turns to garbage. I’ve messed around with it enough to get to a point where it’s stable for over a minute before eventually the sample pointer causes a crash.
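
One pattern that can produce exactly this kind of halved bound, offered purely as an assumption and not as a diagnosis of the plugin, is a buffer length given in bytes being used as a count of 16-bit samples. The C++ sketch below uses a hypothetical helper, `Pcm16ToFloat`, that derives the sample count from the byte count and copies each sample with `std::memcpy`, so it does not depend on the buffer being aligned. It assumes 16-bit PCM in the host's byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: converts raw 16-bit PCM bytes (host byte order assumed)
// to floats in [-1, 1). A trailing odd byte is ignored.
std::vector<float> Pcm16ToFloat(const std::uint8_t* bytes, std::size_t byteCount) {
    assert(bytes != nullptr || byteCount == 0);
    // The loop bound is in samples, derived from the byte count.
    const std::size_t sampleCount = byteCount / sizeof(std::int16_t);
    std::vector<float> out(sampleCount);
    for (std::size_t i = 0; i < sampleCount; ++i) {
        std::int16_t sample = 0;
        // memcpy avoids reading through a possibly misaligned int16_t pointer.
        std::memcpy(&sample, bytes + i * sizeof(std::int16_t), sizeof(sample));
        out[i] = static_cast<float>(sample) / 32768.0f;
    }
    return out;
}
```

If the capture format is actually float or has more than one channel, the element size and the indexing would have to change accordingly.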

Any tips on what you fixed, or maybe even a GitHub link to the first demo you made without CUDA, if you still have it? I don’t need the beat detection and I don’t have CUDA anyway. I’ve just been trying to fix this for a while now.

Edit: Well, I at least fixed the bigger issue with noise/garbage and the output is generally clean (but low resolution). I would still love to hear any general tips or code you’re willing to share!

Wow, can you share your project (or plugin) file for this? :slight_smile:

I tried all the other plugins and they’re all outdated or cause the game to crash.

Hey, man! Amazing plugin, is there any way to check it out? Do you have some kind of demo so we can see how it works in our project? Or do you maybe sell it somewhere? Thanks in advance!
