EDIT 2:
I realized there's another bit of complexity I didn't explain, which also results from the switch to calling the GeneratePCMData function from the audio hardware thread. Since the call is driven by an XAudio2 OnBufferEnd callback, if no buffer is submitted to the XAudio2 voice, no more OnBufferEnd callbacks will fire, because the voice no longer has any enqueued buffers. This means the procedural sound wave will just mysteriously fall silent. Therefore, the old paradigm of returning no audio when none is available won't work. That is exactly what many overrides of GeneratePCMData were doing in certain use-cases, in particular VOIP implementations or other things that depend on systems that may or may not have audio to generate. My base-class implementation of USoundWaveProcedural attempts to handle that case and will always return audio buffers, even if no audio has been queued. It also attempts to wait until a certain amount of audio has accumulated (to "build up" audio buffers) before starting. This is to support streaming systems or VOIP streams that may not have enough audio ready at first. The amount to wait before feeding audio out is configurable by NumSamplesToGeneratePerCallback. This also determines the general cost of the procedural sound wave: a larger NumSamplesToGeneratePerCallback will reduce CPU cost but increase latency (for real-time synthesis that gets parameter data from the game thread, this means your synthesizer will respond more slowly to parameter changes, etc). Also, the larger the NumSamplesToGeneratePerCallback, the fewer OnSoundWaveProceduralUnderflow delegate callbacks will be made per GeneratePCMData callback.
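To make the "always return audio" behavior concrete, here's a minimal, self-contained sketch of the idea (not actual engine code; the class, the NumSamplesToStart threshold name, and the int16 mono format are my assumptions for illustration). The key point is that GeneratePCMData always fills a full-size buffer, padding with silence on underflow, so the XAudio2 voice always has a buffer enqueued and keeps firing OnBufferEnd:

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>
#include <vector>

// Hedged sketch: a simplified stand-in for USoundWaveProcedural's
// underflow-tolerant generation logic. Engine types, thread safety,
// and the real buffer-queue implementation are omitted.
class ProceduralBufferModel
{
public:
    int32_t NumSamplesToGeneratePerCallback = 1024;
    // Hypothetical "build up" threshold: don't start consuming queued
    // audio until this many samples have accumulated, so streaming/VOIP
    // sources get a head start.
    int32_t NumSamplesToStart = 2048;

    void QueueAudio(const int16_t* Data, int32_t NumSamples)
    {
        Pending.insert(Pending.end(), Data, Data + NumSamples);
    }

    // Always fills OutAudio with NumSamplesToGeneratePerCallback samples.
    // On underflow the remainder is silence, so a buffer is still
    // submitted and OnBufferEnd keeps coming.
    int32_t GeneratePCMData(std::vector<int16_t>& OutAudio)
    {
        OutAudio.assign(NumSamplesToGeneratePerCallback, 0); // silence by default

        if (!bStarted && (int32_t)Pending.size() < NumSamplesToStart)
        {
            return NumSamplesToGeneratePerCallback; // still building up: output silence
        }
        bStarted = true;

        const int32_t NumToCopy = std::min<int32_t>(
            NumSamplesToGeneratePerCallback, (int32_t)Pending.size());
        std::copy(Pending.begin(), Pending.begin() + NumToCopy, OutAudio.begin());
        Pending.erase(Pending.begin(), Pending.begin() + NumToCopy);
        return NumSamplesToGeneratePerCallback;
    }

private:
    std::deque<int16_t> Pending;
    bool bStarted = false;
};
```

The important contract is the return value: it never reports zero samples, which is what would starve the voice of buffers.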
Also, the amount of silent audio written out in the case of a real buffer underrun is configurable with NumBufferUnderrunSamples. This decouples the amount of silent audio written from the number of samples we normally generate per callback (i.e. you may want a smaller underrun buffer so that the XAudio2 voice performs its OnBufferEnd callback sooner for silent buffers than for audio-filled buffers).
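A small sketch of why that decoupling matters (hypothetical helper, not engine code): if the silent underrun buffer is shorter than a normal buffer, it finishes playing sooner, so the voice fires OnBufferEnd and polls for fresh audio again more quickly after an underrun.

```cpp
#include <cstdint>

// Hedged sketch: pick the buffer size to submit to the voice.
// On a real underrun we submit a shorter silent buffer, so the next
// OnBufferEnd (and thus the next chance to pick up queued audio)
// arrives sooner than a full-length buffer would allow.
int32_t ChooseSubmitSize(int32_t NumQueuedSamples,
                         int32_t NumSamplesToGeneratePerCallback,
                         int32_t NumBufferUnderrunSamples)
{
    const bool bUnderrun = (NumQueuedSamples == 0);
    return bUnderrun ? NumBufferUnderrunSamples : NumSamplesToGeneratePerCallback;
}
```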
Hey man, I’ve got some feedback here on NumSamplesToGeneratePerCallback. If I have a sample rate of 44100 and I want to generate 1 second of data, I have to queue audio in chunks the size of NumSamplesToGeneratePerCallback. If I want to set NumSamplesToGeneratePerCallback to 44100, I can’t. There seems to be a hard ceiling at around ~8200 samples. Is this intentional? Because at 8000 samples or so, that’s not enough to hear continuous audio unless I queue it at least once every 1/5 second.
I’m doing some more testing right now, but it’s pretty well confirmed that there is a limit on the value I can set NumSamplesToGeneratePerCallback to.
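For reference, the arithmetic behind the comment above can be sketched as follows (assuming a mono stream, and taking ~8192 as the observed cap, which is my guess at the "~8200" figure):

```cpp
#include <cstdint>

// Rough math: at 44100 Hz, a callback size capped near 8192 samples
// means each buffer covers ~0.19 s of audio, so the game must queue
// audio roughly 5+ times per second to avoid underruns.
double CallbackIntervalSeconds(int32_t NumSamplesPerCallback, int32_t SampleRate)
{
    return (double)NumSamplesPerCallback / (double)SampleRate;
}
```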