Switching from XAudio2 to AudioMixer broke our custom left/right panning for voice over IP in USoundWave::GeneratePCMData

Our game has a voice over IP implementation, with a voice chat feature that lets players pan other players’ voices left, right, or anywhere in between depending on user settings. That feature no longer works after switching to AudioMixer: there is no hard left/right panning anymore.

The way we used to pan the voice is:

  • We have a custom class VoiceOutput derived from USoundWave
  • We override USoundWave::GeneratePCMData in VoiceOutput
  • Inside GeneratePCMData, we manually write the sample values to the output buffer, multiplying each value by a left or right channel multiplier depending on where the voice should come from. So if we pan all the way to the left, the left multiplier is 1 and the right multiplier is 0 (see the sketch of this gain mapping right after this list).
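
To make the mapping concrete, here is a minimal sketch of the gain computation (ComputePanGains is just an illustrative name; in the real code the math is inlined, as in the snippet further down):

```cpp
// Linear pan law: StereoBias is in [-1, 1].
//  +1 -> full left  (Left = 1, Right = 0)
//  -1 -> full right (Left = 0, Right = 1)
//   0 -> centered   (Left = 1, Right = 1)
void ComputePanGains(float StereoBias, float& OutLeftMultiply, float& OutRightMultiply)
{
    OutLeftMultiply = FMath::Min(StereoBias + 1.f, 1.f);
    OutRightMultiply = FMath::Min(-StereoBias + 1.f, 1.f);
}
```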

All the multipliers are correct, and the PCM data is multiplied correctly, assuming the samples are interleaved as RLRLRL.

My main questions are:

  • Did anything change in how we generate and deal with PCM data between XAudio2 and AudioMixer? We are a little suspicious that we are multiplying the wrong data by the left and right multipliers (maybe the channels are no longer interleaved as RLRLRL?), but it’s hard to tell. (See the diagnostic sketch right after this list.)
  • Does PCM data get overridden somewhere later in the engine code after we call USoundWave::GeneratePCMData? If you have any insights into the PCM data lifecycle before/after it’s generated, that would be very helpful.
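
As a sanity check on the channel-order question, we have been considering a diagnostic along these lines (WriteChannelOrderTestTone is a made-up helper name, and it assumes the mixer consumes interleaved 16-bit stereo): write a tone into only the first slot of each interleaved frame and listen for which ear it comes out of.

```cpp
// Diagnostic sketch: fill only the FIRST slot of each interleaved stereo frame
// with a 440 Hz tone and leave the second slot silent. If the tone plays in the
// left ear, the layout is LRLR; if it plays in the right ear, it is RLRL.
void WriteChannelOrderTestTone(int16* PCMData, int32 NumFrames, int32 SampleRate)
{
    for (int32 Frame = 0; Frame < NumFrames; ++Frame)
    {
        const float Tone = FMath::Sin(2.f * PI * 440.f * (float)Frame / (float)SampleRate);
        PCMData[Frame * 2 + 0] = (int16)(Tone * 32767.f); // first interleaved slot
        PCMData[Frame * 2 + 1] = 0;                       // second interleaved slot
    }
}
```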

I’ve provided a truncated version of the code in case it’s helpful; this is the version that worked with XAudio2 but doesn’t work with AudioMixer:

```cpp
int32 UVoiceOutput::GeneratePCMData(uint8* PCMData, const int32 SamplesNeeded)
{
    // Number of samples we "consumed" from the audio buffer
    int32 OurSamplesConsumed = 0;
    // Number of samples we actually wrote to PCMData
    int32 OutputSamplesWritten = 0;

    if (NumChannels == 2)
    {
        // StereoBias is a value between -1 and 1. It's set outside of this class depending on user settings.
        float RightMultiply = FMath::Min(-StereoBias + 1.f, 1.f);
        float LeftMultiply = FMath::Min(StereoBias + 1.f, 1.f);

        // Code above (not pictured) acquires multiple buffers of audio and puts them in OutputBufferQueue.
        // We go through most of them in this while loop until we reach the number of audio samples needed.
        while (BuffersUsed < OutputBufferQueue.Num() && OutputSamplesWritten < SamplesNeeded)
        {
            TSharedPtr<TArray<float>> CurrentBuffer = OutputBufferQueue[BuffersUsed];
            // The pointer into PCMData. Must be updated since we sequentially copy float buffers.
            int16* PCMDataWithOffset = ((int16*)PCMData) + OutputSamplesWritten;

            // The number of audio samples we should consume from CurrentBuffer
            const int32 NumOurSamplesToConsume = FMath::Min((SamplesNeeded / 2) - OurSamplesConsumed, CurrentBuffer->Num());

            // The number of samples we ACTUALLY create.
            // Should be twice NumOurSamplesToConsume, since this covers both the left and right channels.
            const int32 NumOutputSamplesToCreate = NumOurSamplesToConsume * 2;

            // Copy data from our buffers to PCM with the panning applied
            int32 OutputBufferIndex = 0; // tracks our write position within this chunk of PCMData
            for (int32 SampleIndex = 0; SampleIndex < NumOurSamplesToConsume; ++SampleIndex)
            {
                // Copy the data into the output buffer for both channels.
                // ConvertFloatToPCM just multiplies the value by 32768 and clamps it to the int16 range [-32768, 32767].
                PCMDataWithOffset[OutputBufferIndex++] = ConvertFloatToPCM(CurrentBuffer->GetData()[SampleIndex] * RightMultiply); // Right stereo write
                PCMDataWithOffset[OutputBufferIndex++] = ConvertFloatToPCM(CurrentBuffer->GetData()[SampleIndex] * LeftMultiply);  // Left stereo write
            }

            OurSamplesConsumed += NumOurSamplesToConsume;     // The number of samples we have consumed from the buffer
            OutputSamplesWritten += NumOutputSamplesToCreate; // The number of output samples we have actually written

            // code to remove processed stuff from the buffer queue goes here, at the end of the while loop
        }
    }

    // GeneratePCMData is expected to return the number of bytes written
    return OutputSamplesWritten * sizeof(int16);
}
```

Any insights are appreciated, thank you!

It looks like the audio you are generating is correct if your goal is to hard pan. You may be running into an issue around spatialization. There is a flag on the attenuation settings for controlling whether the audio is spatialized or not. Since you are doing your own spatialization, make sure that it is unchecked.

The AudioComponent allows for an attenuation settings override, but there are also attenuation settings on the sound asset itself. I’m assuming that `UVoiceOutput` derives from `USoundWaveProcedural`. On that asset you’ll want to set `bSpatialize` to false.
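
For reference, here is a minimal sketch of the two places that setting lives (this assumes UE4’s `USoundAttenuation`/`FSoundAttenuationSettings` layout, and `VoiceSoundWave`/`AudioComponent` are placeholders for your own objects; member names may vary slightly between engine versions):

```cpp
// Sketch: force spatialization off via an attenuation asset.
USoundAttenuation* AttenuationAsset = NewObject<USoundAttenuation>();
AttenuationAsset->Attenuation.bSpatialize = false;

// Option 1: assign it on the sound asset itself
VoiceSoundWave->AttenuationSettings = AttenuationAsset;

// Option 2: override it on the component that plays the sound
AudioComponent->AttenuationSettings = AttenuationAsset;
```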

Let me know if that doesn’t fix the issue. `GeneratePCMAudio` is an old API, and there’s a possibility that its functionality has regressed.

Hi Phil, thank you for your suggestion. It took our team a while to come back to this problem and test it out.

Our goal is indeed to hard pan. We found that we already turn off spatialization like this:

```cpp
AudioComponent = AudioDevice->CreateComponent(…);
AudioComponent->bAllowSpatialization = false;
```

and we don’t create separate attenuation settings that would have the Spatialize checkbox enabled in this case.

Is this the correct setting to turn off, or is there something else that sounds fall back on when we don’t have separate attenuation settings?

Our team is also testing your solution of creating attenuation settings and turning off spatialization there, but we were just suspicious because the code was already trying to do what you suggested.

Will do, thanks!

Just an update: it did not work. Two things of note:

  1. UVoiceOutput is inheriting from USoundWave, not USoundWaveProcedural
  2. Setting bSpatialize to false on UVoiceOutput did not work

Not sure if it matters that we are inheriting from USoundWave, given that `GeneratePCMAudio` is present on both?

`OnGeneratePCMAudio` is first defined on the `USoundWaveProcedural` class. At first I thought you were using that mechanism to generate audio. It sounds like you have a custom implementation around `UVoiceOutput::GeneratePCMAudio`. How is that method called, and how is its output ultimately fed to the rendering engine?
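
For reference, the `USoundWaveProcedural` route I had in mind looks roughly like the sketch below (`UVoiceOutputProcedural` is a placeholder name, and the exact `OnGeneratePCMAudio` signature should be verified against SoundWaveProcedural.h in your engine version):

```cpp
// Sketch: generate audio through USoundWaveProcedural instead of overriding
// USoundWave::GeneratePCMData directly. The audio renderer calls
// OnGeneratePCMAudio when it needs more samples, so the left/right
// multipliers could be applied per interleaved frame right here.
UCLASS()
class UVoiceOutputProcedural : public USoundWaveProcedural
{
    GENERATED_BODY()

protected:
    virtual int32 OnGeneratePCMAudio(TArray<uint8>& OutAudio, int32 NumSamples) override
    {
        OutAudio.SetNumUninitialized(NumSamples * sizeof(int16));
        int16* OutPCM = (int16*)OutAudio.GetData();

        // ... fill OutPCM with NumSamples interleaved int16 samples,
        // applying the left/right pan multipliers per frame ...

        return NumSamples;
    }
};
```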