Capturing voice for VOIP + Wwise setup

Hi,

I’m working on a VOIP system that feeds captured voice through Wwise so I can use its spatial audio features. I’m on Unreal Engine 4.25.
I’ve been following the Wwise Unreal integration documentation on how to set this up, but either that documentation is outdated or there is an issue with my setup.

As it stands, I’m using IVoiceCapture to get the voice, but I can’t get any data out of it with the following code (it runs inside my custom audio capture component’s tick):


// Ask the capture device how many bytes are ready to be read.
uint32 numOfAvailableVoiceCaptureBytes = 0;
EVoiceCaptureState::Type captureState = m_voiceCapture->GetCaptureState(numOfAvailableVoiceCaptureBytes);

// Debug output: capture flag and current input amplitude.
if (m_voiceCapture->IsCapturing())
{
    UE_LOG(LogAudioInputTest, Verbose, TEXT("Is currently capturing: TRUE"));
}
else
{
    UE_LOG(LogAudioInputTest, Verbose, TEXT("Is currently capturing: FALSE"));
}
UE_LOG(LogAudioInputTest, Verbose, TEXT("Current capture amplitude: %f"), m_voiceCapture->GetCurrentAmplitude());

// Nothing to read this tick.
if (captureState != EVoiceCaptureState::Ok || numOfAvailableVoiceCaptureBytes == 0)
{
    return;
}

// Copy the available bytes into a local buffer.
uint32 numOfVoiceCaptureBytesReturned = 0;
uint64 sampleCounter = 0;
TArray<uint8> incomingRawVoiceData;
incomingRawVoiceData.AddDefaulted(numOfAvailableVoiceCaptureBytes);

m_voiceCapture->GetVoiceData(incomingRawVoiceData.GetData(), numOfAvailableVoiceCaptureBytes,
    numOfVoiceCaptureBytesReturned, sampleCounter);

if (numOfVoiceCaptureBytesReturned == 0)
{
    return;
}

GetCaptureState always returns no data, and IsCapturing is true only during the first tick after I enable voice capture (I have it bound to a key). The current amplitude is always logged as 0. No matter what I try or how I make sound in front of my mic, I never get any input.
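
Since the logged amplitude is always 0, an independent check on the raw bytes might help once any data does arrive. What follows is only a minimal sketch: it assumes the buffer holds 16-bit signed PCM in native byte order, which I have not confirmed for IVoiceCapture, and `BytesToFloatSamples` and `PeakAmplitude` are hypothetical helper names, not engine functions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Converts raw capture bytes to normalized floats in [-1, 1).
// Assumption: the bytes are 16-bit signed PCM in native byte order.
std::vector<float> BytesToFloatSamples(const std::uint8_t* data, std::size_t numBytes)
{
    // A trailing odd byte, if any, is ignored.
    const std::size_t numSamples = numBytes / sizeof(std::int16_t);
    std::vector<float> samples(numSamples);
    for (std::size_t i = 0; i < numSamples; ++i)
    {
        std::int16_t s = 0;
        // memcpy avoids unaligned reads from the byte buffer.
        std::memcpy(&s, data + i * sizeof(std::int16_t), sizeof(s));
        samples[i] = static_cast<float>(s) / 32768.0f;
    }
    return samples;
}

// Returns the largest absolute sample value, or 0 for an empty buffer.
float PeakAmplitude(const std::vector<float>& samples)
{
    float peak = 0.0f;
    for (float s : samples)
    {
        peak = std::max(peak, std::fabs(s));
    }
    return peak;
}
```

Logging the peak of each returned buffer would show whether the bytes themselves are silent or whether only the reported amplitude is stuck at 0.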

I tried setting the voice system’s console variables to make the noise and silence attack times longer (3 seconds), but it didn’t change anything. So I’m kind of stuck on this.

So my questions are:

  1. Is IVoiceCapture still usable with 4.25?
  2. Is there another solution I should use instead? My goal is only to capture the input buffer, send it for replication, and feed it to Wwise once it is received on the clients.
  3. If IVoiceCapture is still the right tool for this, are there any tests or debug features I could run to help me track down the issue?
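
To make the goal in question 2 concrete, the receiving side could be sketched as below. This is only an illustration under assumptions: the replicated data is taken to be already converted to float samples, the Wwise side is assumed to pull fixed-size blocks from a callback, and `SampleFifo` is a hypothetical class, not part of Unreal or Wwise.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical hand-off buffer between the network side and the audio side.
class SampleFifo
{
public:
    // Called when a replicated voice packet arrives.
    void Push(const float* samples, std::size_t count)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_samples.insert(m_samples.end(), samples, samples + count);
    }

    // Called from the audio side. Always writes `count` samples,
    // padding with silence on underrun. Returns how many were real data.
    std::size_t Pop(float* out, std::size_t count)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        const std::size_t available = std::min(count, m_samples.size());
        const auto end = m_samples.begin() + static_cast<std::ptrdiff_t>(available);
        std::copy(m_samples.begin(), end, out);
        m_samples.erase(m_samples.begin(), end);
        std::fill(out + available, out + count, 0.0f);
        return available;
    }

private:
    std::mutex m_mutex;
    std::deque<float> m_samples;
};
```

Taking a mutex on the audio side is not real-time safe, so a lock-free ring buffer would probably be the better choice in practice. The sketch is only meant to show the data flow I have in mind.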

I would really appreciate any input on this. I have quite a lot of experience with C++ programming, including audio programming, but this is the first time I’ve had to deal with a voice API and I’m a bit lost.

Cheers,
Adrian