I managed to get this all working a while back (Opus on iOS, with audio seeking support) and thought it might be worth posting an update.
I ended up making various changes to allow Opus audio to work via the ReadCompressedData() code path, rather than the StreamCompressedData() code path. The former is referred to in the source as ‘real-time’, and the latter as ‘streaming’. Conceptually, it seemed to me that the former ought to be just fine for Opus playback (including seeking), since in my case all the data is available locally in the app package.
(The ‘streaming’ code path seems to have a separate manager that loads ‘chunks’ of the audio file/stream, where a ‘chunk’ may contain e.g. the header plus a number of compressed audio packets. This may be intended to allow streaming where not all the data is yet available, e.g. when streaming over the network.)
Following are the changes I made, going into a little detail, in case that’s of interest.
Disclaimer: These may not represent the tidiest / most correct approach to someone who’s more familiar with the UE audio code. My aim was to get this working for the project I’m working on, for the platform I’m working with (iOS) - though I did try to avoid making any changes that would cause issues on other platforms.
Firstly, I made changes as per my previous posting so that the cooked audio is output in Opus format, and so that the app links with the Opus decoding library.
- Updated FAudioFormatOpus::GetBitRateFromQuality() to return a sensible bit rate (32000 seems to produce decent enough results)
Updated ReadCompressedData() to pass out the number of samples actually produced, and propagated that up to the calling layers.
- In FDecodeAudioTaskResults, added a NumSamplesWritten member, as is already present in FProceduralAudioTaskResults
- In FMixerSourceBuffer::ProcessRealTimeSource(), updated the EAudioTaskType::Decode case to call AudioData.SetNum() (at least, if NumSamplesWritten has been set - I decided to consider that optional), as is already done in the EAudioTaskType::Procedural case
- In FAsyncDecodeWorker::DoWork(), under the EAudioTaskType::Decode case, updated the second ReadCompressedData() call so that the actual number of decoded samples is passed out, setting DecodeResult.NumSamplesWritten
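To illustrate the 'optional NumSamplesWritten' idea outside of the engine, here is a minimal standalone sketch. The struct and function names are made up for illustration (they are not the UE types), and I'm assuming a sentinel of -1 to mean 'not set', in which case the buffer is left at its full size as before.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the decode task results; not the engine type.
struct DecodeTaskResults {
    // -1 means "not reported": treat the buffer as completely filled.
    int32_t NumSamplesWritten = -1;
};

// Shrinks the decoded buffer to the samples actually produced, if reported.
// This mirrors the idea of calling SetNum() on the audio buffer.
void TrimToSamplesWritten(std::vector<float>& AudioData, const DecodeTaskResults& Result) {
    if (Result.NumSamplesWritten < 0) {
        return; // Nothing reported: keep the previous behaviour (full buffer).
    }
    const size_t Written = static_cast<size_t>(Result.NumSamplesWritten);
    assert(Written <= AudioData.size());
    AudioData.resize(Written);
}
```

Keeping the value optional means any decoder that never sets it carries on working exactly as it did before.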
Updated IStreamedCompressedInfo::ReadCompressedData() so that it only decodes whole Opus packets at a time (a ‘packet’ being a compressed frame of e.g. 60ms worth of samples). Once there’s not enough room left in the destination buffer for the result of decoding the next packet, it will stop (won’t fill that buffer) and pass out the actual number of samples output.
- Added a virtual method to IStreamedCompressedInfo to tell us if this decoder needs to decode whole (compressed) packets at a time, i.e. enable the behaviour mentioned just above. The implementation in FOpusAudioInfo just returns true.
- Call that in ReadCompressedData() - if the result is false, fall back to the previous behaviour within this method. However, if it returns true…
- Keep (existing) RemainingEncodedSrcSize up-to-date during the loop
- If the above has hit 0 (at the top of the loop), we’ve finished playback of this sound. Check for this case in the existing ‘if’ before where it sets bFinished to true and does the ‘if looping’ check.
- Get the maximum number of bytes that decoding an Opus packet will result in (another additional IStreamedCompressedInfo virtual method, implemented in FOpusAudioInfo) - it already has available the number of channels, sample rate and Opus frame size in ms. Beware, OPUS_MAX_FRAME_SIZE_MS was set to 120ms even though elsewhere the Opus frame size is actually configured as 60ms - we certainly want the smaller size.
- If there isn’t enough room in the destination buffer, we’re done for now, so pass out the actual number of decoded samples.
- Call GetFrameSize() (that already includes updating the SrcBufferOffset member)
- Recalculate the source data pointer, i.e. SrcBufferData + SrcBufferOffset, and call Decode(), passing it the result from GetFrameSize(), i.e. the actual size of the compressed Opus ‘packet’.
- An update/fix in the ‘if looping’ check mentioned earlier: If we are looping, set DecodeResult.NumCompressedBytesConsumed to 0 to ensure SrcBufferOffset isn’t incorrectly modified before the next loop iteration.
- Have that also call a new FOpusAudioInfo method that prepares for looping by calling opus_multistream_decoder_ctl(Decoder, OPUS_RESET_STATE);
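The whole-packet loop described above can be sketched roughly as follows. This is a standalone approximation rather than the actual engine code: FakeDecoder, DecodeWholePackets and MaxDecodedBytesPerPacket are hypothetical names, the decoder is a stub that just emits a fixed amount of silence per packet, and I'm assuming 16-bit PCM output and a source layout of a uint16 size (host byte order) followed by the packet bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Worst-case decoded size of one packet, in bytes of interleaved 16-bit PCM.
// FrameMs should be the configured frame size (e.g. 60), not a larger maximum.
size_t MaxDecodedBytesPerPacket(int NumChannels, int SampleRate, int FrameMs) {
    assert(NumChannels > 0 && SampleRate > 0 && FrameMs > 0);
    const size_t FramesPerPacket =
        static_cast<size_t>(SampleRate) * static_cast<size_t>(FrameMs) / 1000;
    return FramesPerPacket * static_cast<size_t>(NumChannels) * sizeof(int16_t);
}

// Hypothetical stub standing in for the real Opus decode call: it ignores the
// input and writes a fixed number of zeroed PCM bytes per packet.
struct FakeDecoder {
    size_t BytesPerPacket;
    size_t Decode(const uint8_t* /*Packet*/, size_t /*PacketSize*/, uint8_t* Dest) const {
        std::memset(Dest, 0, BytesPerPacket);
        return BytesPerPacket;
    }
};

// Decodes whole packets only. Stops when the source is exhausted or when the
// next packet might not fit in the destination. Returns the PCM bytes written
// and advances SrcOffset past every packet that was consumed.
size_t DecodeWholePackets(const std::vector<uint8_t>& Src, size_t& SrcOffset,
                          uint8_t* Dest, size_t DestSize,
                          const FakeDecoder& Decoder, size_t MaxPacketPcmBytes) {
    assert(Decoder.BytesPerPacket <= MaxPacketPcmBytes);
    size_t Written = 0;
    while (SrcOffset + sizeof(uint16_t) <= Src.size()) {
        if (DestSize - Written < MaxPacketPcmBytes) {
            break; // Not enough room for a worst-case packet: done for now.
        }
        uint16_t PacketSize = 0;
        std::memcpy(&PacketSize, Src.data() + SrcOffset, sizeof(PacketSize));
        const size_t PacketStart = SrcOffset + sizeof(uint16_t);
        if (PacketStart + PacketSize > Src.size()) {
            break; // Truncated packet: stop rather than read out of bounds.
        }
        Written += Decoder.Decode(Src.data() + PacketStart, PacketSize, Dest + Written);
        SrcOffset = PacketStart + PacketSize;
    }
    return Written;
}
```

The caller gets back fewer bytes than the buffer size whenever the next packet would not fit, which is why the layers above need the actual sample count passed out to them.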
Updated FOpusAudioInfo::Decode() to always populate all three values in the FDecodeResult struct. Note that Result.NumAudioFramesProduced means the number of samples per channel.
- I always set NumCompressedBytesConsumed to the input CompressedDataSize value
- …but set NumPcmBytesProduced to 0 if OpusDecoderWrapper->Decode() returns <= 0
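As a rough standalone sketch of that rule: DecodeResult below is a hypothetical mirror of the three fields (not the engine struct), MakeDecodeResult is a made-up helper, and I'm assuming 16-bit PCM output and that the frame count is also zeroed on failure.

```cpp
#include <cstdint>

// Hypothetical mirror of the three-field decode result described above.
struct DecodeResult {
    int32_t NumCompressedBytesConsumed = 0;
    int32_t NumPcmBytesProduced = 0;
    int32_t NumAudioFramesProduced = 0; // Samples per channel.
};

// FramesDecoded is whatever the underlying decoder returned: a positive
// per-channel frame count on success, or a value <= 0 on failure.
DecodeResult MakeDecodeResult(int32_t CompressedDataSize, int32_t FramesDecoded,
                              int32_t NumChannels) {
    DecodeResult Result;
    // Always consume the packet, so a bad packet cannot stall playback.
    Result.NumCompressedBytesConsumed = CompressedDataSize;
    if (FramesDecoded > 0) {
        Result.NumAudioFramesProduced = FramesDecoded;
        Result.NumPcmBytesProduced =
            FramesDecoded * NumChannels * static_cast<int32_t>(sizeof(int16_t));
    }
    // On failure both PCM fields stay at 0 (the frame count being 0 is my assumption).
    return Result;
}
```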
To add support for seeking Opus audio, I added a table to the Opus audio format’s file header, storing the size of each ‘packet’ (uint16 array). On load, I convert that to a uint32 file offsets array (relative to the end of the header, not that that’s too important). Given each compressed frame/packet is a fixed length in terms of time, e.g. 60ms, it means for a given playback time we can easily find the compressed data offset for the packet, and the sample offset within that packet’s resulting decoded audio samples.
- For debug purposes, a binary search in that table allows you to convert a compressed data offset back to a packet index and playback time.
- Obviously needed to update UE_AUDIO_OPUS_VER (which conveniently results in all audio assets being re-cooked), update FAudioFormatOpus::Cook() to populate this new table, plus update SerializeHeaderData()
- Obviously needed to update FOpusAudioInfo::ParseHeader() to deal with the new header, and convert the packet sizes to an array of offsets.
- Probably worth updating GetMinimumSizeForInitialChunk(), to include the size of this table in the returned offset (reading the packet count from the header along the way, using the provided SrcBuffer)
- Probably worth updating SplitDataForStreaming()
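The seek table logic can be sketched in isolation like this. All the names here (BuildPacketOffsets, TimeToSeekTarget, OffsetToPacketIndex, SeekTarget) are hypothetical, and for simplicity I'm assuming the packets sit back to back; if each packet is preceded by a size field in the stream, that prefix would need adding into the running offset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Turns per-packet sizes into byte offsets measured from the first packet.
// Offsets[i] is where packet i starts.
std::vector<uint32_t> BuildPacketOffsets(const std::vector<uint16_t>& PacketSizes) {
    std::vector<uint32_t> Offsets;
    Offsets.reserve(PacketSizes.size());
    uint32_t Running = 0;
    for (uint16_t Size : PacketSizes) {
        Offsets.push_back(Running);
        Running += Size;
    }
    return Offsets;
}

struct SeekTarget {
    uint32_t PacketIndex;          // Packet containing the requested time.
    uint32_t SampleOffsetInPacket; // Frames (samples per channel) to skip in it.
};

// Every packet covers the same duration, so the lookup is plain arithmetic.
SeekTarget TimeToSeekTarget(double Seconds, uint32_t SampleRate, uint32_t FrameMs,
                            uint32_t NumPackets) {
    assert(Seconds >= 0.0 && SampleRate > 0 && FrameMs > 0 && NumPackets > 0);
    const uint64_t FramesPerPacket = static_cast<uint64_t>(SampleRate) * FrameMs / 1000;
    const uint64_t TargetFrame = static_cast<uint64_t>(Seconds * SampleRate);
    uint64_t Index = TargetFrame / FramesPerPacket;
    uint64_t Offset = TargetFrame % FramesPerPacket;
    if (Index >= NumPackets) {
        // Seeking past the end: clamp to the start of the last packet.
        Index = NumPackets - 1;
        Offset = 0;
    }
    return {static_cast<uint32_t>(Index), static_cast<uint32_t>(Offset)};
}

// Debug helper: binary search from a byte offset back to the packet holding it.
uint32_t OffsetToPacketIndex(const std::vector<uint32_t>& Offsets, uint32_t ByteOffset) {
    assert(!Offsets.empty());
    // The first entry greater than ByteOffset is one past the packet we want.
    const auto It = std::upper_bound(Offsets.begin(), Offsets.end(), ByteOffset);
    return static_cast<uint32_t>((It - Offsets.begin()) - 1);
}
```

The playback time for a packet index is then just the index multiplied by the frame duration, e.g. packet 2 at 60ms per packet starts at 120ms.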
The main trick to seeking with Opus is that, when doing so, you should reset the state of the decoder (call opus_multistream_decoder_ctl with OPUS_RESET_STATE) and then decode the whole of the preceding Opus packet (if there is one) so as to prime the state of the decoder. We are then ready to decode the actual target packet/frame.
- Added FOpusAudioInfo::SeekToTime override that converts the desired playback time to the Opus packet index and a sample offset, relative to the start of the samples that’ll result when decoding that packet. Store that in some member vars i.e. a seek request.
- Updated FOpusAudioInfo::GetFrameSize() to check for that seek request, and update SrcBufferOffset to refer to the packet (or rather the uint16 that was already stored before each packet) offset in the file. I also added some member vars to track/move this from being a seek request to being an in-progress seek (in part to confirm that GetFrameSize() is called before Decode()). As mentioned before, note that the packet we’re referring to here is the one preceding (if there is one) the packet that contains the actual seek target audio.
- Updated FOpusAudioInfo::Decode() to check for the in-progress seek. If so, make the opus_multistream_decoder_ctl call with OPUS_RESET_STATE. If there’s a preceding packet, decode it and discard the result, and update SrcBufferData and the pointer to the next source compressed data/packet. After decoding the actual packet of interest, we may then need to skip over some samples - which could be done by shuffling up the generated data in its buffer, or passing out a pointer to the start of the usable data. Obviously NumAudioFramesProduced needs to be updated too, based on the skipped sample count.
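To show the order of operations during a seek without pulling in libopus, here is a small standalone sketch. LoggingDecoder and SeekAndDecode are hypothetical: the decoder is a stub that records each call and emits mono samples equal to the packet index, and ResetState() stands in for the OPUS_RESET_STATE call. A real implementation would be working with interleaved multi-channel data and byte offsets instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical decoder stand-in: logs calls and emits FramesPerPacket mono
// samples per packet, each sample set to the packet index.
struct LoggingDecoder {
    int FramesPerPacket;
    std::vector<std::string> Log;

    // Stands in for resetting the real decoder state.
    void ResetState() { Log.push_back("reset"); }

    std::vector<int16_t> Decode(int PacketIndex) {
        Log.push_back("decode " + std::to_string(PacketIndex));
        return std::vector<int16_t>(static_cast<size_t>(FramesPerPacket),
                                    static_cast<int16_t>(PacketIndex));
    }
};

// Seek sequence: reset, prime with the preceding packet (if any), decode the
// target packet, then drop the samples that come before the seek position.
std::vector<int16_t> SeekAndDecode(LoggingDecoder& Decoder, int TargetPacket,
                                   int SamplesToSkip) {
    assert(TargetPacket >= 0);
    assert(SamplesToSkip >= 0 && SamplesToSkip <= Decoder.FramesPerPacket);
    Decoder.ResetState();
    if (TargetPacket > 0) {
        // Priming decode: the output is deliberately thrown away.
        (void)Decoder.Decode(TargetPacket - 1);
    }
    std::vector<int16_t> Pcm = Decoder.Decode(TargetPacket);
    // "Shuffle up" the usable samples; the returned size is the reduced count.
    Pcm.erase(Pcm.begin(), Pcm.begin() + SamplesToSkip);
    return Pcm;
}
```

Erasing from the front is the 'shuffling up' option; passing out a pointer to the first usable sample would avoid the copy, at the cost of the caller needing to know about the offset.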