Pitch-shift source effect (DSP) over the network (VOIP)

    Hi! I'm going to write a bit about some cool stuff here. I've managed to put together a source effect DSP that changes the pitch of audio as it is played back. This live DSP effect is built on the new Unreal Engine feature called the Audio Mixer. You can read about it in the topic below to set up some cool stuff with it:

    https://forums.unrealengine.com/deve...ck-start-guide

    Availability

    The Pitch Shift DSP is new and you can't find it in the engine just yet. However, I'm planning to submit it as a pull request to the Unreal Engine master GitHub repository, so you can all have this extension in your engine and use it for whatever you wish!

    Video

    Here is a short video I put together where you can see it in action! I hope you enjoy this demo as much as I enjoyed making the extension.



    About

    But of course, we can already change the pitch of the playback, so what is this DSP actually good for?! That's the right question, and I'm going to answer it in a moment. Generally speaking, when you change the pitch of a played-back audio sample, the playback speed changes with it: the higher the pitch, the faster the playback, and if you lower the pitch it all gets slower. That doesn't always work out well, especially if you want to keep the playback in sync with your other material (e.g. video, subtitles, the rhythm of the music, etc.). Alternatively, you can double up the audio frames to keep the audio length intact, but the sampling will no longer be valid, so it causes distortion and unwanted side effects.
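To put numbers on the speed problem, here is a tiny sketch of how plain resampling ties duration to pitch. The function name `resampled_duration` is made up for illustration and is not part of the engine or the effect.

```python
def resampled_duration(duration_s: float, pitch_ratio: float) -> float:
    """Clip length after changing pitch by plain resampling.

    Playing samples back `pitch_ratio` times faster raises the pitch by the
    same ratio, so the clip length shrinks (or grows) by that ratio too.
    """
    if pitch_ratio <= 0:
        raise ValueError("pitch_ratio must be positive")
    return duration_s / pitch_ratio


# A 10 second clip at half, normal and double pitch.
for ratio in (0.5, 1.0, 2.0):
    print(ratio, resampled_duration(10.0, ratio))
```

An octave up halves the clip and an octave down doubles it, which is exactly the drift against video or subtitles described above.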

    This is where the Pitch Shifting DSP comes into the equation. It changes the pitch of a Fourier-transformed signal directly in the frequency domain while retaining the original audio duration. Yes! It is a very similar approach to phase vocoders, and it lets you preserve the duration of the audio clip while changing its pitch. That makes it suitable for the DSP chain, and it won't cause desync with your other media content. Not much desync, anyway! The tiny problem is that it requires a small buffer under the hood to gather and process the audio data, so the output is delayed a bit. But it's a constant delay, determined by the size of the FFT window you set, so you can adjust your audio settings to compensate for this small inconvenience.
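As a rough guide to how big that constant delay is, the sketch below assumes the effect has to buffer about one full FFT window before it can output anything. That is an assumption for illustration only; the real figure depends on the implementation. `window_latency_ms` is a hypothetical helper name.

```python
def window_latency_ms(frame_size: int, sample_rate: int = 48000) -> float:
    """Approximate delay in milliseconds, ASSUMING one full window of buffering."""
    if frame_size <= 0 or sample_rate <= 0:
        raise ValueError("frame_size and sample_rate must be positive")
    return 1000.0 * frame_size / sample_rate


# Delay for the smallest, a middle and the largest window size at 48 kHz.
for size in (128, 1024, 8192):
    print(size, round(window_latency_ms(size), 2), "ms")
```

Under that assumption a 1024 window at 48 kHz costs a little over 21 ms, which is the amount you would compensate for elsewhere in the mix.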

    I'm sure those with previous experience of digital audio software (e.g. Fruity Loops, Cubase, Studio One, etc.) and VSTs are very familiar with audio latency, which sometimes results from the many effects chosen on a project's insert chains, since each of them needs some time to produce its output. You can calculate this and adjust your DSP chain timings to keep the final mix in sync. DAWs https://en.wikipedia.org/wiki/Digital_audio_workstation usually handle this by delaying the entire master mix, which lets the individual effects produce their sound in their own little time domains so that the end result is in sync at the output.

    Here in Unreal Engine, what you can perhaps do is start the playback of the pitch-shifted sample a tiny bit earlier than your other media content. This will keep it in sync with everything else for any length of time, since the playback speed never actually changes.

    Pitch Shift DSP

    Next, I'll show you the actual usage of the Pitch Shift DSP. Fortunately, the authors of the new Audio Mixer, dan.reynolds and Minus_Kelvin, have put together a very nice article that will help you set up DSP effects in Unreal Engine, so I'm just going to link their article and you can learn about the general setup of source effects straight from them!

    https://forums.unrealengine.com/deve...368#post982368

    Once you have finished their brilliant tutorials, you can go to the effect's panel, where the parameters of the pitch shift can be adjusted. There are three parameters available: frame size, oversampling and pitch.

    What the pitch does is obvious, but its characteristics are worth mentioning. The pitch parameter ranges from 0.5 (50%) to 2.0 (200%), which gives a two-octave range for adjusting the pitch. 0.5 is the low end (e.g. C0) and 2.0 is the high end (C2) when your normal playback is at 1.0 (C1).
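If you think in musical intervals rather than ratios, the standard equal-temperament conversion is 12 semitones per doubling of the ratio. The sketch below shows that mapping together with a clamp to the 0.5 to 2.0 range; the helper names are hypothetical and not taken from the effect's code.

```python
import math

PITCH_MIN, PITCH_MAX = 0.5, 2.0  # parameter range quoted in the post


def clamp_pitch(ratio: float) -> float:
    """Keep a pitch ratio inside the effect's stated range."""
    return max(PITCH_MIN, min(PITCH_MAX, ratio))


def ratio_to_semitones(ratio: float) -> float:
    """Equal-temperament semitone offset for a pitch ratio."""
    return 12.0 * math.log2(ratio)


def semitones_to_ratio(semitones: float) -> float:
    """Pitch ratio for a semitone offset, clamped to the allowed range."""
    return clamp_pitch(2.0 ** (semitones / 12.0))


print(ratio_to_semitones(0.5), ratio_to_semitones(2.0))
print(semitones_to_ratio(7))  # a perfect fifth up
```

So the two ends of the slider sit one octave (12 semitones) below and above normal playback.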

    The frame size determines the size of the FFT window the algorithm uses. It is limited to between 128 and 8192. The smaller the window, the lower the audio latency on the output, BUT the quality of the end result degrades heavily as well. In my experience 1024 is a rather good compromise between quality and latency, but you can go much higher for the best quality. The value must be a power of 2, but the algorithm keeps you on the safe side: whatever value you put in, the closest power of 2 is actually used. So don't worry, you won't cause glitches or crazy noises with your uneven values; it's all handled internally.
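Snapping to the closest power of 2 could look something like the sketch below. It assumes "closest" means nearest on a linear scale with ties rounding up; the post does not say which rule the effect really uses, and `snap_frame_size` is a hypothetical name.

```python
def snap_frame_size(value: int, lo: int = 128, hi: int = 8192) -> int:
    """Clamp to [lo, hi], then snap to the nearest power of two.

    ASSUMPTION: nearest is measured linearly and ties round up.
    """
    value = max(lo, min(hi, int(value)))
    lower = 1 << (value.bit_length() - 1)  # largest power of two <= value
    upper = lower << 1                     # next power of two above it
    snapped = lower if (value - lower) < (upper - value) else upper
    return min(hi, snapped)


for raw in (100, 700, 1000, 1024, 7000, 20000):
    print(raw, "->", snap_frame_size(raw))
```

Any integer input ends up on one of 128, 256, 512, 1024, 2048, 4096 or 8192, so the FFT never sees an awkward size.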

    The oversample is the STFT https://en.wikipedia.org/wiki/Short-...rier_transform oversampling factor, where a value of 4 should give you a rather natural voice without any quality loss. You can go higher, but I clamped the value to 32, so that's the maximum you can use. It's possible to change this to allow higher values, though I don't really see the point.
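In the usual STFT convention the oversampling factor is the window size divided by the hop between successive windows, so it also fixes how much neighbouring windows overlap. The sketch below assumes the effect follows that convention and that the lowest allowed factor is 1; both are assumptions, and the function names are hypothetical.

```python
def hop_size(frame_size: int, oversample: int, max_oversample: int = 32) -> int:
    """Samples between successive analysis windows, with the factor clamped to [1, 32]."""
    oversample = max(1, min(max_oversample, oversample))
    return frame_size // oversample


def overlap_fraction(oversample: int) -> float:
    """Fraction of each window shared with the next one."""
    return 1.0 - 1.0 / oversample


print(hop_size(1024, 4), overlap_fraction(4))
print(hop_size(1024, 64))  # request above the clamp is treated as 32
```

With a 1024 window and a factor of 4, a new window starts every 256 samples and overlaps the previous one by 75%. Higher factors mean more FFTs per second for the same audio.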

    VOIP Prototype

    You can use this DSP with any sound source in the engine; it is not tied to a VOIP solution in any way! Just for fun, I hooked up our custom networked VOIP solution here to show that it also works with live voice input without any trouble. The VOIP solution used here is a project we have been working on, on and off, for the vast majority of this year.

    This prototype VOIP solution captures and transmits the Opus-encoded voice packets over the network (using simple value replication on the actor channel) to the receiving end, where they are gathered and, after some network stability adjustments (which cause some latency), finally played back. In this prototype the VOIP latency usually accumulates continuously, so the longer you talk the higher the latency gets; however, this actually helps rather well to keep the voice continuous and stable! This and many other cool VOIP features will be packed into a plugin (which we plan to name Pro Audio Capture) that my colleague and I are going to share with you all, most likely as a Marketplace item at some point in the future, so you can have it for your games and other uses as well!
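The post does not describe what the "network stability adjustments" are. One common technique for this kind of receive-side smoothing is a small jitter buffer that reorders packets and waits for a few of them before starting playback. The sketch below shows that general idea only; it is not the prototype's code, and the class name, the `prefill` parameter and the sequence numbers are all assumptions.

```python
import heapq


class JitterBuffer:
    """Reorders voice packets and pre-buffers a few to smooth network jitter."""

    def __init__(self, prefill: int = 3):
        self.prefill = prefill   # packets to collect before playback starts
        self._heap = []          # min-heap of (sequence number, payload)
        self._started = False

    def push(self, seq: int, payload: bytes) -> None:
        heapq.heappush(self._heap, (seq, payload))

    def pop(self):
        """Return the next payload in sequence order, or None while buffering."""
        if not self._started:
            if len(self._heap) < self.prefill:
                return None
            self._started = True
        if not self._heap:
            self._started = False  # underrun: go back to pre-buffering
            return None
        return heapq.heappop(self._heap)[1]


buf = JitterBuffer(prefill=2)
buf.push(2, b"second")
buf.push(1, b"first")          # arrived late, but is played first
print(buf.pop(), buf.pop(), buf.pop())
```

The pre-buffering is where the extra latency comes from: a larger `prefill` gives steadier audio at the cost of a longer delay.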

    Other uses

    While I named this effect Pitch Shift DSP, the name doesn't cover the many other uses you may find for it. For example, if you can set up and use a MIDI input in Unreal Engine, you can get a very cool choir effect by using it in monotone, or in polyphony by applying multiple instances of it. Whether it's a human voice or any other synthesized sound, the smoothing characteristic of the output can also be used as a rather unusual filter for your fine atmospheric sounds and melodies. But I'll let you find your own use cases, and we're very interested to hear about your cool ideas!

    Links

    Here are some links so you can sail the web and learn more about pitch shifting, vocoders and DSP in general. This is cool stuff and there is a lot to learn!

    https://en.wikipedia.org/wiki/Audio_signal_processing
    https://en.wikipedia.org/wiki/Digital_audio_workstation
    https://en.wikipedia.org/wiki/Phase_vocoder
    https://en.wikipedia.org/wiki/Pitch_shift
    https://en.wikipedia.org/wiki/Short-...rier_transform

    Final words

    Once again, this new DSP effect is planned to be submitted at some point to the Unreal Engine GitHub repository against the master branch, and hopefully it will be pulled into the actual engine code. If you find this stuff useful, you can help us make this happen by voting on the PR page. Thanks a lot for your support!

    I'll keep this post updated and will add links to the forthcoming pull request and the audio capture + VOIP stuff so you can stay informed.

    Cheers!
    Konflict and spaceharry
    * Sharp and responsive Temporal Anti-Aliasing tips and tricks
    * Pitch-shift source effect (DSP) over the network (VOIP)
    * My Portfolio and Developer Blog

    #2
    This is awesome! Please do send us the code... FFT-based effects are on our list of things to get to. You beat us to it!



      #3
      Hey guys, do you have a website or twitter feed for you? I'd like to tweet a link to this forum post



        #4
        This is SO AWESOME!

        [Image: doctorfantastic.gif]
        Dan Reynolds
        Technical Sound Designer || Unreal Audio Engine Dev Team



          #5
          Thanks, that's very nice of you!

          Yes, I intend to send the code as soon as I feel it's ready. I'm trying to improve performance without consuming too much memory; that's my priority for now! It's FFT-based and very performance-intensive, as you are well aware, but it could be worse, I guess! I ran some tests on an i7-4790 and found that with default project settings at 48 kHz I can run 16 pitch shift instances in parallel, with a 1024 window and 4x oversampling, without actually hitting the limits of the audio engine. So far so good, but it's not the final result. I just hope I can keep up this level.
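For a feel of the workload in that test, here is a back-of-the-envelope count of FFTs per second. It assumes one forward and one inverse FFT per analysis frame, a hop of window size divided by the oversampling factor, and a single channel per instance. None of that is confirmed by the post, and `ffts_per_second` is a hypothetical helper.

```python
def ffts_per_second(instances: int, frame_size: int, oversample: int,
                    sample_rate: int = 48000) -> float:
    """Rough FFT count per second, ASSUMING forward + inverse FFT per frame, mono."""
    hop = frame_size // oversample           # samples between frames
    frames_per_second = sample_rate / hop    # frames each instance processes
    return 2 * frames_per_second * instances


# The configuration quoted above: 16 instances, 1024 window, 4x oversampling.
print(ffts_per_second(16, 1024, 4))
```

Under those assumptions the quoted setup runs about 6000 FFTs of size 1024 every second, and doubling the oversampling factor doubles that number.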

          Originally posted by Minus_Kelvin
          FFT-based effects are on our list of things to get to. You beat us to it!
          No, I didn't mean to beat anybody; it's just fun and I enjoy working on the bits and pieces of this code. I wish I could spend more time on it, but I'll find ways to keep myself focused on this project. If you have better, less performance-hungry solutions for FFTs, I'm sure you will be the ones who beat me on this.

          Originally posted by Minus_Kelvin
          Hey guys, do you have a website or twitter feed for you?
          Well, yes and no! It's more of a portfolio blog I run with all my crazy science, but I just updated both the website and my footer so it is displayed properly. Here is the URL of my blog: https://www.nephiliminteractive.com/

          Originally posted by dan.reynolds
          This is SO AWESOME!
          Yes sir, I absolutely agree. It's fun to work on and there are many possible uses ahead!

          I just hope the implementation will meet the requirements of an acceptable pull request. It's not an easy task, and it really gives me a scare.


            #6
            One more thing worth mentioning: it works the other way around as well. If I slow down the playback of the audio and use the Pitch Shift DSP to compensate for the pitch, I get a time-stretch effect as a result.

            [Image: timestretchfx.jpg]
            Simple as that! It is fun
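The arithmetic behind the time-stretch trick is that the pitch correction has to be the inverse of the playback rate, so the two cancel out. A small sketch, assuming the 0.5 to 2.0 pitch range from the first post still applies; `time_stretch_settings` is a hypothetical helper name.

```python
def time_stretch_settings(stretch: float, pitch_min: float = 0.5,
                          pitch_max: float = 2.0):
    """Return (playback_rate, pitch_correction) for a given stretch factor.

    stretch > 1 makes the clip longer, stretch < 1 makes it shorter.
    """
    if stretch <= 0:
        raise ValueError("stretch must be positive")
    playback_rate = 1.0 / stretch   # slower playback lengthens the clip
    pitch_correction = stretch      # undoes the pitch change from the new rate
    if not pitch_min <= pitch_correction <= pitch_max:
        raise ValueError("stretch is outside the pitch range of the effect")
    return playback_rate, pitch_correction


print(time_stretch_settings(2.0))   # twice as long, original pitch
print(time_stretch_settings(0.5))   # half as long, original pitch
```

With that pitch range the stretch is limited to between half and double the original length.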

            Last edited by Konflict; 10-20-2017, 04:25 PM.


              #7
              MUSIC. NON-STOP.
              MUSIC. NON-STOP. (This is amazing. Good job!!!)
              MUSIC. NON-STOP.
              MUSIC. NON-STOP.
              MUSIC. NON-STOP. (help)
              MUSIC. NON-STOP.



                #8
                This is one of the nuttiest demos I've ever seen.... In a good way.



                  #9
                  I am so glad I stumbled across this. Awesome work Konflict. I'll be keeping a watchful eye on this!



                    #10
                    Originally posted by Minus_Kelvin
                    This is one of the nuttiest demos I've ever seen.... In a good way.
                    Sorry about that! Since then I've managed to implement KissFFT from the third-party components to replace the built-in FFT. I can't say for sure, but the results are rather similar and it works flawlessly. Once again, thanks for the tip; I would have missed the option if you hadn't told me it was available!

                    Originally posted by Derjyn
                    I am so glad I stumbled across this. Awesome work Konflict. I'll be keeping a watchful eye on this!
                    Cool! I've had a lot on my plate recently, hence the release of the plugin is postponed a bit. Nevertheless, the development has continued in a certain direction, since I extended the equations to provide some interesting options for adjusting the behavior.



                    This incarnation of the pitch shifting DSP (which I named AlienSpitch FX) provides a great variety of features for changing the characteristics of a human voice (not exclusively, but that's mainly what it is designed for).

                    It currently uses float curves to map the 0-22 kHz frequency range (the value range on t is 1.0 to 10.0), so it is possible to adjust the phase and pitch behavior of the processing on certain frequency ranges individually. Along with that I also added a "frequency caching" method (I'm having difficulty finding a better name for this feature), a temporal effect that extracts the frequency deltas between frames and thus lets you control how much the individual frequencies change over time. It's easy to produce monotone speech, or, by simply overdriving the pitch deltas, to get more articulated changes.
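One possible reading of the "frequency caching" idea is sketched below: take the per-bin frequency estimates of successive frames, measure how much each bin moved since the previous frame, and scale that movement before applying it. This is an interpretation for illustration only, not the actual AlienSpitch FX code; `scale_frequency_deltas` and its `amount` parameter are hypothetical.

```python
def scale_frequency_deltas(frames, amount):
    """Scale frame-to-frame frequency changes.

    frames: list of frames, each a list of per-bin frequency estimates in Hz.
    amount: 0.0 freezes every bin at its first value (monotone),
            1.0 leaves the input unchanged,
            values above 1.0 exaggerate the changes.
    """
    if not frames:
        return []
    out = [list(frames[0])]
    for prev_in, cur_in in zip(frames, frames[1:]):
        prev_out = out[-1]
        out.append([
            p_out + amount * (cur - p_in)   # apply the scaled delta per bin
            for p_out, cur, p_in in zip(prev_out, cur_in, prev_in)
        ])
    return out


track = [[100.0], [110.0], [105.0]]          # one bin over three frames
print(scale_frequency_deltas(track, 0.0))    # monotone
print(scale_frequency_deltas(track, 2.0))    # overdriven deltas
```

Setting the amount to zero flattens the melody of a voice, while values above one exaggerate every inflection.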

                    Now that I try to explain it, it may sound a bit complicated, but it actually isn't. It's very easy to use, although I have yet to find a solution for adjusting the float curves from Blueprints. It is possible from C++ and by using the asset, which I demonstrated in the video.


                      #11
                      Veeeery cool! Seems so fun to play with. Looking forward to the plugin.

                      P.S. Let us know if you find a neat way to edit curves in BPs!



                        #12
                        Hey Konflict, how goes the progress? Haven't heard anything in a few months. I'm working on a custom build of the engine, and while working on the todo/roadmap and getting into the audio section, this popped up in my memory! Hope all is well.



                          #13
                          Awesome work, Konflict!
                          Dan Reynolds
                          Technical Sound Designer || Unreal Audio Engine Dev Team
