Networked Physics with PhysX

Hey all, I did a bit of cleanup and split this thread off from the original one, and also merged a few posts into single ones to make reading easier. Ultimately it seems several of us are working on a solution to this, so I figure it’d be nice to have one concise thread with the info/progress in it.

Moving on:
I have posted a compiled version of what I have going at the moment. The networking does work without fixing the timestep, but it produces a lot more errors. Clamping FPS to 60 in two separate instances removes a huge chunk of the Bad Moves, so it’s clear that fixing the timestep results in reduced bandwidth usage.

You will notice, however, that input on the client feels instantaneous regardless of lag, so prediction/reconciliation IS working.

TODO List:

  • Add Smoothing
  • Use a separate PhysX scene for replay. This will stop the stuttering and collision errors with other non-local pawns on clients.
  • Use a locked PhysX timestep version of the engine (@0lento is working on a pretty cool solution for this)
  • Possibly copy static collision geo into Replay scene to reduce errors?
  • Check to see if PhysX has any other native features to aid Determinism.

(^ Source / Plugin available once this is done)

Project Download Here
Just startup two (or more) instances and join via localhost or over IP. You can choose between a hovering cube pawn, or a physics sphere pawn. Both use PhysX.

GitHub Project & Source Code

Controls shown in bottom-right of screen. You can change the latency per-client. I’ve not tested if the lag simulation settings work over non-lan connection, but that kind of defeats the point anyways…


Small status update from work I did yesterday for the fixed timestepping:

I got physics step interpolation in, so you can now optionally smooth the transform between two physics steps to estimate where the physics are at the time UE4 renders the frame.

Here’s an extreme example of running physics at 10 Hz with fixed timesteps so you can see the interpolation better:

Without interpolation (forgot motion blur on so it blurs it a bit):

With interpolation:

Both gifs are captured at ~60fps.

Looks really good that!

I wonder how blending will affect some other rendering features, like Temporal AA etc - or things that rely on accurate motion vectors. In that first gif it looks like TAA isn’t working as well around the edges for example, but with smoothing it doesn’t seem to have many issues.

It shouldn’t do anything differently with regard to rendering. I do the interpolation in the same place where the PhysX simulation results are synced; I just send a lerped transform there instead of using the one that the physics sim would have given. The rest of UE4 will not know about this. I also do this only once for each tick after all physics have been simulated (just like UE4 does with all the physics sims, it only updates scene components after all physics steps are done).
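To make the lerp concrete, here is a minimal standalone sketch of blending the last two physics steps by the leftover time in the time bank. The names `Vec3`, `Lerp` and `RenderPosition` are made up for illustration and are not the fork's actual code; it also only handles position, a rotation would need a quaternion slerp instead.

```cpp
#include <cassert>

// Hypothetical minimal vector type; a real engine would use its own.
struct Vec3 { float X, Y, Z; };

// Linear interpolation between two positions.
inline Vec3 Lerp(const Vec3& A, const Vec3& B, float Alpha)
{
    return { A.X + (B.X - A.X) * Alpha,
             A.Y + (B.Y - A.Y) * Alpha,
             A.Z + (B.Z - A.Z) * Alpha };
}

// Position handed to the renderer: blends the previous and the latest
// physics step by how far the leftover time bank reaches into the next step.
inline Vec3 RenderPosition(const Vec3& PrevStep, const Vec3& LastStep,
                           float TimeBank, float FixedDelta)
{
    assert(FixedDelta > 0.f);
    float Alpha = TimeBank / FixedDelta;
    if (Alpha < 0.f) { Alpha = 0.f; }
    if (Alpha > 1.f) { Alpha = 1.f; }
    return Lerp(PrevStep, LastStep, Alpha);
}
```

Because this blends between the last two completed steps, the rendered transform trails the simulation by up to one fixed step, which is the usual price for this kind of smoothing.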

edit-> regarding TAA, it’s not TAA, it’s the motion blur that messes the image up on the nonsmoothed version.

I’ve updated my post above with a link to the GitHub source for the project (minus the pretty stuff since that’s from SuperGrid)

This is pretty cool :slight_smile:

I’m replying to a post by @a-tocken from this thread.

I have some kind of solution for this already but it’s always a compromise. I’ll probably push the commit into my UE4 fork at the end of this week. My changes also include primitive component interpolation between the last two fixed timesteps + all needed settings to customize this. I’ll probably not make a PR out of this because I feel it could be done so much better if done fully in a parallel thread, but that would require a lot more thought.

What I do myself is that I have two count limits for fixed timesteps. The first one defines how many fixed timesteps can run between two ticks when the game runs in normal conditions. When this value is exceeded due to a small CPU spike etc, it starts to add counts into a cumulative counter, and there’s a separate hard limit for that one. On every tick the cumulative counter is increased by the current amount of substeps and reduced by the max regular-condition substep amount (and clamped to 0). Basically what this system allows is that you can momentarily exceed the fixed timestep counts when deltatime is huge for just a few frames (like it is in PIE during begin play). If this condition stays and the cumulative counter never cools down, it falls back to regular substepping. One could just freeze the sim at this point, like you suggested, but that could have all kinds of other side-effects (your game time is still running when you do this but physics don’t move etc). Anyway, once the cumulative counter is empty again, it uses fixed timestepping again.
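Roughly, the two-limit scheme described above could look like the sketch below. This is only my reading of the description, not the code from the fork: `FixedStepLimiter` and its members are hypothetical names, and leaving the cumulative counter uncapped while in fallback is an assumption.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper that decides, per tick, whether fixed timestepping
// is still allowed or whether to fall back to regular substepping.
struct FixedStepLimiter
{
    int MaxRegularSteps;     // fixed steps allowed per tick in normal conditions
    int MaxCumulativeSteps;  // hard limit for the cumulative overflow counter
    int Cumulative;          // overflow accumulated over recent ticks
    bool bFallback;          // true while regular substepping is used instead

    FixedStepLimiter(int InMaxRegular, int InMaxCumulative)
        : MaxRegularSteps(InMaxRegular), MaxCumulativeSteps(InMaxCumulative),
          Cumulative(0), bFallback(false)
    {
        assert(MaxRegularSteps > 0 && MaxCumulativeSteps >= 0);
    }

    // RequestedSteps: fixed steps the time bank asks for this tick.
    // Returns true if this tick may use fixed timestepping.
    bool Update(int RequestedSteps)
    {
        // Add this tick's steps, subtract the normal budget, clamp at zero.
        // Assumption: the counter is not capped while in fallback.
        Cumulative = std::max(0, Cumulative + RequestedSteps - MaxRegularSteps);
        if (Cumulative > MaxCumulativeSteps)
        {
            bFallback = true;   // sustained overload: stop fixed stepping
        }
        else if (Cumulative == 0)
        {
            bFallback = false;  // fully cooled down: fixed stepping again
        }
        return !bFallback;
    }
};
```

With a regular limit of 2 and a hard limit of 6, a couple of 5-step ticks are tolerated, a third one triggers the fallback, and fixed stepping only resumes after enough quiet ticks drain the counter back to zero.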

In theory, it should never go to this unless game totally freezes up (if you block the game thread yourself) or you run really intense physics sim on too weak CPU. Latter scenario would never run properly with this kind of system and your suggestion of just stalling the physics between the ticks after limit would probably look more pleasant to the user (but would mess up the sim for computers that can handle more steps momentarily). If this is an issue, there could be additional checks to fallback permanently to sim freezing mode in case the sim just keeps throttling back and forth to the hard limit but this starts to become quite messy already. :slight_smile:

There are many ways to approach this problem and it’s usually specific to your game. Delaying server for all players to some safe value would be the easy solution (but more you delay, more time it’ll take for all clients to get their corrections from server).

While 100% determinism would be nice, getting close enough is often enough. There’s a night and day difference in how closely the same PhysX sim matches between fixed timestepping and PhysX just ticking on Tick. See my post here:
Networked Physics with PhysX - C++ - Epic Developer Community Forums
Of course this isn’t an extensive test as it’s just a few rigidbodies, but even a simple test tells a lot about the simulation itself.

PhysX 3.4 has an additional scene flag to enhance determinism but, according to the source comments, it mainly makes the PhysX sim more deterministic when there are changes in the physics scene that shouldn’t directly affect existing rigidbodies (so I’d guess it just guarantees that PhysX keeps the same order among existing rigidbodies for its internal calculations).

I’ve spent a few days converting TheJamsh’s networked physics plugin to be more friendly to fixed timestepped substeps but it’s still WIP. The biggest issue with this approach is that you’d really need to delay the server a lot or do some fancier solution where the server simulates individual pawns separately. This wouldn’t be as big of an issue if you could actually run the physics sim in another thread at fixed time intervals, as then you’d get constant input updates for the server sim. Now you can get a different amount of fixed timesteps between your ticks but your input can only be polled once per tick. You could poll input on each substep but it wouldn’t make any difference, since substeps are not truly that far in realtime from the last tick (remember they all get run as fast as the CPU can handle them, not at fixed intervals).

All in all, this means that you can’t send the server steady input packets without completely fixing the rendering framerate for the game. And fixing rendering framerate is not really an option that many PC developers want to take (could work for consoles).

[Edit]: Whoops this got replied to the old thread. Meant for it to go to the other one

This is a very long list unfortunately. I’ve spent 4-5? months on this so far, although i’m making quite a bit more than just “rewind and replay my character”. Ultimately I want a system that’s easy-ish to throw together multiplayer physics-heavy prototypes without having to write custom and error-prone rewind/replication code every time. Also trying not to break variable and semi-fixed stock UE4 modes as well, which adds some extra effort.

The major points are:
-Fixed timestep. And more exposed physics pipeline to external systems, e.g. PreStep()/PostStep()/UpdateTime()
-Interpolate physics objects for smooth rendering. Posted how to do this somewhere else…don’t have the link handy atm
-Time synchronization between client and server. How far ahead of the server should the client be? What happens when client hitches? server? what about fluctuations in latency? Server needs to receive player inputs before it simulates the frame
-Lots of small assumptions (which were 100% intentional, mind you) in the stock UE4 code that ultimately rely on “physics time will always be in sync with game time at the end of a tick”. One example is the TargetBuffer system which is used to queue forces/impulses/torques when substepping is enabled. This buffer is cleared after all substeps, which won’t work for a fixed step. That also raises a fundamental problem: If a blueprint adds a force but we dont actually tick physics this game frame, should it be applied on the next step? If this happens inside Tick(), we could be applying 2 AddForce() calls on the same physics step, which probably isn’t what we want. Lots of little issues like this along the way.

That’s only normal if you can’t possibly predict what some objects are going to do, e.g. another player’s input. In your example, I would expect basically perfect predictions, because all the inputs are known. You’re seeing “bad moves” due to using variable timesteps, not due to the determinism of physx. If you don’t believe me, make a quick test scenario. Add some forces, then log the position after each step, and compare a few different runs. Just hardcode some value into the simulate() call (thereby ‘fixing’ it).
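A quick way to see the timestep effect without touching the engine is a toy integrator. The sketch below is not PhysX, just a hypothetical damped spring stepped with semi-implicit Euler, logging the position after each step as suggested above. Running it twice with the same fixed deltas gives identical logs, while feeding it different deltas over the same total time does not.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 1D body; stands in for a rigidbody in a real physics scene.
struct Body { double Position; double Velocity; };

// One semi-implicit Euler step of a damped spring pulling towards the origin.
inline void Step(Body& B, double Dt)
{
    const double Stiffness = 40.0;
    const double Damping = 0.5;
    const double Accel = -Stiffness * B.Position - Damping * B.Velocity;
    B.Velocity += Accel * Dt;
    B.Position += B.Velocity * Dt;
}

// Runs the sim with the given per-step deltas and logs the position
// after every step so two runs can be compared entry by entry.
inline std::vector<double> Simulate(const std::vector<double>& StepDeltas)
{
    Body B{1.0, 0.0};
    std::vector<double> Log;
    for (double Dt : StepDeltas)
    {
        assert(Dt > 0.0);
        Step(B, Dt);
        Log.push_back(B.Position);
    }
    return Log;
}
```

Comparing `Simulate` on sixty steps of 1/60 s against thirty steps of 1/30 s covers the same second of game time but ends at a different position, which is the same kind of drift a variable frame delta feeds into the real sim.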

You could use a separate scene just for rewind & replay of your character, sure. You just have to know that any interactions with any other physics objects will always cause desync due to your prediction not knowing about them. Or if you don’t really have player-to-player collision or otherwise dynamically interact with the player’s physics throughout the game, then you’d be fine. That’s not the system I was going for though.
I’d be wary about the whole “rewind with things in my vicinity” logic, cause that will change as you are stepping forward and gets better/worse with ping and velocity. Seems like it would cause a lot of hard-to-trackdown bugs, similar to actor relevancy issues. Maybe a good tradeoff would be to store snapshots for everything, but only add nearby actors to the replay scene before every step. Then you’d have the history but wouldn’t have to simulate the whole thing. Could probably write a relatively cheap “get all rigidbodies inside sphere around character” function which queries the histories.
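As a rough idea of what such a history query could look like, here is a standalone sketch. Everything in it (`Snapshot`, `BodyHistory`, `BodiesInSphereAtFrame`) is hypothetical and uses a plain linear scan; a real version would probably index snapshots by frame and work on the engine's own body handles.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal types for a per-body snapshot history.
struct Vec3 { float X, Y, Z; };
struct Snapshot { int Frame; Vec3 Position; };
struct BodyHistory { int BodyId; std::vector<Snapshot> Snapshots; };

inline float DistSquared(const Vec3& A, const Vec3& B)
{
    const float DX = A.X - B.X;
    const float DY = A.Y - B.Y;
    const float DZ = A.Z - B.Z;
    return DX * DX + DY * DY + DZ * DZ;
}

// Returns the ids of all bodies whose recorded position at the given frame
// lies within Radius of Center. Bodies without a snapshot for that frame
// are skipped.
inline std::vector<int> BodiesInSphereAtFrame(
    const std::vector<BodyHistory>& Histories, int Frame,
    const Vec3& Center, float Radius)
{
    assert(Radius >= 0.f);
    std::vector<int> Result;
    for (const BodyHistory& History : Histories)
    {
        for (const Snapshot& Snap : History.Snapshots)
        {
            if (Snap.Frame != Frame) { continue; }
            if (DistSquared(Snap.Position, Center) <= Radius * Radius)
            {
                Result.push_back(History.BodyId);
            }
            break;  // only one snapshot per frame is expected
        }
    }
    return Result;
}
```

The replay loop would call this once per rewound step with the character's position at that step, so the set of nearby bodies is re-evaluated as the replay moves forward.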

Hey so I’m working off Snowcrash5’s fixed timestep code from the other thread. It works okay… His code had a divide by zero at high frame rates which I cleaned up, but still at high framerates there are some serious issues. Has anyone else experienced this? If i’m playing at 120 fps but physics is at 60, that means pretty much every other frame I will not have a physics update → SubTime <= 0.f and FPhysScene::SubstepSimulation will early out. What happens, though, is if I’m using my existing spring code, the vehicle starts hopping instead of hovering. If I move that code into a CustomPhysicsFunction, I get even worse movement as the CustomPhysicsFunctions don’t get properly cleared when SubTime <= 0.f. So then I add the ability to properly clear existing CustomPhysics, Forces, Impulses, and Torque requests but then the vehicle just doesn’t hover anymore. It’s starting to feel like this is a UE bug? I’m just using an existing code path (SubTime <= 0.f) that apparently almost never triggers in a standard substepping. I’ve looked over the substepsimulation code repeatedly and there’s no apparent dependencies for it to run every game frame - though the fact that it bugs out seems to suggest otherwise.



float FPhysSubstepTask::UpdateTime(float UseDelta)
{
	// following changes change substepping behavior to be a fixed time step instead
	const UPhysicsSettings* PhysSetting = UPhysicsSettings::Get();
	const float FixedFrameRate = PhysSetting->MaxSubstepDeltaTime;
	const float FrameRateInv = 1.f / FixedFrameRate;
	float RequiredSteps = (ExtraTimeBank + UseDelta) * FrameRateInv;			// given how much dt (and time from previous frame) we have, how many steps can we take at this phys frame rate
	int32 NumActualSteps = FMath::FloorToInt(RequiredSteps);				// we can only take a whole number of steps
	if (NumActualSteps > PhysSetting->MaxSubsteps)
	{
		NumActualSteps = PhysSetting->MaxSubsteps;
		ExtraTimeBank = 0.f;
	}
	else
	{
		ExtraTimeBank = (RequiredSteps - float(NumActualSteps)) * FixedFrameRate;	// save partial steps
	}

	NumSubsteps = NumActualSteps;
	DeltaSeconds = FixedFrameRate * NumActualSteps;

	SubTime = DeltaSeconds > 0.f ? FixedFrameRate : 0.f;
	return SubTime;
}


You can’t really use physics from Tick if you use fixed timesteps. I have a feeling that the existing substepping code expects there to be at least one physics step, but I haven’t verified it (edit-> I mean now if you run AddForce etc directly from Tick). All in all, it doesn’t make any sense for you to add any forces on Tick if you have access to the physics steps. As an additional note, if you do add forces on the substep directly, make sure you set the boolean flag on AddForce to false, otherwise it won’t work properly.

Why not? I mean, I don’t want them to be calculated in the standard Tick for consistency reasons between client and server, but they should behave basically the same for a standalone simulation, and the queued Forces and Torques would be applied with a substep DT just as they would with normal substepping (and not applied if no fixed tick occurred). So again, I don’t want to calculate my physics needs in standard Tick, but I’m just trying to get on the same page in understanding this. I am using the CustomPhysicsFunction normally, I just am testing the code.

So have you not done fixed stepping? If you have, how can we say this is fixed timesteps if we are required to tick every game frame… I feel like then we’re losing the advantage of fixed timestepping because now if i have a server playing at 60hz and a client playing at anything above 60, then we’re going to always be off, rather than occasionally.

Yeah, again while it doesn’t make sense for networking physics, it should still work okay unless I’m missing something.

I’ll try this out but this would only be an issue if my game framerate is below 60 right? For when I end up double ticking physics?

It wouldn’t work properly as with fixed timestepped physics your physics sim and rendering will be totally async. You might get many physics steps or none, depending on your framerate. If you just use constant AddForce on your scene, it wouldn’t directly break but you rarely do this if you have a custom physics actor. It’s more common that you constantly adjust your forces for each physics step and that’s why the Tick route will usually work extremely badly with fixed timesteps.

We really don’t tick fixed timesteps for every rendered frame unless we need to. That’s what the timebank is there for.
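For anyone following along, the time bank idea boils down to something like this standalone sketch. It is not the engine's `FPhysSubstepTask`; `FixedStepClock` is a made-up name, and the clamp that throws away the backlog mirrors the failsafe in the code posted earlier rather than anything the engine guarantees.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical standalone accumulator for fixed physics steps.
struct FixedStepClock
{
    double FixedDelta;  // length of one physics step in seconds
    int MaxSteps;       // failsafe: most steps allowed in a single frame
    double TimeBank;    // leftover time carried over to the next frame

    FixedStepClock(double InFixedDelta, int InMaxSteps)
        : FixedDelta(InFixedDelta), MaxSteps(InMaxSteps), TimeBank(0.0)
    {
        assert(FixedDelta > 0.0 && MaxSteps > 0);
    }

    // Banks the frame time and returns how many whole fixed steps to run
    // this frame. Zero is a perfectly normal result at high framerates.
    int Advance(double FrameDelta)
    {
        TimeBank += FrameDelta;
        int Steps = static_cast<int>(std::floor(TimeBank / FixedDelta));
        if (Steps > MaxSteps)
        {
            Steps = MaxSteps;
            TimeBank = 0.0;  // drop the backlog instead of spiralling
        }
        else
        {
            TimeBank -= Steps * FixedDelta;
        }
        return Steps;
    }
};
```

At 120 fps with 60 Hz physics, `Advance` alternates between 0 and 1 steps, so every other rendered frame simply has no physics step at all.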

Just to make sure, when you look at AddForce: FBodyInstance::AddForce | Unreal Engine 5.2 Documentation, you can see bAllowSubstepping boolean there. It has to be set to False if you call that AddForce directly from substep/physics step. This might feel odd due to the naming but the flag is there only to be enabled when used from Tick so engine internally calls that addforce on each substep.

Yeah so I’m just confused about what you were saying. Previously you said “substepping code expects there to be a least one physics step” which made it sound to me like you were doing a fixed tick for every rendered frame (which wouldn’t be fixed). My post is asking if anyone experienced any issues where fixed timestep doesn’t really work because it would appear the engine requires the physics simulation to be ticked on every frame no matter what. See below.

Yeah I am using that now. It’s still broken though. The PhysTargetMap occasionally accrues an inordinate amount of CustomPhysics. Like the CustomPhysics delegate will fire over a dozen times in a frame every time the timebank adds an additional substep call. So when it’s working normally it looks like:
(120 fps, 60 fps physics)
Frame 1: RegTick (.5 steps, move to timebank)
Frame 2: RegTick (+.5 steps)
Frame 2: FixedTick (we have a whole step)
Frame 3: RegTick (.5 steps)
Frame 4: RegTick (+.5 steps)
Frame 4: FixedTick (you get the idea)
Frame 5: RegTick
Frame 6: FixedTick (something has changed)
Frame 6: FixedTick (why are we still calling FixedTick this frame)
Frame 6: FixedTick
Frame 6: FixedTick

Frame 7: RegularTick (oh thank god we’re back)
Frame 8: RegularTick
Frame 8: FixedTick
Frame 9: RegularTick

The problem is that we don’t tick perfectly at 120fps and 60fps. So the time bank begins to grow an additional tick (as expected). However, it seems when it does this additional physics tick, that the PhysTargetMap for the PhysSubTask does not get properly cleared and then that frame, as I mentioned, fires the CustomPhysicsFunction delegate many times because it was added many times without ever being cleared. You can see it’s supposed to get cleared in SubstepInterpolation if its the last step (which we wouldn’t have if this function never ticks because SubTime <= 0.f) or if the Alpha accrued from interpolation is > 1.f (which I dont entirely understand).

TL;DR: bAllowSubstepping = false advice was helpful (so thanks!). But I still feel like something isn’t right.

I can see that my comment was bit confusing. I only meant that I don’t know (I haven’t checked the source for this) what happens when you add force with default AddForce and there’s no substeps after it when you don’t set that boolean to false (like we both mentioned already, there might not be fixed timestepped physics step between ticks at all).

It’s possible that there’s some logic fault on the code you are using for fixed timesteps, it looks more complicated than it has to be for that kind of behavior. Something like this should work (do note that this code is without any failsafe for CPU starvation):


float FPhysSubstepTask::UpdateTime(float UseDelta)
{
    float FrameRate = 1.f;
    uint32 MaxSubSteps = 1;

    UPhysicsSettings * PhysSetting = UPhysicsSettings::Get();
    FrameRate = PhysSetting->MaxSubstepDeltaTime;
    MaxSubSteps = PhysSetting->MaxSubsteps;

    SubTimebank += UseDelta;
    NumSubsteps = FMath::FloorToInt(SubTimebank / FrameRate);
    DeltaSeconds = NumSubsteps * FrameRate;
    SubTimebank -= DeltaSeconds;
    SubTime = NumSubsteps ? FrameRate : 0.f;
    return SubTime;
}


Your code is almost logically identical to mine. The only difference is I have the failsafe for CPU starvation. This isn’t a number-of-substeps problem: I’m only computing 0, 1, or (very rarely) 2 substeps. The problem is that the PhysTargetMap isn’t properly managed if SubTime is 0. I added this for now to fix it:



void FPhysSubstepTask::AddCustomPhysics_AssumesLocked(FBodyInstance* Body, const FCalculateCustomPhysics& CalculateCustomPhysics)
{
#if WITH_PHYSX
	//Limit custom physics to non kinematic actors
	if (Body->IsNonKinematic())
	{
		FCustomTarget CustomTarget(CalculateCustomPhysics);

		FPhysTarget & TargetState = PhysTargetBuffers[External].FindOrAdd(Body);
		// my added bit below
		if (!TargetState.CustomPhysics.ContainsByPredicate([&CalculateCustomPhysics](const FCustomTarget& ExistingTarget) { return ExistingTarget.CalculateCustomPhysics == &CalculateCustomPhysics; }))
		{
			TargetState.CustomPhysics.Add(CustomTarget);
		}
	}
#endif
}


But anyway, too many CustomPhysics are being added. Not too many substeps are being calculated.
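To illustrate the symptom, here is a toy model, not the engine's code: a queue of custom physics callbacks that is only emptied when a physics step actually runs. `TargetQueue` and `CallbackId` are hypothetical stand-ins for the target buffer and the delegate pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stands in for a pointer to a custom physics delegate.
using CallbackId = int;

// Hypothetical queue that is only emptied when a physics step runs.
struct TargetQueue
{
    std::vector<CallbackId> Pending;
    bool bDeduplicate;  // true: ignore callbacks that are already queued

    explicit TargetQueue(bool bInDeduplicate) : bDeduplicate(bInDeduplicate) {}

    // Called once per game frame by the component.
    void Add(CallbackId Id)
    {
        assert(Id >= 0);
        const bool bAlreadyQueued =
            std::find(Pending.begin(), Pending.end(), Id) != Pending.end();
        if (bDeduplicate && bAlreadyQueued) { return; }
        Pending.push_back(Id);
    }

    // Called only on frames where a fixed step happens. Returns how many
    // times callbacks would fire, then clears the queue.
    int RunStep()
    {
        const int Calls = static_cast<int>(Pending.size());
        Pending.clear();
        return Calls;
    }
};
```

If three game frames pass without a step, the plain queue fires the same callback three times on the next step, while the deduplicating one fires it once, which is the effect the uniqueness check above is going for.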

Ah now I get it, where do you add your custom physics? You should add it only on Tick. Now that I’ve worked with fixed timesteps I’ve used the 4.15 PhysSceneStep delegates instead.

I’m adding it in TickComponent.



FBodyInstance* BI = UpdatedPrimitive->GetBodyInstance();
if (BI)
{
	UE_LOG(Midair, Warning, TEXT("RegularTick"));
	TickCount = 1;
	BI->AddCustomPhysics(FixedTick); // delegate binds FixedTickComponent
	// FixedTickComponent(DeltaTime, BI); // if we wanted to run this in regular tick
}


We just upgraded to 4.14. I’ll take a look at PhysSceneStep, but that’s 4.15 only? What code should I look at in the 4.15 branch if so? Thanks

There are new delegates for 4.15 that don’t need bodyinstance/custom physics. PhysXVehicles plugin from the engine uses them, so you could take a look at how they do it for that or a look at my test project here: GitHub - 0lento/UE4-FixedTimestepDemo: Example project for Fixed Timestep UE4 fork but do take into account that my testscene/fixed timestep fork is still WIP, haven’t had time to clean it up / fix potential issues from interpolation setup. Also do note that test project doesn’t unregister the delegates atm but you should get some idea how it’s wired up. I’ve also added some extra feats in, like collision event dispatching which can be handy when running physics code on substeps.

Your code looks correct, no idea what’s going on with that. It’s possible that my fork could have similar issues too but I haven’t really noticed anything like that in my tests. Basically when you look at what PhysScene and PhysSubstepTasks do on substeps, if I remember right, with a subtime of 0 it shouldn’t even run the physsubsteppers.

The problem is that it doesn’t run because subtime is 0, and so Targets doesn’t get cleared. I’m going to do some more testing though. Now that I fixed the bAllowSubstepping issue this is the only major bug left. This code might be different in 4.15 to support these delegates, or yeah maybe you have this issue and haven’t noticed it. It was only bad at high framerates with low physics tick rates. But I’m doing this for networking purposes so it needs to be consistent in all environments.

EDIT: And yeah I just tested removing the AddUnique requirement for Targets i mentioned earlier and its all wonky at 120 fps again.