Interesting question. There is already a node that uses distance to crossfade, so making a new one that delays by distance may not be too difficult.
I honestly don't have a solid understanding of the sound system's inner workings, but I would look into how and when sound nodes are run, to verify that the node gets accurate location data in both single-player and multiplayer.
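
For what it's worth, the delay calculation itself is the easy part. Here is a rough sketch in plain C++ of what such a node would need to compute. None of this is engine API: `Vec3`, `Distance`, and `PropagationDelaySeconds` are made-up names for illustration, and I am assuming distances in centimeters and a speed of sound of about 343 m/s. How the delay would actually be applied inside a sound node is the part I can't speak to.

```cpp
#include <cmath>

// Assumption: world units are centimeters and sound travels at roughly
// 343 m/s in air, i.e. 34300 cm/s. Adjust for your own unit scale.
constexpr float kSpeedOfSoundCmPerSec = 34300.0f;

// Hypothetical stand-in for whatever vector type the engine provides.
struct Vec3 {
    float x, y, z;
};

// Straight-line distance between two points, in the same units as the inputs.
float Distance(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Seconds to wait before the listener should hear a sound emitted at `source`.
// The result is clamped to `maxDelaySeconds` so very distant sounds are not
// held back indefinitely.
float PropagationDelaySeconds(const Vec3& source, const Vec3& listener,
                              float maxDelaySeconds) {
    const float delay = Distance(source, listener) / kSpeedOfSoundCmPerSec;
    return delay < maxDelaySeconds ? delay : maxDelaySeconds;
}
```

With those assumptions, a source 34300 units away gives a delay of about one second. The result is only as good as the listener position fed into it, which is why the timing of when nodes run matters.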