We are experiencing a crash triggered by the check in the instance scene data buffers, located around here: `Engine/Source/Runtime/Engine/Public/InstanceDataSceneProxy.h:246`
FWriteView BeginWriteAccess(FAccessTag AccessTag)
{
check(AccessTag.Kind == FAccessTag::EKind::Writer && AccessTag.WriterTag != 0u);
After investigating the core dump, we found that a valid `this` pointer was hashed with a result of 0.
[Image Removed]
The pointer's value was `0x000001d000000000`, which, when fed into the pointer hash function
[[nodiscard]] inline uint32 PointerHash(const void* Key)
{
// Ignoring the lower 4 bits since they are likely zero anyway.
// Higher bits are more significant in 64 bit builds.
const UPTRINT PtrInt = reinterpret_cast<UPTRINT>(Key) >> 4;
return UE::Private::MurmurFinalize32((uint32)PtrInt);
}
results in 0: the `>> 4` leaves the low 32 bits all zero, and the truncating cast discards every bit of the pointer that was actually set.
[Attachment Removed]
After some internal review, we’ve decided to rewrite the pointer hash to this:
[[nodiscard]] inline uint32 PointerHash(const void* Key)
{
const UPTRINT PtrInt = reinterpret_cast<UPTRINT>(Key);
const UPTRINT Low = PtrInt;
const UPTRINT High = (PtrInt >> 32);
const UPTRINT Combined = Low ^ High;
const UPTRINT NullGuard = Combined | (Key != nullptr);
return UE::Private::MurmurFinalize32(static_cast<uint32>(NullGuard));
}
We think this effectively incorporates the full pointer, and the NullGuard computation should guarantee a non-zero hash for any non-null pointer. It does add a few extra instructions to the compiled function (on GCC at -O3, 13 instructions became 18).
Does that seem like a reasonable change and should we expect any issues from this?
[Attachment Removed]
Hi Brian, this was fixed in CL 49601174.
File: UE5\Main\Engine\Source\Runtime\Core\Public\Templates\TypeHash.h
namespace UE
{
namespace Private
{
[[nodiscard]] inline uint64 MurmurFinalize64(uint64 Hash)
{
Hash ^= Hash >> 33;
Hash *= 0xff51afd7ed558ccdull;
Hash ^= Hash >> 33;
Hash *= 0xc4ceb9fe1a85ec53ull;
Hash ^= Hash >> 33;
return Hash;
}
[[nodiscard]] inline uint32 GetTypeHash64(uint64 Value)
{
return (uint32)Value + ((uint32)(Value >> 32) * 23);
}
}
}
[[nodiscard]] inline uint32 PointerHash(const void* Key)
{
const UPTRINT PtrInt = reinterpret_cast<UPTRINT>(Key);
return UE::Private::GetTypeHash64(UE::Private::MurmurFinalize64(PtrInt));
}
template <
typename ScalarType,
std::enable_if_t<std::is_scalar_v<ScalarType> && !std::is_same_v<ScalarType, TCHAR*> && !std::is_same_v<ScalarType, const TCHAR*>>* = nullptr
>
[[nodiscard]] inline uint32 GetTypeHash(ScalarType Value)
{
if constexpr (std::is_integral_v<ScalarType>)
{
if constexpr (sizeof(ScalarType) <= 4)
{
return Value;
}
else if constexpr (sizeof(ScalarType) == 8)
{
return UE::Private::GetTypeHash64(Value); // <-- the relevant change
}
else if constexpr (sizeof(ScalarType) == 16)
{
...
}
Richard
[Attachment Removed]
Hi Brian,
You’re absolutely right.
Here’s what I’m about to submit:
[[nodiscard]] inline uint32 PointerHash(const void* Key)
{
const UPTRINT PtrInt = reinterpret_cast<UPTRINT>(Key);
uint32 Hash = UE::Private::GetTypeHash64(UE::Private::MurmurFinalize64(PtrInt));
if (Hash == 0 && PtrInt != 0)
{
Hash = ~0u;
}
return Hash;
}
This guarantees `PointerHash()` returns non-zero for non-null pointers (algorithm otherwise unchanged; nullptr still hashes to 0).
Richard
[Attachment Removed]
Another solution is to simply change the ctor of FAccessTag to force ~0u when it receives a writer tag of 0:
FAccessTag(uint32 InWriterTag) : WriterTag(InWriterTag ? InWriterTag : ~0u), Kind(EKind::Writer) { }
I’ll come back here with the official fix.
[Attachment Removed]
I went ahead with the last proposed fix (submitted in our main branch: CL 51967511).
[Attachment Removed]
I think we are going to try using this version instead, as it gives better-quality hashes and is only marginally slower than always OR'ing the null test in.
[[nodiscard]] inline uint32 PointerHash(const void* Key)
{
const UPTRINT PtrInt = reinterpret_cast<UPTRINT>(Key);
const uint32 Low = static_cast<uint32>(PtrInt);
const uint32 High = static_cast<uint32>(PtrInt >> 32);
const uint32 Combined = Low ^ High;
const uint32 NullGuard = Combined + (Low * (Combined == 0));
return UE::Private::MurmurFinalize32(NullGuard);
}
Based on some testing with LLVM’s MCA, we found the following results:
                    Orig   OrNotNull   AddIfNeeded
Iterations:          100         100           100
Instructions:       1300        1800          1800
Total Cycles:        454         583           619
Total uOps:         1300        1800          1900
Dispatch Width:        4           4             4
uOps Per Cycle:     2.86        3.09          3.07
IPC:                2.86        3.09          2.91
Block RThroughput:   3.3         4.5           4.8
[Attachment Removed]
If I’m reading the CL correctly, it looks like some valid pointers will still hash to zero with the official change. I brute-force scanned and found `0x16095E240ULL`, for example, which seems like a plausible memory address.
#include <iostream>
typedef unsigned long long uint64;
typedef unsigned int uint32;
[[nodiscard]] inline uint64 MurmurFinalize64(uint64 Hash)
{
Hash ^= Hash >> 33;
Hash *= 0xff51afd7ed558ccdull;
Hash ^= Hash >> 33;
Hash *= 0xc4ceb9fe1a85ec53ull;
Hash ^= Hash >> 33;
return Hash;
}
[[nodiscard]] inline uint32 GetTypeHash64(uint64 Value)
{
return (uint32)Value + ((uint32)(Value >> 32) * 23);
}
int main(){
std::cout << GetTypeHash64(MurmurFinalize64(0x16095E240ULL)) << std::endl;
return 0;
}
[Attachment Removed]
Just as a warning: that adds a potential branch to a pretty hot code path. I tested on GCC, and it wasn’t able to optimize the branch out at -O3. I’d suggest something like my original hash change:
const uint32 LowerBits = static_cast<uint32>(Value);
const uint32 UpperBits = static_cast<uint32>(Value >> 32);
const uint32 FoldedBits = LowerBits ^ UpperBits;
// If folding collapses to zero, fall back to the lower bits.
return FoldedBits + (LowerBits * (FoldedBits == 0));
[Attachment Removed]