Help needed: MetalCommandBufferFailurePageFault crash

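Steps to Reproduce
`The preview build that our project team distributes externally occasionally crashes. The information logged at the time of the crash is as follows: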

[2025.06.10-05.07.30:097][975]LogMetal: Warning: |MetalCommandList.cpp:57|<AGXA13FamilyCommandBuffer: 0x13e13e700>
label = <none>
device = <AGXA13Device: 0x1143aa180>
name = Apple A13 GPU
commandQueue = <AGXA13FamilyCommandQueue: 0x114398180>
label = <none>
device = <AGXA13Device: 0x1143aa180>
name = Apple A13 GPU
retainedReferences = 0
[2025.06.10-05.07.30:097][975]LogMetal: Warning: |MetalCommandList.cpp:173|Command Buffer Unknown Failed with PageFault Error! Error Domain: MTLCommandBufferErrorDomain Code: 3 Description Caused GPU Address Fault Error (0000000b:kIOGPUCommandBufferCallbackErrorPageFault) Unknown Unknown

The crash call stack is as follows:

1   UAGame                   0x000000010a2d1494  FIOSPlatformMisc::GPUAssert() (Runtime/Core/Private/IOS/IOSPlatformMisc.cpp:1697)
2   UAGame                   0x000000010a11d514  ReportMetalCommandBufferFailure(mtlpp::CommandBuffer const&, char16_t const*, boolean) (Runtime/Apple/MetalRHI/Private/MetalCommandList.cpp:174)
3   UAGame                   0x000000010a13cef8  MetalCommandBufferFailurePageFault(mtlpp::CommandBuffer const&) (Runtime/Apple/MetalRHI/Private/MetalCommandList.cpp:193)
4   UAGame                   0x000000010a11c190  HandleMetalCommandBufferError(mtlpp::CommandBuffer const&) (Runtime/Apple/MetalRHI/Private/MetalCommandList.cpp:229)
5   UAGame                   0x000000010a11c018  FMetalCommandList::HandleMetalCommandBufferFailure(mtlpp::CommandBuffer const&) (Runtime/Apple/MetalRHI/Private/MetalCommandList.cpp:285)
6   UAGame                   0x000000010a11d5e4  __ZN17FMetalCommandList6CommitERN5mtlpp13CommandBufferE6TArrayIN2ns6ObjectIU13block_pointerFvRKS1_ELNS4_17CallingConventionE1EEE22TSizedDefaultAllocatorILi32EEEbb5FName_block_invoke (Runtime/Apple/MetalRHI/Private/MetalCommandList.cpp:323)
7   UAGame                   0x0000000104c01f74  __ZN5mtlpp13CommandBuffer19AddCompletedHandlerEU13block_pointerFvRKS0_E_block_invoke (/Volumes/DataHD/Perforce/UE4/Release-4.25/Engine/Source/ThirdParty/mtlpp/mtlpp-master-7efad47/src/command_buffer.mm:280)
8   Metal                    0x0000000190ce5990  MTLDispatchListApply + 52
9   Metal                    0x0000000190ce5874  -[_MTLCommandBuffer didCompleteWithStartTime:endTime:error:] + 600
10  IOGPU                    0x000000022ada1afc  -[IOGPUMetalCommandBuffer didCompleteWithStartTime:endTime:error:] + 212
11  Metal                    0x0000000190cb0cd4  -[_MTLCommandQueue commandBufferDidComplete:startTime:completionTime:error:] + 104
12  IOGPU                    0x000000022ada1900  ___62-[IOGPUMetalCommandBuffer fillCommandBufferArgs:commandQueue:]_block_invoke_2 + 168
13  IOGPU                    0x000000022ada1650  IOGPUNotificationQueueDispatchAvailableCompletionNotifications + 136
14  IOGPU                    0x000000022ada1560  ___IOGPUNotificationQueueSetDispatchQueue_block_invoke + 60
15  libdispatch.dylib        0x0000000198efb064  __dispatch_client_callout4 + 16
16  libdispatch.dylib        0x0000000198f17420  __dispatch_mach_msg_invoke + 460
17  libdispatch.dylib        0x0000000198f02428  __dispatch_lane_serial_drain + 348
18  libdispatch.dylib        0x0000000198f18174  __dispatch_mach_invoke + 452
19  libdispatch.dylib        0x0000000198f02428  __dispatch_lane_serial_drain + 348
20  libdispatch.dylib        0x0000000198f03154  __dispatch_lane_invoke + 428
21  libdispatch.dylib        0x0000000198f02428  __dispatch_lane_serial_drain + 348
22  libdispatch.dylib        0x0000000198f03120  __dispatch_lane_invoke + 376
23  libdispatch.dylib        0x0000000198f0e388  __dispatch_root_queue_drain_deferred_wlh + 284
24  libdispatch.dylib        0x0000000198f0dbd4  __dispatch_workloop_worker_thread + 536
25  libsystem_pthread.dylib  0x000000021be8067c  _pthread_wqthread + 288

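Does anyone have experience resolving this, know of a way to pinpoint the cause quickly, or know of an existing conclusion about it?
Our internal reproduction rate is low: the same spot sometimes triggers the crash easily and sometimes not at all, which is rather strange.`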

Hi,

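Judging from the crash symptoms, it looks like some resource was used after being freed. I'm not sure of the exact cause, but you could try enabling GRHINeedsExtraDeletionLatency and see whether that helps.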
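
For context on why extra deletion latency might help with a use-after-free on the GPU side, here is a minimal, self-contained sketch of the general idea: a resource released by the CPU is not freed immediately, but held for a few more frames so that command buffers still in flight can finish with it. This is only an illustrative model and not engine code; `DeferredDeletionQueue`, `Enqueue`, `EndFrame`, and the latency parameter are all hypothetical names, and the engine's real deferred-deletion logic behind GRHINeedsExtraDeletionLatency is assumed to differ in its details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical model of a deferred-deletion queue (not Unreal Engine code).
// Resources are freed only after `extraLatencyFrames` additional frames have
// ended, giving in-flight GPU work time to finish using them.
class DeferredDeletionQueue {
public:
    explicit DeferredDeletionQueue(std::uint32_t extraLatencyFrames)
        : extraLatency_(extraLatencyFrames) {}

    // Called when the CPU drops its last reference to a resource.
    // `release` performs the actual free and runs later, from EndFrame().
    void Enqueue(std::function<void()> release) {
        assert(release && "release callback must be callable");
        pending_.push_back(Entry{currentFrame_, std::move(release)});
    }

    // Called once at the end of every frame. Frees every entry that has
    // waited longer than the configured latency; returns how many were freed.
    std::size_t EndFrame() {
        ++currentFrame_;
        std::size_t freed = 0;
        while (!pending_.empty() &&
               currentFrame_ - pending_.front().frameQueued > extraLatency_) {
            pending_.front().release();
            pending_.pop_front();
            ++freed;
        }
        return freed;
    }

    std::size_t PendingCount() const { return pending_.size(); }

private:
    struct Entry {
        std::uint64_t frameQueued;      // frame index when deletion was requested
        std::function<void()> release;  // deferred free
    };

    std::uint32_t extraLatency_;
    std::uint64_t currentFrame_ = 0;
    std::deque<Entry> pending_;
};
```

With a latency of 0 in this model, a resource is freed at the end of the frame that released it; with a latency of 2, it survives two more frame ends before being freed. If enabling the real flag makes the page fault disappear, that would suggest a resource lifetime problem, but it would not by itself identify which resource is involved.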