So I gave that profilegpu command a go, and unfortunately it does not reveal anything related to occlusion culling at all. The extra frame time somehow gets distributed fairly evenly across all stages, although the shadow map stage does take a bigger hit than the others.
I also cannot use something like a Cull Distance Volume; I guess the ISM actor would have a very large bounding box due to the scattered nature of its instances.