Problem: I am using NVIDIA Nsight to profile different setups in our game. When I run an optimized build, I lose visibility into the call stack, which is what shows specifically where compute time is being spent. If I profile an unoptimized build, I get this information, but the timings are very different (slower, of course), and I’m not sure whether they scale proportionally from Dev to Test.
Question: Is there a way to make a Test or Shipping build that keeps enough information for Nsight to present the hierarchy of calls and their times, while still performing better than the traditional Dev builds? This might not be possible because optimization can alter control flow, but either way, I am looking for best practices for using NVIDIA Nsight. Any … pardon the pun … insights or recommended documentation?
Thank you!