Hi Tim,
Sorry for the late response, and thank you for sharing your findings. In the latest version, the number of tasks created in UpdateSurfaceCachePrimitives() is limited to the number of task workers, which should significantly cut down the number of tasks and therefore reduce the number of reallocations needed during the loop. That said, preallocating the needed space should improve things further. The partial sort also seems like a no-brainer. Thanks again for sharing your optimizations.
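
For anyone following along, here is a rough sketch of the three ideas in generic C++. This is not the engine code: PrimitiveEntry, Range, MakeTaskRanges and KeepTopByPriority are hypothetical names made up for illustration, std::vector stands in for the engine's containers, and it assumes the partial sort is used to pick the highest-priority entries.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a surface cache primitive; not an engine type.
struct PrimitiveEntry {
    int id;
    float priority;
};

// Half-open index range [begin, end) handled by one task.
struct Range {
    std::size_t begin;
    std::size_t end;
};

// Split itemCount items into at most workerCount contiguous ranges,
// so the number of tasks is bounded by the number of workers.
std::vector<Range> MakeTaskRanges(std::size_t itemCount, std::size_t workerCount) {
    std::vector<Range> ranges;
    if (itemCount == 0 || workerCount == 0) {
        return ranges;
    }
    const std::size_t taskCount = std::min(itemCount, workerCount);
    ranges.reserve(taskCount);  // preallocate once, no reallocation in the loop
    const std::size_t base = itemCount / taskCount;
    const std::size_t extra = itemCount % taskCount;
    std::size_t begin = 0;
    for (std::size_t i = 0; i < taskCount; ++i) {
        // The first `extra` ranges take one additional item each.
        const std::size_t size = base + (i < extra ? 1 : 0);
        ranges.push_back({begin, begin + size});
        begin += size;
    }
    return ranges;
}

// Keep only the topCount highest-priority entries, in descending order.
// std::partial_sort orders just the first topCount elements instead of
// sorting the whole array.
void KeepTopByPriority(std::vector<PrimitiveEntry>& entries, std::size_t topCount) {
    topCount = std::min(topCount, entries.size());
    const auto middle = entries.begin() + static_cast<std::ptrdiff_t>(topCount);
    std::partial_sort(entries.begin(), middle, entries.end(),
                      [](const PrimitiveEntry& a, const PrimitiveEntry& b) {
                          return a.priority > b.priority;
                      });
    entries.resize(topCount);
}
```

With 10 items and 4 workers, MakeTaskRanges produces 4 ranges rather than one task per item, and the reserve call means the loop never reallocates.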
-Jian