A peculiarity when writing to a structured buffer

I was experimenting with ray tracing and wrote two shaders. The first is a ray generation shader (RGS) that does the tracing and writes the payload into a structured buffer. The second is a compute shader (CS) that takes this structured buffer as input, reads the material properties from the payload, and produces a texture similar to an unlit view of the scene. Very basic. Measuring with stat GPU, I noticed that the RGS takes about 2 ms per frame and the CS about 0.3 ms.

Out of curiosity, I tried to find the slowest part of my ray tracing code by commenting out different parts of the RGS. I ended up commenting out all of the tracing, so the RGS just writes an empty payload into the buffer with no computation at all, and it still takes about 2 ms. If I understand correctly, the data is created on the GPU, lives there, and dies there, so this should not be a CPU-GPU transfer. What is it then? Does buffer allocation on the GPU take 2 ms? Is there any way I can reduce this time?
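
For reference, the stripped-down RGS described above is roughly equivalent to the sketch below. This is plain DXR-style HLSL and not the actual project code: the payload layout (`FHitPayload`), the buffer name (`OutPayloads`), the register slot, and the entry point name (`MainRGS`) are all placeholders chosen for illustration.

```hlsl
// Hypothetical payload layout; the real struct may hold different fields.
struct FHitPayload
{
    float3 BaseColor;
    float  HitT;
};

// Hypothetical output buffer: one payload per pixel, read later by the CS.
RWStructuredBuffer<FHitPayload> OutPayloads : register(u0);

[shader("raygeneration")]
void MainRGS()
{
    // One ray generation thread per pixel of the dispatch.
    uint2 PixelCoord = DispatchRaysIndex().xy;
    uint2 Dimensions = DispatchRaysDimensions().xy;

    // Flatten the 2D pixel coordinate into a linear buffer index.
    uint LinearIndex = PixelCoord.y * Dimensions.x + PixelCoord.x;

    // All TraceRay calls are commented out; only a zeroed payload is written.
    FHitPayload Payload = (FHitPayload)0;
    OutPayloads[LinearIndex] = Payload;
}
```

Even in this form, the dispatch still launches one thread per pixel and writes one payload per pixel, so the buffer write itself is the only work left in the pass.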