New NVidia Cards

From watching which files are accessed, it looks like up to 69% RC feeds your images into the GPU and stores the results in small chunks in the cache. From 70% on, it feeds those cache files back into memory to be aggregated into larger chunks.

Copy is data going in and out of the GPU, and it is still being used. I thought about this part a lot, as I don’t see any particular bottleneck (HDD, SSD, GPU, CPU, etc.), and I just assume it is because this part of the calculation must be done on both the GPU and the CPU. It might also be that this computation is not very friendly to parallel processing, and one component must wait for the results from the other before it can continue. Don’t take this too literally, as only the RC devs could know for sure.

Task Manager is a good way to see what the bottleneck in your system is at a particular point in the process. Is the CPU/GPU pegged close to 100%, or is the HDD with the cache on it at 100% with low CPU or GPU utilization?
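
If you want to write down what you see in Task Manager at different points in the process, the same check can be sketched in a few lines of Python. This is only a rough illustration of the reasoning above, not anything RC does internally: the function name `classify_bottleneck` and the 90% / 50% thresholds are my own assumptions, and you type in the percentages yourself from what Task Manager shows.

```python
def classify_bottleneck(cpu, gpu, disk, busy=90.0, low=50.0):
    """Guess the limiting component from utilization percentages (0-100).

    The thresholds are assumptions for illustration: 'busy' means pegged
    close to 100%, 'low' means mostly idle.
    """
    # CPU and/or GPU pegged close to 100%: compute is the limit.
    pegged = [name for name, pct in (("CPU", cpu), ("GPU", gpu)) if pct >= busy]
    if pegged:
        return " + ".join(pegged)
    # Cache drive at 100% while CPU and GPU sit mostly idle: disk is the limit.
    if disk >= busy and cpu < low and gpu < low:
        return "disk"
    # Nothing pegged, e.g. components waiting on each other.
    return "no obvious bottleneck"


# Hypothetical readings noted down by hand at different progress points.
readings = [
    ("early", 98, 40, 10),
    ("middle", 35, 20, 100),
    ("late", 40, 30, 20),
]
for stage, cpu, gpu, disk in readings:
    print(stage, "->", classify_bottleneck(cpu, gpu, disk))
```

The "no obvious bottleneck" result is the case I described above, where nothing is pegged and one component is probably just waiting on another.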