It is just as easy, and just as demanding, as unbiased raytracing and GI in the graphics world: real time is not possible on current hardware. To make it into games you would have to optimize heavily (read: extreme simplification of real-world behaviour), for example by computing only simple first-order reflections for each sound source and combining them with a simple generic reverb algorithm based on either delays or a short convolution.
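
To give a rough idea of what that kind of simplification could look like, here is a minimal sketch in Python. It is not taken from any particular engine. It assumes an empty axis-aligned box room with one corner at the origin, uses the image-source trick to get the six first-order wall reflections, and uses a single feedback comb filter as a stand-in for the "generic reverb based on delays". The function names, the `wall_gain` value, and the 343 m/s speed of sound are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def first_order_reflections(source, listener, room, wall_gain=0.7):
    """Return (delay_seconds, gain) for the direct path plus the six
    first-order reflections in a box room spanning (0,0,0)..room.
    wall_gain is a made-up amplitude factor applied once per bounce."""
    images = [tuple(source)]  # index 0 is the real source (direct path)
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = list(source)
            img[axis] = 2.0 * wall - source[axis]  # mirror across the wall
            images.append(tuple(img))
    paths = []
    for i, img in enumerate(images):
        dist = math.dist(img, listener)
        gain = 1.0 / max(dist, 1e-3)  # simple 1/r distance attenuation
        if i > 0:
            gain *= wall_gain
        paths.append((dist / SPEED_OF_SOUND, gain))
    return paths


def render(dry, paths, sample_rate=44100):
    """Sum delayed, scaled copies of the dry signal, one copy per path."""
    offsets = [int(round(delay * sample_rate)) for delay, _ in paths]
    out = [0.0] * (len(dry) + max(offsets))
    for offset, (_, gain) in zip(offsets, paths):
        for n, x in enumerate(dry):
            out[offset + n] += gain * x
    return out


def comb_reverb(signal, delay_samples, feedback=0.5, mix=0.3):
    """Single feedback comb filter: a very crude delay-based reverb tail."""
    buf = [0.0] * delay_samples  # circular delay line
    idx = 0
    out = []
    for x in signal:
        delayed = buf[idx]
        buf[idx] = x + feedback * delayed  # feed the echo back into the line
        idx = (idx + 1) % delay_samples
        out.append((1.0 - mix) * x + mix * delayed)
    return out


# Example: one source, one listener, a 4 x 3 x 2.5 m room, a single click.
paths = first_order_reflections((1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (4.0, 3.0, 2.5))
early = render([1.0], paths, sample_rate=8000)
wet = comb_reverb(early, delay_samples=400)
```

Everything physically interesting (higher-order reflections, diffraction, occlusion, frequency-dependent absorption) is simply dropped here, which is exactly the kind of trade-off you would be making. A real implementation would probably use several comb and all-pass filters, or a short convolution with a measured impulse response, instead of the single comb above.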