Why is it not beneficial to use spatial and temporal sampling together?

If I use deferred rendering, why is it not beneficial to use temporal and spatial sampling together? Almost every YouTube video on the topic recommends combining the two.