Hello again Teed,
I’m really interested in this (I used to do a lot of sound design and engineering). Having a good tempo system would of course be ideal, but there are other workarounds.
Let’s say your music is made up of various phrases and passages. You could cut them up, loop each one a certain number of times, then move on to the next if required. If you needed to change the music to fit the context (e.g. some action started or the mood changed), you could wait until the end of the loop currently playing and change the music then. Of course, depending on your loop lengths, there will be some delay between the situation changing and the music changing: in the worst case, almost one full loop length.
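To make the loop idea a bit more concrete, here is a rough sketch in Python (not tied to any particular engine). The class name `LoopSequencer`, its methods, and the loop names and lengths are all made up for illustration; a real version would call your engine's audio playback instead of just tracking a position.

```python
class LoopSequencer:
    """Tracks one looping piece at a time; a requested change only
    takes effect when the current loop reaches its end."""

    def __init__(self, loop_lengths, start):
        # loop_lengths: hypothetical dict of loop name -> length in seconds
        self.loop_lengths = loop_lengths
        self.current = start
        self.pending = None      # loop requested by the game, if any
        self.position = 0.0      # seconds into the current loop

    def request(self, name):
        """Ask for a different loop; it starts at the next loop boundary."""
        if name not in self.loop_lengths:
            raise KeyError(name)
        self.pending = name

    def switch_delay(self):
        """Seconds until the next loop boundary (the lag before a change)."""
        return self.loop_lengths[self.current] - self.position

    def update(self, dt):
        """Advance by dt seconds and return the loop that should be playing."""
        self.position += dt
        length = self.loop_lengths[self.current]
        while self.position >= length:
            # Loop boundary reached: wrap around and apply any pending change.
            self.position -= length
            if self.pending is not None:
                self.current = self.pending
                self.pending = None
            length = self.loop_lengths[self.current]
        return self.current


seq = LoopSequencer({"calm": 8.0, "action": 4.0}, start="calm")
seq.update(3.0)          # 3 seconds into the calm loop
seq.request("action")    # the situation changes here
print(seq.switch_delay())  # time the player waits before the music follows
```

Shorter loops reduce that wait, at the cost of more cutting up of the music.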
My first idea about crossfading (which I think you understood) was that if all the pieces are at the same tempo, they will stay in sync automatically as long as you start them all at the same time. You keep every piece playing but only one audible, and crossfade over about 1 ms to make the music jump to the next piece depending on the situation. This would work if you only needed, say, 4 pieces within a level/environment.
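Roughly, the crossfade version could look like the sketch below, again in plain Python. `SyncedMixer`, the piece names, and the linear fade are my own assumptions; the gains it produces would be fed to whatever volume control your engine gives you for each track.

```python
class SyncedMixer:
    """All pieces are assumed to have been started together, so they stay
    in sync; only the per-piece volume (gain) changes."""

    def __init__(self, names, active, fade_time=0.001):
        # The active piece starts at full volume, the others silent.
        self.gains = {n: (1.0 if n == active else 0.0) for n in names}
        self.active = active
        self.fade_time = fade_time   # seconds; 0.001 is the 1 ms "jump"

    def select(self, name):
        """Choose which piece should be audible from now on."""
        if name not in self.gains:
            raise KeyError(name)
        self.active = name

    def update(self, dt):
        """Move each gain linearly toward its target over fade_time."""
        step = dt / self.fade_time if self.fade_time > 0 else 1.0
        for name, gain in self.gains.items():
            target = 1.0 if name == self.active else 0.0
            if gain < target:
                gain = min(target, gain + step)
            else:
                gain = max(target, gain - step)
            self.gains[name] = gain
        return self.gains


mixer = SyncedMixer(["calm", "tense", "action", "boss"], active="calm")
mixer.select("action")
print(mixer.update(0.016))   # one 16 ms frame later the fade has finished
```

With a 1 ms fade the change is effectively instant at normal frame rates; a longer `fade_time` gives a softer transition.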
If you are trying to make a music-based game, these aren’t really good options. Anyway, just thought I’d share some ideas.
By the way, how accurate is ‘Real world time’?