How should I cut this environment up for streaming without the pop-in being obvious?

Apologies for the silly question, but I’m working on a super-crude greybox mockup of an urban environment so I can figure out the exact streaming and rendering pipeline I’m going to use. Here’s an overhead shot, which as you can see consists almost entirely of boring square buildings laid out in a grid:

The reason I’ve made it so boring is that the long, straight streets illustrate my central challenge: the player is going to instantly see when a faraway building disappears, even if it’s nowhere near their location. I can partially mitigate this by cleverly arranging streets and props to minimize long sightlines, but it’s prohibitively difficult to make a convincing-looking city if no street can run for more than a block without turning sharply.

So how should I be approaching this? It’s not really a good use case for world composition, but with simple level streaming it seems close to impossible to hide the fact that nothing more than a half block away stays in memory for long.

You really can’t have your cake and eat it too. All you can do is make heavy use of LODs and push the visible (pop-in) distance out much farther.
Create multiple LOD versions of each building, display the street as far as you can, then live with the distant popping.
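
To make the idea concrete, here is a rough, engine-agnostic sketch of distance-based LOD selection. It is not Unreal's actual API; the level names, the distance thresholds, and the `select_lod` helper are all made up for illustration, and real numbers would have to come from profiling your own scene.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LodLevel:
    name: str
    max_distance: float  # metres; this LOD is used up to and including this distance


# Hypothetical thresholds, ordered from most to least detailed.
LODS = [
    LodLevel("LOD0_full_detail", 50.0),
    LodLevel("LOD1_reduced", 150.0),
    LodLevel("LOD2_box_proxy", 600.0),
]


def select_lod(distance: float, lods: Sequence[LodLevel] = LODS) -> Optional[str]:
    """Return the first LOD whose range covers `distance`.

    Returns None when the building is past the last threshold, which is
    the point where it is culled and the pop-in becomes visible.
    """
    for lod in lods:
        if distance <= lod.max_distance:
            return lod.name
    return None


if __name__ == "__main__":
    for d in (10.0, 100.0, 400.0, 700.0):
        print(d, select_lod(d))
```

The point of the last, cheapest level is that a plain box proxy costs so little that its `max_distance` can be pushed far down the street, so the pop happens where it is much harder to notice.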

That makes sense… is there anything approximating a golden rule to decide how large a given chunk should be? It’s a bit chicken and egg, where it’s tough to create a definitive poly budget without final assets, but the fidelity of those assets really depends on how much we’re able to cram into a single level while remaining performant.