Processing time depends on the project and on the output you want, not only on the picture count.
What I would do:
-Generate clusters of ~1,000 pictures with 20% to 25% overlap between neighbouring clusters. You can cut your terrain into squares. That gives around 800 clusters to process for each group of 640,000 pictures (approx.).
-Process them individually (first manually to test, then with the CLI) and export the registration for each of them. It is hard to know exactly how long one cluster will take (it depends on many factors related to the number of features the software can extract from your pictures), but I would say less than half an hour each on a high-end machine. At 800 clusters × 30 minutes, you may need around 400 hours (16+ days) to align all your pictures. It could be more or less, it really depends on your data. 5 MP is not much, so I may be overestimating the time, and 10 minutes or so per cluster may be enough. Just run a test.
-Then import all your registrations (the 800) into one project and align again to (try to) get a single alignment. This will take some time, maybe days. Once again, it is hard to tell without some testing.
-Then reconstruct your model. Depending on the level of precision you need, you have to choose between a fast reconstruction and an accurate one (with the downscale for depth map setting).
-Then simplify, clean and texture the model.
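
If you want to redo the rough arithmetic above with your own numbers, here is a small Python sketch. It is only a back-of-the-envelope calculator, not something the software provides: the function name and parameters are made up for illustration, and it assumes that the overlap simply adds 20% to 25% more pictures to the total, which is what gives about 800 clusters for 640,000 pictures. It only covers the per-cluster alignment step, not the final merge or the reconstruction.

```python
import math


def estimate_alignment(total_pictures, cluster_size=1000, overlap=0.25,
                       minutes_per_cluster=30):
    """Rough estimate of cluster count and per-cluster alignment time.

    Assumption: overlapping pictures are counted once per cluster they
    belong to, so the workload is total_pictures * (1 + overlap).
    """
    picture_slots = total_pictures * (1 + overlap)
    clusters = math.ceil(picture_slots / cluster_size)
    hours = clusters * minutes_per_cluster / 60
    days = hours / 24
    return clusters, hours, days


# Compare the optimistic (10 min) and pessimistic (30 min) guesses.
for minutes in (10, 30):
    clusters, hours, days = estimate_alignment(
        640_000, minutes_per_cluster=minutes)
    print(f"{minutes} min/cluster: {clusters} clusters, "
          f"{hours:.0f} h, {days:.1f} days")
```

Once you have timed one real test cluster, put that time in as minutes_per_cluster and the total will be far more reliable than my guess.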
Hope this helps.