Hardware Optimisation & Benchmarking Shenanigans

Hi
As we all know, taking as many good pictures as possible is the best way to ensure things go well, and it ultimately saves a lot of time back in the office.
However, time is always against us. As fast as RealityCapture is, waiting is inevitably part of the game. And it can be a long time - days or more - only to find out your photo shoot was not good enough.

The hardware requirements are listed here: https://support.capturingreality.com/hc/en-us/articles/115001524071-OS-and-hardware-requirements, but they are very vague. From experience with rendering, video encoding and so on, a badly configured system which looks great on paper can run at half the speed of a similarly priced system with carefully selected components, optimised appropriately. Throwing more money at the problem is not always the answer, and at times it can slow things down.
The various stages of the calculation stress different parts of the system, but to what extent I am struggling to figure out. How can I/we optimise a system to perform best with the software?

I recently got rid of my dual-Xeon v3 28-core workstation, which was awesome for rendering but painfully slow in RealityCapture. A much higher clocked, newer-architecture consumer Skylake system is not hugely different in RealityCapture (yes, a little slower), yet 4x+ slower for rendering (Cinebench), while costing 5x+ less and having 4 cores instead of 28.

Below are the areas which I know can make a difference. Unfortunately, as with many things, we can’t have our cake and eat it. Cost has a big influence, and technological restrictions mean you can have 32 GB of very fast RAM or 128 GB+ of slow RAM; you can have a 5.2 GHz 6-core CPU or a 3.6 GHz 16-core CPU.

  • CPU speed, MHz (and IPC) - more is always better.

  • Core count (threads) - more is better, to an extent, and not at the cost of IPC. From my experience, a dual-CPU system worked awesomely with some applications, but the system architecture did not agree with others, such as RealityCapture, and it underperformed. I have a feeling that, much like with GPUs, you get increasingly diminishing returns in RealityCapture as core count rises, even if the CPUs are maxed at 100%.

  • CPU instruction set support (AVX etc.) - does RealityCapture take advantage of it, or will it soon, and to what extent? I see you are looking for an AVX coder. Is AVX enabled when the software is compiled, and has it been tested?
I am personally hoping to build a new system; AMD at last offer good-value CPUs with Threadripper and EPYC, but they do not handle AVX well at all. It would be a disaster to invest in the wrong architecture. I am aware that AMD hardware does not perform ideally with some other software due to weak AVX performance. Is this/will this be true with RealityCapture?

  • GPU count - 3 is the maximum, and as with most things, you get diminishing returns.

  • GPU speed/CUDA performance - 1080 Ti/Titan/Quadro etc. are the go-to cards, with the Ti being the best bang for the buck. The new Tesla V100s are compute monsters with a cost to match. Soon* we should have the consumer Volta Titans and gaming cards available.

  • GPU memory - is 12 GB enough? RealityCapture frequently complains that I do not have enough video memory - maybe this is a bug, as my monitoring software says only around 1 GB is in use.

  • RAM amount - RealityCapture is fantastic in that, in theory, it doesn’t require the massive amounts its competitors do; however, it does have its limits. What impact does maxing out the RAM and falling back on the swap file have on performance?
I have hit out-of-memory errors in RealityCapture many times; is throwing more RAM at the system the best solution?

  • RAM speed - 2666 MHz or 4400 MHz?

  • RAM latency - ties into the above; some apps love higher speed, others tighter timings. From my experience, optimising cache and memory performance for the CPU/RAM can double the speed of certain applications. Has this been tested? There sure is a lot of data being passed about.

  • HDD for cache/virtual memory - latency vs. speed. I expect this is less important, but every bit counts to an extent. I assume it becomes more valuable once RAM limits are hit.

From all the above it’s easy to choose the best of everything, but you can’t; you’ll have to sacrifice one area to get the maximum performance in another.
So, the solution:
Benchmark datasets. I searched the forum and found others have asked about a benchmark and even stated they would create one, but that was a year+ ago and nothing came of it.

Unless an integrated benchmarking tool is going to appear in the software very soon (which would be best), I propose the following.
Have two different datasets available to run, reflecting varying workloads (I can make some, we could use data provided by Capturing Reality, or maybe someone can suggest something suitable):
a) Light dataset - will be fast.
b) Heavy dataset - will take longer, but may give more accurate results.

Users will then simply start the application and hit Start. Theoretically everyone should be on the same level.

Users will be required to upload the contents of the generated logs to either the forum thread or, ideally, a Google form I will create.

The easy part: RealityCapture.log. This is basically a duplicate of the console window and logs the timings for the various stages as they complete. It should be located here: C:\Users\USER\AppData\Local\Temp\
It pumps out the following as an example:

RealityCapture 1.0.2.3008 Demo RC (c) Capturing Reality s.r.o.
Using 8 CPU cores
Added 83 images

Feature detection completed in 11 seconds
Finalizing 1 component
Reconstruction completed in 31.237 seconds
Processing part 1 / 5. Estimated 1225441 vertices
Processing part 3 / 5. Estimated 38117 vertices
Processing part 4 / 5. Estimated 926526 vertices
Processing part 5 / 5. Estimated 538277 vertices
Reconstruction in Normal Detail completed in 232.061 seconds
Coloring completed in 30.105 seconds
Coloring completed in 0.116 seconds
Coloring completed in 30.363 seconds
Creating Virtual Reality completed in 294.092 seconds
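
To save everyone pulling numbers out by hand, the timings could be scraped from that log with a few lines of Python. A rough sketch (the log location and the “completed in X seconds” wording are assumptions based on the output above, so treat it as a starting point rather than a finished tool):

import re
from pathlib import Path

# Default log location mentioned above (adjust if yours lives elsewhere).
LOG_PATH = Path.home() / "AppData" / "Local" / "Temp" / "RealityCapture.log"

# Matches lines like "Reconstruction in Normal Detail completed in 232.061 seconds".
PATTERN = re.compile(r"^(?P<stage>.+?) completed in (?P<secs>[\d.]+) seconds?")

def parse_timings(log_path=LOG_PATH):
    # Return (stage, seconds) pairs in the order they appear in the log.
    timings = []
    for line in log_path.read_text(errors="ignore").splitlines():
        match = PATTERN.match(line.strip())
        if match:
            timings.append((match.group("stage"), float(match.group("secs"))))
    return timings

if __name__ == "__main__":
    for stage, secs in parse_timings():
        print(f"{stage}: {secs:.3f} s")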

The trickier part: system analysis. There is a nice little freeware tool called HWiNFO that does not require installation and can spit out a short text report like the one below; it contains no sensitive info. Combined, these two logs should contain all the information needed for us to compile a nice comparative dataset. When I say “we” I mean me: I’ll have to parse the data into a Google spreadsheet, which will do the calculations, and then we can all see the results.

CPU: Intel Core i7-6700K (Skylake-S, R0)
4000 MHz (40.00x100.0) @ 4498 MHz (45.00x100.0)
Motherboard: ASUS MAXIMUS VIII HERO
Chipset: Intel Z170 (Skylake PCH-H)
Memory: 32768 MBytes @ 1599 MHz, 16-18-18-36
Graphics: NVIDIA GeForce GTX 1080 Ti, 11264 MB GDDR5X SDRAM
Drive: Samsung SSD 850 EVO 500GB, 488.4 GB, Serial ATA 6Gb/s @ 6Gb/s
Sound: Intel Skylake PCH-H - High Definition Audio Controller
Sound: NVIDIA GP102 - High Definition Audio Controller
Network: Intel Ethernet Connection I219-V
OS: Microsoft Windows 10 Professional (x64) Build 15063.674 (RS2)
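
Getting those two reports into the spreadsheet could also be semi-automated. Another rough sketch (the field labels are assumptions based on the sample report above and will need adjusting to whatever your HWiNFO report actually contains):

import csv

# Field labels are assumptions based on the sample HWiNFO report above.
FIELDS = ["CPU", "Motherboard", "Memory", "Graphics", "Drive", "OS"]

def parse_report(report_text):
    # Pick out the first "Label: value" entry for each field of interest.
    values = {}
    for line in report_text.splitlines():
        label, _, value = line.partition(":")
        label, value = label.strip(), value.strip()
        if label in FIELDS and label not in values:
            values[label] = value
    return values

def write_row(report_text, timings, csv_path="benchmark_result.csv"):
    # One row per machine: system info followed by the stage timings (e.g. from parse_timings() above).
    # Note: repeated stage names (the three "Coloring" lines) collapse to the last value in this simple sketch.
    row = parse_report(report_text)
    row.update({stage: f"{secs:.3f}" for stage, secs in timings})
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        writer.writeheader()
        writer.writerow(row)

Each person would then only need to paste that one row into the shared sheet (or the form).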

 

I’ll need your help :slight_smile:

A) Input on my wall of text above.
B) Suggestions on the proposed benchmark & setup.
C) People to run the benchmark and post the results.

If you’ve read through all that and think, “yeah, I’d spend 15 minutes running the test files and reporting back” - please say so.
If you’ve read part of it and fell asleep thinking, “ain’t nobody got time for that” - please say so too :D.

What do we get out of all this?
Eventually, when/if enough people with varying hardware post their results, we can determine where to spend our precious money to relieve the RealityCapture stages we are bottlenecked in: which components and configurations help the most with, say, reconstruction or texturing, and which hardware is just ineffective.

What say you? Do you think this is a worthwhile task, and should I proceed?

Hi Gotz, the HWiNFO tool mentioned above can also generate log files of various system resource usages over time, which can then be plotted as graphs. Going deeper than that there are profiling tools, but that starts to get too complex and is really only of use to the coders of the software.
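
As a rough illustration: if you save the sensor log out as a CSV, plotting it is only a few lines (the column names differ from system to system, so the “Usage” filter below is just an assumption to adjust):

# Minimal sketch for plotting a HWiNFO sensor log saved as CSV during a benchmark run.
import pandas as pd
import matplotlib.pyplot as plt

log = pd.read_csv("sensors.csv", encoding="latin-1")  # HWiNFO logs are not always UTF-8
columns = [c for c in log.columns if "Usage" in c]    # e.g. CPU/GPU usage columns, adjust as needed
log[columns].plot(figsize=(12, 6), title="Resource usage during the benchmark run")
plt.xlabel("Sample")
plt.ylabel("Usage [%]")
plt.tight_layout()
plt.show()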

The new Windows Fall Creators Update, released today, also finally monitors GPU usage in Task Manager alongside RAM and CPU.

I have yet to test it.

Oh, that is nice - do you know by chance if that’s also true for dinosaurs with Win7?

I have hwinfo installed and use it frequently, so that should not be a problem - I must have missed it in the depths of your über-post!  :slight_smile:

Awaiting your image set! Maybe we could start with a small-ish one to iron out the kinks?

Something else to consider:

RC seems to have some randomness in the alignment process, which means the results can vary, the amount depending on the image set. So I guess it would make sense to run the alignment more than once, deleting the older component each time…

I can possibly provide a very small dataset of only 36 pictures for an object that should all align into a single component using the default settings for RC.

Sounds like a good start!

I guess it would make sense to ask RC if they can host those images so that people can download them any time if they want to.

Thanks Shadow & Gotz

I wonder if this would be suitable, and I also wonder if the CLI dataset they have available might be more suitable, and possibly even accessible to us.

Hmm, has anyone looked at those yet? Are they practical?

Sounds interesting.

I think it’s important to benchmark each stage: alignment, reconstruction part 1, i.e. the depth maps (GPU), part 2, creating the model (CPU), then texturing, and maybe even simplify etc…

I’m mostly interested in seeing part 2 of reconstruction, since it’s by far the longest part for my scenes.

But I’m not sure how useful a few hundred photos will be; you won’t see how RAM or SSDs really affect speeds until you get into 2500-5000 photos. That will take a long time to benchmark, though it would be more interesting.

This sounds like it could all be done with CLI scripts and the demo version.

 

You are correct, Chris - gaining data from each stage is important.

Do you think CLI scripts can work with the demo? If so, that would definitely be the way forward.

I’m pretty sure all the CLI commands except export work on the demo version.

I have no idea if all the right logs can be saved; I would assume so.

It would all have to be done in one go though, as you can’t save any of the parts.

Maybe it’s worth seeing if all the right information is saved in the logs just by hitting the Start button.

Hey guys,

why would it be necessary to use the CLI? I think we are all motivated enough to search out the log files and copy-paste the info into a table, right? And I’m not sure what would happen if we install the demo in parallel with the normal version. I don’t think I’d be willing to mess around with that…

Chris, you are probably right that a larger image set will give us better or at least different results. But I still think it would be important to try and optimize the method with a smaller project first, just so nobody gets distracted by super long processing times…

Hey guys,
that is a great initiative, thank you very much. We like it very much here at CR, and a benchmarking tool has been on our minds for a long time. We want to support you in this initiative as much as we can.

There are several ways to do that.

One of the ways is: in the Workflow tab, in the settings, there is “Progress End Notification”. Read the application help for detailed information. You can attach a bat file to the notification that will do the job you need. Somebody needs to make the scripts, etc.

I think that the CLI is the way to go, as it works in the demo. It can work as follows:

  1. Clear the cache.
  2. Export and back up your own global settings with “-exportGlobalSettings”.
  3. Import the shared global settings with “-importGlobalSettings”; these will also include the “Progress End Notification” hooks.
  4. Run the tests.

I think that the cache clearing, as well as the identical global settings, are the most important, so that everybody will have the same starting conditions.
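
Very roughly, a one-shot run along these lines could do it (untested; the executable path, file names and exact command spellings below are placeholders, so please check them against the application help before relying on them - Python is used here only to chain the commands, a plain bat file would do the same):

import subprocess

RC_EXE = r"C:\Program Files\Capturing Reality\RealityCapture\RealityCapture.exe"  # adjust to your install
IMAGES = r"C:\benchmark\images"          # the shared benchmark dataset
SETTINGS = r"C:\benchmark\settings.xml"  # the shared global settings file (placeholder name)
BACKUP = r"C:\benchmark\my_settings_backup.xml"

# Step 2: back up your own global settings before overwriting them.
subprocess.run([RC_EXE, "-exportGlobalSettings", BACKUP, "-quit"], check=True)

# Steps 1, 3 and 4: clear the cache, import the shared settings, run the whole pipeline.
subprocess.run([
    RC_EXE,
    "-clearCache",
    "-importGlobalSettings", SETTINGS,
    "-addFolder", IMAGES,
    "-align",                  # alignment
    "-calculateNormalModel",   # reconstruction in normal detail
    "-calculateTexture",       # texturing
    "-quit",
], check=True)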

As you mentioned already, the dataset is also very important. I would recommend using a dataset of 300-500 images at ~10 Mpx resolution of some relatively complicated structure; however, as it will be exposed publicly, it should have a wow factor. We can try to find one of ours, but it can take some time. If you have one, then go ahead and use it.

Hey Michal,

thanks for the encouragement!

Is it possible to install the demo in parallel with a “proper” license?

I don’t have CLI capabilities here on my system…

 

Ivan, we’re waiting for you now!   :wink:

Hi Michal

Many thanks for the supportive words.

I was unaware the Progress End Notification was able to do that; I shall take a good look at it and read up on the scripting required.

I agree that the dataset should have a wow factor as it will encourage people to use it widely, and have a positive impact on the overall image of the software.  Something impressive is required, visually and technically.

  • Suggestions: what do people think would be best?

Humans (we would need someone with a multi-camera rig setup to donate the data); getting a really good capture appears difficult.

Architecture, internal or external. There are some very beautiful, accessible buildings about, which can have very nice intricate details, textures and features. My testing has had some awesome results.

Ideally, a combination of nadir (from above) and ground images makes for the best solution, which I have found to be tricky, as UAVs and local authorities don’t always mix nicely :).

Statues/Monuments can work well and can look nice.

Heritage items/scenes usually have fantastic detail and textures.

Maybe something organic could be good.  

The subject matter is endless…

 

Another option I was also thinking of was a test scene setup (like the example from DPReview), with various objects/models placed in it - similar in a way to that, but fully three-dimensional and a lot more visually interesting.

A variety of interesting objects could be arranged to cover most of the subjects people would be interested in, such as a lobster shell, some wood, stones and leaves, highly detailed miniature architectural models, engineering components, and 3D measuring guides.

If created well, not only could the benchmark scene be visually impressive and used for testing performance, it could also be used to compare the effect of changing settings/quality/accuracy etc. in a measurable and controlled way, even between software revisions. Photographing it would have the bonus that the images could be very good and accurate thanks to ‘studio’ conditions. Avoiding occlusion between objects could be very tricky unless well planned out, and carefully selecting the various components would also take some thought.

Perhaps I’m overcomplicating the situation?

Anyone’s input and suggestions are greatly welcomed.

An object like the one shown in the linked video may be something that has the required WOW factor.

https://www.youtube.com/watch?v=NWuJPENRQCU

 

I have tried to do a reconstruction from that video and it turned out absolutely amazing despite the relatively bad quality of the source material.

I agree with ivan’s second to last sentence!  :smiley:

For a first try, I wouldn’t overcomplicate it; let’s take something that is already there.

Like ShadowTail suggested at the beginning with 36 images.

It’s about working out the systematic approach.

Wow can come later.

ShadowTail, I like the object, but what about the copyright?

Also, wouldn’t a round object (as in, shot in closed circles) be better for the beginning, since there are fewer problems at the (non-existent) borders?

Relief carvings etc., as in ShadowTail’s example, are indeed impressive works of art, and the software does a fantastic job of extracting the depth from them. However, I don’t think they show the true ability of the software, and they could give the impression it is about 2D depth mapping rather than full 3D scene/object recreation.

The end result needs to be technologically impressive as well as visually complex and interesting.

Using someone else’s work is definitely a no-go from the start.
We will definitely be able to create our own data.

To ensure the software is not being held up by strange issues, knowing the conditions under which the data was captured is important: full EXIF data, etc.

Finding a versatile and impressive subject (or subjects) that captures the essence of what the software is capable of shouldn’t be too hard with a bit of brainstorming. I do not believe a small dataset of, say, 36 images will be enough to stress systems realistically and gather the data we require for a benchmark, nor will it produce an adequately high-quality result to represent what is possible. It is very impressive what can be done with a few images; it is even more impressive what can be done with a larger number taken carefully.

Gotz’s point of closed circles makes sense, and having a full scene would be nice.

With regards to overcomplicating :) it is indeed a good idea to be able to walk before you can run.
That said, I feel that if a job is worth doing, it’s worth doing properly. If rushed and poorly thought out, you don’t achieve what you set out to, and you end up with a sub-par project that is lacking in many areas. A good balance is important: if you bite off more than you can chew, there is a risk things do not get completed.

I’ll have a ponder over the next few days.  Keep the suggestions coming :slight_smile:

 

Hey ivan,

good point about a worthwhile project.

I just wouldn’t want this project to peter out because nobody gets around to shooting a 500-image project just for this.

So I would say we can do both in parallel - look for a nice image set that already exists and whose result is known, and whoever feels like it can go out and shoot to impress!  :slight_smile:

Michal, how about the stag in your showcase - would that be suitable/available?

I’m afraid that the stag will not be possible. I’ll try to find something, and if not, then grabbing a camera and capturing any tree trunk can be a start :slight_smile: