We have a brand new workstation with an AMD 3990X. We’ve installed the latest Windows 10 Pro. Unlike the previous version, Windows now correctly sees a single socket and identifies the 64 cores/128 threads. But RC uses only 65 out of the 128 threads for calculation.
I have to mention that I’m currently processing more than 15k images + scans (not a small workload…).
We have another workstation with the 3970 (64 threads), and RC uses all of its threads.
Would you have any suggestion? On the forum I’ve read that RC has no CPU core limitation.
Yes, RC is indeed able to use the full potential of your hardware, so this looks like a configuration issue. Check your BIOS, search the internet for similar issues, and try giving the RC process a higher priority and a wider affinity in Task Manager.
Other software is able to use all 128 threads (especially benchmarks like Cinebench, or Blender), so it’s not the BIOS. We’ve searched the internet, of course. We’ve changed the RC process priority, but we don’t see any change.
“128 threads—oh my! Some operating systems and applications can’t really cope with this scale yet.”
This one goes in depth.
"AMD recently released its monstrous Threadripper 3990X. It’s a 128-thread CPU with a Max boost clock speed of 4.3GHz. It’s an extremely powerful chip, but to get the most of that power, you’ll need to upgrade to something above Windows 10 Pro. In Anandtech’s review of the Threadripper 3990X, they noticed that systems running Windows 10 Pro split the CPU into two groups of 64 threads.
When split into two groups, a PC can still use all 128 threads, but a single program cannot. When a program runs, it goes into one of the processor groups. The system places the program in the group that’s less busy to avoid congestion. A program can only use the threads within the group it’s placed in, meaning that it’s capped at 64 threads. Developers can make programs processor group aware, which gets around this issue, but if a program isn’t built for this type of setup, it can only access 64 threads.
You can also somewhat get around this issue by disabling simultaneous multithreading. This will make it so your system only has one processor group with 64 cores. The downside of this is that some performance is lost, according to Anandtech’s benchmarks.
The real way to get around this is to run an operating system built to handle this many threads. Microsoft didn’t build Windows 10 Home or Windows 10 Pro to handle 128 threads. That’s reasonable considering the vast majority of systems running those two versions of Windows will never see anywhere near 128 threads.
If you want to utilize the full power of all 128 threads while running Windows, you can run Windows 10 Pro for Workstations or Windows 10 for Enterprise."
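The processor-group behaviour described in the quoted article can be illustrated with a small model. This is only a sketch under simplifying assumptions: the function names are hypothetical, groups are filled in order up to the 64-processor limit, and real Windows group assignment also depends on the machine's topology.

```python
GROUP_SIZE = 64  # a Windows processor group holds at most 64 logical processors


def split_into_groups(logical_processors, group_size=GROUP_SIZE):
    """Return the size of each processor group (simplified: fill groups in order)."""
    full, rest = divmod(logical_processors, group_size)
    return [group_size] * full + ([rest] if rest else [])


def usable_threads(logical_processors, group_aware):
    """Logical processors one process can reach in this simplified model."""
    groups = split_into_groups(logical_processors)
    # A group-aware program can span every group; any other program is
    # confined to the single group it was placed in.
    return sum(groups) if group_aware else max(groups)


print(split_into_groups(128))      # 3990X with SMT enabled
print(usable_threads(128, False))  # program that is not group aware
print(usable_threads(128, True))   # group-aware program
print(split_into_groups(64))       # SMT disabled: a single group
```

In this model a program that is not group aware reaches at most 64 of the 128 threads, and disabling SMT leaves a single group of 64, which matches what the article describes.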
We are already running Windows 10 Pro for Workstations on that machine, for this reason. (Although AMD has commented on this, saying that since Windows 10 build 18362.592 there is no problem, and Windows 10 Pro works as well as the Windows 10 Pro for Workstations or Enterprise versions.) https://www.tomshardware.com/news/amd-threadripper-3990x-performance-windows-10-enterprise
Again, we are aware of these observations, and we have made sure to eliminate these possible causes. The system now identifies 1 socket, 64 cores, 128 threads. Other software has no problem using all threads.
Couldn’t it be a limit from RC?
Sharing our screen and discussing this directly would save a lot of time for both of us, if the RC team is really able to help us. This would definitely not be an excessive service considering the licence costs (we’ve been RC customers for more than 3 years now).
Well, unfortunately, it will not be possible for us to support full use of all 64 cores. The algorithm cannot do so because of its use of Visual C++. What you can try, though, is to run 2 instances and split your CPU between them so that each can use 32 cores. I cannot guarantee that this will work, but it should.
As you might already know, RC can be launched up to 4 times using RC instances. You can open RealityCapture twice and, using Task Manager, assign half of your threads to one process and the other half to the second one. We’ve tested this on a virtual machine in the past and it worked; however, this is not something we recommend doing in general. It is for those seeking maximum performance who accept a little risk or a potential need for troubleshooting. It might still be worth a try, but you do so completely at your own risk!
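To make the suggested split concrete, here is a minimal sketch that computes one affinity bitmask per instance. It assumes 64 logical processors in a single processor group, divided into contiguous, equal blocks; the function name is hypothetical and nothing here calls RealityCapture or Windows itself.

```python
def affinity_masks(logical_processors, instances):
    """Split processors 0..n-1 into equal contiguous blocks, one bitmask each."""
    if instances <= 0 or logical_processors % instances:
        raise ValueError("processors must divide evenly between instances")
    per_instance = logical_processors // instances
    block = (1 << per_instance) - 1  # per_instance low bits set
    # Shift the block so each instance gets its own non-overlapping range.
    return [block << (i * per_instance) for i in range(instances)]


for index, mask in enumerate(affinity_masks(64, 2)):
    print(f"instance {index}: 0x{mask:016X}")
# instance 0: 0x00000000FFFFFFFF
# instance 1: 0xFFFFFFFF00000000
```

Each set bit corresponds to one CPU checkbox in Task Manager's affinity dialog, so the first mask means CPUs 0 to 31 and the second CPUs 32 to 63.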
Number of threads: …
In Visual C++, for a non-nested parallel region, 64 threads (the maximum) will be provided. [RP: note that even in the case of nested parallelism, total max threads for the process is 64]. OMP_NUM_THREADS environment variable: …
In Visual C++, if the value specified is zero or less, the number of threads is equal to the number of processors. If the value is greater than 64, the number of threads is 64.
So basically, it is a limit that affects alignment and the other processes that use Visual C++.
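The quoted rule can be modelled in a few lines. This is a sketch, not the Visual C++ runtime itself: the function name is hypothetical, and capping the "zero or less" case at 64 is an assumption made by combining the two quoted statements.

```python
MSVC_OPENMP_MAX_THREADS = 64  # maximum stated in the quoted documentation


def effective_omp_threads(omp_num_threads, processors):
    """Thread count a parallel region would get under the quoted rule."""
    if omp_num_threads <= 0:
        # Quoted rule: fall back to the number of processors.
        # Assumption: the 64-thread maximum still applies in this case.
        requested = processors
    else:
        requested = omp_num_threads
    return min(requested, MSVC_OPENMP_MAX_THREADS)


print(effective_omp_threads(128, 128))  # request above the limit
print(effective_omp_threads(0, 128))    # zero: use the processor count, capped
print(effective_omp_threads(32, 128))   # request below the limit is honoured
```

Under this model a 128-thread machine never gets more than 64 threads in a single process, whatever value is requested.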
I hope this was helpful to you. We are glad you are using RealityCapture and hope that you like it apart from this issue!