The reason behind this is that we are trying to scale our workflow better.
Currently we use expensive machines with GPUs and have a Python script which runs every step in RC, from align to export, managing the input parameters, retrying if necessary, etc.
However, since everything runs on a single machine, we pay for expensive GPUs without really utilizing them. According to our research, we only need the GPU during the reconstruct, unwrap, and texture steps.
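To make the split concrete, here is a minimal sketch of how we think about tagging each step with the pool it needs (the step names are just how we refer to the RC pipeline stages; the pool names are placeholders):

```python
# Sketch only: which pipeline steps need a GPU, based on our measurements.
# Pool names are placeholders for illustration.
STEP_POOLS = {
    "align":       "cpu-pool",
    "reconstruct": "gpu-pool",
    "unwrap":      "gpu-pool",
    "texture":     "gpu-pool",
    "export":      "cpu-pool",
}
```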
Also, as soon as a project is finished it is deleted from the machine, so we can't re-trigger any particular step later on.
Instead, we are thinking of running it in k8s with two node pools, GPU and CPU-only machines, with different specs. Then, using a workflow tool (e.g. Argo) we can schedule, orchestrate, and manage our workflows with the best possible resource utilization, so in the end we expect the pipeline to become both faster and cheaper.
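As a rough sketch of what we have in mind, the script would emit one workflow template per step, pinned to the right node pool. This is only an illustration: the nodeSelector and GPU-limit fields follow common Kubernetes/Argo conventions, and the container image, entrypoint, and pool labels are placeholders, not our real setup.

```python
# Rough sketch of per-step workflow templates for an engine such as Argo.
# Image, command, and pool labels are placeholders for illustration.
STEPS = [("align", False), ("reconstruct", True), ("unwrap", True),
         ("texture", True), ("export", False)]  # (step, needs_gpu)

def step_template(name: str, needs_gpu: bool) -> dict:
    template = {
        "name": name,
        # Pin the step to the matching node pool.
        "nodeSelector": {"pool": "gpu" if needs_gpu else "cpu"},
        "container": {
            "image": "our-registry/rc-runner:latest",    # placeholder image
            "command": ["python", "run_step.py", name],  # placeholder entrypoint
        },
    }
    if needs_gpu:
        # Requesting a GPU keeps this step off the CPU-only pool.
        template["container"]["resources"] = {"limits": {"nvidia.com/gpu": 1}}
    return template

templates = [step_template(name, gpu) for name, gpu in STEPS]
```

Because each step would run as its own pod, a single step (for example a re-export with different settings) could be retried later without re-running the whole project, provided the intermediate project data is kept on shared storage instead of being deleted.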