I can’t say exactly what problems you’ll hit down the road (I’m not a physics buff), but to get the initial concept working, you need collision boxes on the elements that can snap together.
That way, when one cube overlaps another, you have two actors to work with and can do the alignment and attachment.
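
To make the idea concrete, here is a rough, engine-agnostic sketch in Python of the overlap, align, attach sequence. It is not engine code: `Cube`, `boxes_overlap`, `snap_and_attach`, and the `margin` value are all made-up names and numbers for illustration, and it assumes axis-aligned cubes with no rotation. In a real engine the overlap test would come from the collision system rather than being hand-written.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Cube:
    """Hypothetical stand-in for an actor: a cube plus a slightly larger collision box."""
    name: str
    center: Vec3
    half: float                      # half the edge length of the visible cube
    margin: float = 0.25             # assumed: how far the collision box extends past the cube
    parent: Optional["Cube"] = None  # set once the cube has been attached


def boxes_overlap(a: Cube, b: Cube) -> bool:
    """Axis-aligned overlap test on the enlarged collision boxes."""
    reach = a.half + a.margin + b.half + b.margin
    return all(abs(ca - cb) < reach for ca, cb in zip(a.center, b.center))


def snap_and_attach(moving: Cube, fixed: Cube) -> bool:
    """If the collision boxes overlap, align `moving` flush against `fixed` and attach it."""
    if not boxes_overlap(moving, fixed):
        return False

    # Pick the axis along which the two cubes are furthest apart;
    # that is the face of `fixed` the moving cube is approaching.
    offset = [m - f for m, f in zip(moving.center, fixed.center)]
    axis = max(range(3), key=lambda i: abs(offset[i]))
    sign = 1.0 if offset[axis] >= 0 else -1.0

    # Alignment: match the fixed cube on the other two axes and sit
    # exactly one cube-width away on the chosen axis.
    new_center = list(fixed.center)
    new_center[axis] += sign * (fixed.half + moving.half)
    moving.center = tuple(new_center)

    # Attachment: here just a parent reference.
    moving.parent = fixed
    return True
```

For example, with a fixed cube at the origin and a second cube of the same size dropped at roughly `(1.1, 0.2, -0.1)`, `snap_and_attach` moves the second cube to sit flush on the +X face and sets its `parent`. The collision box is deliberately bigger than the cube, so the snap triggers when the cubes get close rather than only once they intersect.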