Yes, getting the children can be a bit complex, because it depends so much on what the tree is made of.
It should be easy to ask “Does Box have a Box?”
If Box1, Box2 and Box3 are child actor components, Box0 may return the Child Actor Component itself as the item. You must then ask that component for the Actor reference it is holding; otherwise the next question becomes “Does the Child Actor Component have a child?”, which returns none.
If Box is a widget, the question becomes more difficult to answer.
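To illustrate the unwrapping step, here is a minimal engine-agnostic sketch in Python. The classes `Actor` and `ChildActorComponent` and the field `child_actor` are hypothetical stand-ins for this example, not Unreal's API; in Unreal you would ask the Child Actor Component for its child actor instead.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Actor:
    name: str
    # Components attached to this actor; each may wrap another actor.
    components: List["ChildActorComponent"] = field(default_factory=list)


@dataclass
class ChildActorComponent:
    # The actor this component is holding; None if it holds nothing.
    child_actor: Optional[Actor] = None


def child_actors(actor: Actor) -> List[Actor]:
    """Unwrap each component to the actor it holds, skipping empty ones."""
    return [c.child_actor for c in actor.components if c.child_actor is not None]


def walk(actor: Actor) -> List[str]:
    """Depth-first list of actor names, unwrapping components on the way down."""
    names = [actor.name]
    for child in child_actors(actor):
        names.extend(walk(child))
    return names


box2 = Actor("Box2")
box1 = Actor("Box1", [ChildActorComponent(box2)])
box0 = Actor("Box0", [ChildActorComponent(box1), ChildActorComponent()])
print(walk(box0))  # ['Box0', 'Box1', 'Box2']
```

Asking the component itself for children would end the walk at Box0; asking the actor it holds is what lets the recursion continue.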
But instead of relying on the scene hierarchy, you could make an interface that asks the Object (Item) to return Objects (children). With the Box class implementing this interface, you have better control over how, and which, children are returned. I recommend this method; interfaces are quite easy to use once you read up on them.
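The interface idea can be sketched roughly like this, again in plain Python rather than engine code. The names `HasChildren`, `get_children`, `Box`, `Label` and `collect` are assumptions made up for this example; in Unreal the equivalent would be a Blueprint or C++ interface implemented by the Box class.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class HasChildren(Protocol):
    """Hypothetical interface: an item reports its own children."""

    def get_children(self) -> List[object]: ...


class Box:
    def __init__(self, name, children=None, hidden=None):
        self.name = name
        self._children = list(children or [])
        # Objects the Box owns but chooses not to report as children.
        self._hidden = list(hidden or [])

    def get_children(self):
        # The Box decides which children are returned and in what order.
        return list(self._children)


class Label:
    """A leaf item that does not implement the interface."""

    def __init__(self, name):
        self.name = name


def collect(item) -> List[str]:
    """Depth-first list of names, asking each item for its children."""
    names = [item.name]
    if isinstance(item, HasChildren):
        for child in item.get_children():
            names.extend(collect(child))
    return names


tree = Box("Box0", [Box("Box1", [Label("Text")]), Box("Box2")], hidden=[Box("Debug")])
print(collect(tree))  # ['Box0', 'Box1', 'Text', 'Box2']
```

The caller never needs to know whether a child lives in a component, a widget slot or a plain array; it only asks the item, and items that do not implement the interface are simply treated as leaves.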