You can check whether this is necessary by simulating a bad Internet connection in the project settings. If the result looks acceptable to you at a ping of 100-300 ms, then, in my opinion, there is no need to complicate the logic. Most likely, though, it will look very jerky.
In that case, the algorithm you described is what you need: the client predicts the result right away, and the server confirms or corrects it afterwards.
The response from the server must contain a reference to the actor to which the picked-up object should be attached, plus either the socket name or the local coordinates.
If the reference is not valid, you need to detach the object and set its world coordinates to those received from the server.
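To make the flow concrete, here is a rough, engine-independent sketch of how I picture it. All the names here (Vec3, Actor, PickupResponse, PickableObject, predictPickup, applyServerResponse) are made up for illustration and are not a real engine API. I also assume the server sends a socket name rather than local coordinates, and that a null parent means the pickup was rejected.

```cpp
#include <string>

// Hypothetical stand-ins for engine types; not a real engine API.
struct Vec3 { float x = 0.0f, y = 0.0f, z = 0.0f; };

struct Actor { int id = 0; };

// Assumed shape of the server's answer to a pickup request.
struct PickupResponse {
    Actor* attachParent = nullptr;  // null means the server rejected the pickup
    std::string socketName;         // socket on the parent to attach to
    Vec3 worldLocation;             // authoritative position, used on rejection
};

struct PickableObject {
    Actor* parent = nullptr;
    std::string socket;
    Vec3 worldLocation;

    void attachTo(Actor* newParent, const std::string& socketName) {
        parent = newParent;
        socket = socketName;
    }

    void detach() {
        parent = nullptr;
        socket.clear();
    }
};

// Client side: attach immediately, without waiting for the server.
void predictPickup(PickableObject& obj, Actor& holder, const std::string& socketName) {
    obj.attachTo(&holder, socketName);
}

// Client side: reconcile the prediction once the server response arrives.
void applyServerResponse(PickableObject& obj, const PickupResponse& response) {
    if (response.attachParent != nullptr) {
        // Confirmed: make sure we are attached exactly where the server says.
        obj.attachTo(response.attachParent, response.socketName);
    } else {
        // Rejected: undo the prediction and snap to the server's position.
        obj.detach();
        obj.worldLocation = response.worldLocation;
    }
}
```

In a real project the reference would arrive through the engine's replication, so the validity check would use whatever the engine provides instead of a raw null check. Snapping to the server position is the simplest correction; you could smooth it later if it looks too abrupt.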
I’m not sure if this is an ideal solution, but that’s what I would do. At least for starters.