How can I have multiple players control one pawn and show characters inside?

since the end result is multiplayer I should just jump right into that?

If that’s the case, you should plan the architecture with the multiplayer goal in mind from the start.

player automatically posses the row boat

That’s fine for single-player, but in multiplayer your boat can’t have multiple owners. Hence you don’t need to possess it, and it can be just an actor (though it’s still better to keep it as ACharacter, since CharacterMovementComponent does a ton of work for proper multiplayer movement replication to clients. Note: I haven’t worked with buoyancy, so I can’t say how it behaves in multiplayer). As for the default pawn, make some intermediate pawn, something like a camera-only pawn.

players character model inside

Since you want the models, add the mesh to that default pawn. You can even attach your default pawns to the boat’s sockets.

At this point you have 4 controllers, 4 default camera-only pawns and a single boat. From there:

  1. Catch the player’s input inside the default pawn (or straight in the controller; there is no difference in simple scenarios).
  2. When input happens, call a server RPC (a client->server call).
    Note: you can send the server either the raw input [click happened] or the click result [the force we should apply to the boat]. Sending [click happened] is more cheat-proof, at the cost of the click being affected by latency. Sending the click result [the force we should apply] is vulnerable to cheats but gives a better user experience. This choice is up to you, unless you are going for more advanced techniques such as custom prediction, which is likely excessive in a simple case like yours.
  3. In this call’s handler on the server, add movement input to the boat (on the server only).
  4. CharacterMovementComponent will handle automatic movement replication to all clients.
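
To make the flow in steps 1-4 concrete, here is a minimal engine-independent sketch in plain C++. It is not Unreal API: `Server`, `Client`, `BoatState`, `OnRowClicked`, `OnRowForce` and the force constants are all hypothetical names, the boat moves along one axis with unit mass, and replication is modelled as a simple copy of the state. In a real project the two `OnRow...` handlers would roughly correspond to the server RPC from step 2, and `Tick` plus `OnBoatReplicated` to what CharacterMovementComponent does for you in steps 3 and 4.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical tuning values, chosen only for illustration.
constexpr float kStrokeForce    = 10.0f; // force of one stroke, decided by the server
constexpr float kMaxClientForce = 10.0f; // upper bound accepted from a client

// The boat's state, simplified to one axis.
struct BoatState {
    float position = 0.0f;
    float velocity = 0.0f;
};

// Stand-in for the server side; it is the only place the boat is moved.
class Server {
public:
    // Variant A: the client only reports "click happened".
    // The server decides the force, so the client cannot inflate it.
    void OnRowClicked(int /*playerId*/) {
        pendingForce_ += kStrokeForce;
    }

    // Variant B: the client reports the force it wants applied.
    // The server must not trust it blindly, so the value is clamped.
    void OnRowForce(int /*playerId*/, float requestedForce) {
        pendingForce_ += std::max(0.0f, std::min(requestedForce, kMaxClientForce));
    }

    // One server tick: apply the accumulated input from all players
    // and return the state that would be replicated to every client.
    BoatState Tick(float dt) {
        assert(dt > 0.0f);
        boat_.velocity += pendingForce_ * dt; // unit mass assumed
        boat_.position += boat_.velocity * dt;
        pendingForce_ = 0.0f;
        return boat_;
    }

private:
    BoatState boat_;
    float pendingForce_ = 0.0f;
};

// Stand-in for a client: it never moves the boat itself,
// it only stores the last state received from the server.
struct Client {
    BoatState replicatedBoat;
    void OnBoatReplicated(const BoatState& state) { replicatedBoat = state; }
};
```

With this model, any number of players can call into the same `Server` during one tick and their inputs simply add up on the single boat. A client that asks for an unreasonable force through `OnRowForce` only gets `kMaxClientForce`, which illustrates why the "send the result" variant needs server-side validation.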

Alternative scenario:
2. Same as [2] above, plus prediction: it does the same but also adds movement input locally on the client. That’s a tricky technique, since there are a lot of corner cases, and an improper implementation may introduce excessive rubberbanding that is worse than the simplest approach. So I’m just pointing out that such a method exists, without going into details.