should then of course be compared to a cast, an interface call, and direct access in a C++ implementation.
The overhead would also be a lot higher with a long hierarchy.
For example: A derives from B, B from C, C from D, and D from E. Call an interface function on A that is implemented in B, C, D, and E.
Compare that to casting to E and calling a non-public/shared function in E.
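
A rough sketch of what the C++ side of that comparison could look like. It assumes E is the root of the hierarchy and A the most derived class, which is only one way to read the description above. The names `run`, `only_in_e`, `call_virtual`, and `call_direct` are made up for illustration.

```cpp
// Assumed reading: E is the root class, A the most derived one.
struct E {
    virtual ~E() = default;
    virtual int run() const { return 5; }  // shared interface function
    int only_in_e() const { return 50; }   // non-virtual, exists only in E
};
struct D : E { int run() const override { return 4; } };
struct C : D { int run() const override { return 3; } };
struct B : C { int run() const override { return 2; } };
struct A : B {};  // A adds nothing and inherits B's implementation

// Interface call: dispatched at run time through the object's dynamic type.
int call_virtual(const E& obj) { return obj.run(); }

// Cast to E, then call a function that is resolved at compile time.
int call_direct(const A& obj) {
    return static_cast<const E&>(obj).only_in_e();
}
```

With an `A` object, `call_virtual` ends up in B's implementation and `call_direct` goes straight to E's function. To measure the difference you would call both in a loop under a benchmark harness and make sure the compiler cannot inline or devirtualize the calls away, otherwise the numbers say very little.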
But as people say, in 9999 cases out of 10000 this will not be your problem or the reason your code is slow.