CASSCPP-3 Crash on Pure virtual function while the destructor of the Session object is called#583
CASSCPP-3 Crash on Pure virtual function while the destructor of the Session object is called#583absurdfarce wants to merge 1 commit intoapache:trunkfrom
Conversation
Also avoid the callback in shutdown situations in order to avoid taking longer than we need to.
| } | ||
|
|
||
| void PrepareHostHandler::on_close(Connection* connection) { | ||
| callback_(this); |
There was a problem hiding this comment.
Long-term the better answer is probably to not make HOST_UP/HOST_READY/HOST_ADD event distribution a function of the callback in use here. Those ops should be handled by logic in the relevant functions of cluster.cpp, specifically as some kind of "after" action from the prepare op. But this fix allows us to address the immediate problem without having to refactor the entirety of message delivery.
There was a problem hiding this comment.
Thanks for sharing the context. Shall we comment the todo in the code?
|
@yifan-c If you have a sec I'd love a review of this one! |
|
IBM Jenkins run for this change was clean. Well, there were some weird failures around cloud connectivity for Rocky Linux 9 only but I'm quite sure those were environmental/runner issues and not anything related to this change. |
| * @param connection The closing connection. | ||
| */ | ||
| virtual void on_close(Connection* connection) = 0; | ||
| virtual void on_close(Connection* connection) {}; |
| } | ||
|
|
||
| void PrepareHostHandler::on_close(Connection* connection) { | ||
| callback_(this); |
There was a problem hiding this comment.
Thanks for sharing the context. Shall we comment the todo in the code?
Multiple steps to try to keep us out of the crash situation. First: remove the call to the callback from the handler's on_close() function. It's not especially useful when a connection is in the process of closing down like this; when that happens host up or host add events (the objective of the callback in question) should be considered unreliable anyway. This change alone is probably enough to fix the problem but it's effect there is entirely one of timing; by not having to handle the extra event delivery logic we're not taking as long in the callback so there's less opportunity to overlap with other resources being destructed. That said, minimizing the amount of work here is still a pretty good idea.
Second: don't make the on_close() function of the handler a pure virtual function. There's no obvious reason this function has to be a pure virtual function so the hope is that by giving it a (no-op) body we'll just execute that if we wind up with a conversion from a PrepareHostHandler to a ConnectionListener object while the event loop is shutting down.