Skip to content

Major reliability improvement for logging stack traces#287

Open
borrrden wants to merge 13 commits intomasterfrom
crash_handling
Open

Major reliability improvement for logging stack traces#287
borrrden wants to merge 13 commits intomasterfrom
crash_handling

Conversation

@borrrden
Copy link
Copy Markdown
Member

@borrrden borrrden commented Mar 20, 2026

This took many days of effort and so it's involved a lot with various stack unwinding and address retrieval mechanisms so I'll try to provide a high level summary here. Firstly to avoid a giant spaghetti of ifdef I broke out the implementation across several files that only get compiled on certain platforms:

  • Backtrace+capture-darwin.cc: Apple specific logic for getting a symbol from a frame in a stack trace. This uses standard backtrace_symbols and dladdr since the implementation on Apple is much more enhanced than the glibc linux counterpart (Most likely the compiler is more friendly about it when it emits machine code).
  • Backtrace+capture-linux.cc: Linux specific logic for getting a symbol from a frame in a stack trace. This makes use of the GCC provided libbacktrace.a (it's present even when using clang because by default clang still uses GCC for its runtime stuff in c++). This library is very attuned to specifically getting frames on Linux even when frame pointers are omitted and handles all the gnarly unwinding stuff safely (I found when trying to implement register based stack crawling that I often got it wrong and caused an SIGSEGV while trying to do it)
  • Backtrace+capture-posix.cc: Other stuff common to all POSIX environments like unmangling and actual printing. In this we need to be very careful since my plan is to handle signals here as well. We can only use async signal safe functions. In particular this means no streams, no printf, and no unmangling either.
  • Backtrace+capture-win32.cc: Windows specific logic for getting a symbol from a frame in a stack trace. Printing is not restricted since the way that Windows faults already moves to an entirely separate context in which it is safe to do just about anything. This uses the StackWalk64 API.
  • Backtrace+signals-posix.cc: POSIX signal handlers! RAII style class that will register signal handlers and write out a stack trace, and then pass on control to the next handler.
  • Backtrace+signals-win32.cc: Win32 exception handlers! Windows doesn't really have the equivalent of POSIX signals. It has a few basic ones but that's at. Most faults result in one of the various types of exception handlers (this is an overloaded term, since Windows was using the term "exception" way before it had any meaning for c++. These "exceptions" can happen even in C. They are basically signals by another name except the way the handlers are invoked is more safe).
  • Backtrace.cc: This now encapsulates any logic that operates one level above all of this logically. It uses the primitives defines in the files above.

Note that since we cannot use ostreams in signal handlers, I created a new crash log in the logs directory that a crash will be written to upon a signal. The existing logs are all buried inside of ofstream objects.

@borrrden borrrden requested review from jianminzhao and snej March 20, 2026 23:10

[[noreturn]] static void sig_handler(int signo, siginfo_t* info, void* context) {
const char* name = strsignal(signo);
violation_type() = name ? name : "Signal " + to_string(signo);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, how did this get back in here.....this is not going to be allowed! I need to change this to use write_to_and_stderr


crash_handler_immediate(info, context);

signal(signo, SIG_DFL);
Copy link
Copy Markdown
Member Author

@borrrden borrrden Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to consider saving the original handler. It might not be SIG_DFL!

#include "betterassert.hh"

namespace fleece {
Backtrace::frameInfo getFrame(const void* addr, bool stack_top) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather this were a private static member of Backtrace than a standalone function.

Comment on lines +27 to +28
"_C_A_T_C_H____T_E_S_T_",
"Catch::TestInvokerAsFunction::invoke() const",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you checked that these names are still valid after we upgraded Catch2?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not. Is there a method you used to determine them in the first place?

}
}

char* unmangle(const char* function) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, could this be a private static member of Backtrace?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was originally not even before my PR. Making it a private static member requires bringing the other Unmangle methods along with it. Is that the outcome you want?

void* context = nullptr);

/// Captures a backtrace from the given starting point and returns a shared pointer to the instance.
[[nodiscard]] static std::shared_ptr<Backtrace> capture(void* from, unsigned maxFrames = 50,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is from, here? I'm assuming a stack frame or PC address? This seems like too low-level a method to make public.

[[nodiscard]] static std::shared_ptr<Backtrace> capture(void* from, unsigned maxFrames = 50,
void* context = nullptr);

static int raw_capture(void** buffer, int max, void* context = nullptr);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, is there a need to make this public?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs to be callable from the signal handler, as well as from Backtrace itself.

@borrrden borrrden requested a review from snej March 28, 2026 00:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants