NaCl amd64 crash dumps no longer work on Linux #1504

Open
slipher opened this issue Jan 17, 2025 · 15 comments
Labels
A-VirtualMachine virtualization of gamelogic OS-Linux T-Bug

Comments

@slipher
Member

slipher commented Jan 17, 2025

With 0.55.2, running daemonded +map chasm +delay 50f sgame.injectFault intdiv fails to produce a dump in the crashdump/ directory of the homepath.

It still works on Windows.

@illwieckz
Member

illwieckz commented Jan 17, 2025

Here are the results of my tests on Linux:

| | amd64 | i686 | armhf |
|---|---|---|---|
| pnacl | ❌️ | ✅️ | ✅️ |
| saigo | ❌️ | ✅️ | ✅️ |

So, only linux-amd64 is affected.

@slipher
Member Author

slipher commented Jan 19, 2025

It appears that the NaCl exception handling method is broken in general. I tried the following patch:

diff --git a/src/common/System.cpp b/src/common/System.cpp
index 6e598f2ff..843d2681c 100644
--- a/src/common/System.cpp
+++ b/src/common/System.cpp
@@ -335,12 +335,15 @@ static void CrashHandler(const void* data, size_t n)
     Sys::Error("Crashed with NaCl exception");
 }

+void Crash(struct NaClExceptionContext *)
+{
+       Log::Warn("nacl crash!");
+       _exit(99);
+}
+
 void SetupCrashHandler()
 {
-#if !defined(__saigo__)
-    nacl_minidump_register_crash_handler();
-    nacl_minidump_set_callback(CrashHandler);
-#endif
+        nacl_exception_set_handler(Crash);
 }
 #else
 NORETURN static void CrashHandler(int sig)

When testing with ./daemonded -set vm.sgame.type 1 +map vega +delay 50f sgame.injectFault intdiv, the "nacl crash!" message appears on Windows, but not on Linux. All testing is with PNaCl.

P.S. for testing crashes on older Unvanquished versions with no injectFault, these behavior tree snippets are nice.

@slipher
Member Author

slipher commented Jan 20, 2025

I wondered whether it ever worked at all and had to dig deep for the proof. Finally, I found 3 NaCl crash dumps from when I was developing the original Breakpad PR on an Ubuntu VM snapshot from 2015. And voila, Unvanquished 0.47.0 (the first release to include crash dumps) successfully produces a dump.

I also tried an old Debian release in Docker, which pairs an old glibc with a new kernel. That didn't work either, so maybe it's a kernel change that broke compatibility with something.

  • Ubuntu Wily: works. glibc 2.21, kernel 4.2.0, Unvanquished 0.47.0
  • Ubuntu Bionic: broken. glibc 2.27, kernel 4.15.0. Forgot which Unvanquished version I used
  • Debian Squeeze: libc too old to run Unvanquished 0.47.0
  • Debian Jessie: glibc 2.19 with kernel 5.15.146.1. broken with Unvanquished 0.50.0.
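
Since the results above seem to depend on the glibc/kernel pairing, it may help to log both when reproducing. A small sketch, assuming a POSIX system (for uname) and, for the glibc part, glibc's gnu_get_libc_version(); DescribeRuntime is a hypothetical helper name, not something that exists in the engine:

```cpp
#include <sys/utsname.h>

#include <cstdio>
#include <string>

#if defined(__GLIBC__)
#include <gnu/libc-version.h>
#endif

// Hypothetical helper: returns a string such as "glibc <version>, kernel <release>"
// describing the environment the test ran in.
std::string DescribeRuntime()
{
    std::string description = "glibc ";
#if defined(__GLIBC__)
    // Version of the glibc actually loaded at run time, not the build-time one.
    description += gnu_get_libc_version();
#else
    description += "(not glibc)";
#endif

    description += ", kernel ";
    struct utsname info;
    // uname() reports the running kernel, which inside Docker is the host's kernel.
    description += (uname(&info) == 0) ? info.release : "(unknown)";
    return description;
}
```

Printing DescribeRuntime() next to each test result would make it easier to tell the "old glibc, new kernel" cases apart from the rest.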

@illwieckz
Member

illwieckz commented Jan 20, 2025

Interesting!

At least, since it's already broken with PNaCl-built games, this doesn't prevent migrating to Saigo. And since other platforms produce crash dumps with Saigo-built games, fixing crash dumps for amd64 may fix them for both PNaCl-built and Saigo-built games.

@slipher
Member Author

slipher commented Jan 23, 2025

@illwieckz
Member

illwieckz commented Jan 23, 2025

Thanks for the testing! So maybe it's not our fault, and not even NaCl's fault?

@illwieckz illwieckz changed the title NaCl crash dumps no longer work on Linux NaCl amd64 crash dumps no longer work on Linux Jan 23, 2025
@illwieckz
Member

I edited the title to reflect that it affects only Linux amd64, as far as we know.

@slipher
Member Author

slipher commented Jan 23, 2025

So maybe it's not our fault, and not even NaCl's fault?

Perhaps a bit their fault for relying on a rather undocumented and untested feature, namely modifying the ucontext_t argument in a signal handler in order to return from the handler at a different location. The only information I could find about such a feature was a tutorial for Solaris on SPARC (Listing 9).
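
For anyone unfamiliar with the trick, here is a minimal sketch of a signal handler that edits its ucontext_t argument so that execution resumes somewhere other than the faulting instruction. It is not NaCl's code. It assumes Linux on x86-64 with glibc (the REG_RIP index into uc_mcontext.gregs), uses the 2-byte ud2 instruction to raise SIGILL, and the names SkipFaultingInstruction and ResumeAfterFaultDemo are made up for illustration.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // REG_RIP is a GNU extension in <ucontext.h>
#endif
#include <signal.h>
#include <ucontext.h>

#if defined(__linux__) && defined(__x86_64__)

// Set by the handler so the caller can confirm that it actually ran.
static volatile sig_atomic_t g_handler_ran = 0;

// SA_SIGINFO handler. The third argument is the saved CPU context that the
// kernel restores (via sigreturn) when the handler returns.
static void SkipFaultingInstruction(int, siginfo_t*, void* raw_context)
{
    ucontext_t* context = static_cast<ucontext_t*>(raw_context);
    g_handler_ran = 1;
    // The saved RIP points at the ud2 instruction, which is 2 bytes long.
    // Advancing it changes where execution resumes after the handler.
    context->uc_mcontext.gregs[REG_RIP] += 2;
}

// Returns true if execution resumed after the faulting instruction.
bool ResumeAfterFaultDemo()
{
    struct sigaction action = {};
    struct sigaction previous = {};
    action.sa_sigaction = SkipFaultingInstruction;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    if (sigaction(SIGILL, &action, &previous) != 0) {
        return false;
    }

    g_handler_ran = 0;
    __asm__ volatile("ud2" ::: "memory");  // raises SIGILL

    // Reached only because the handler moved RIP past the ud2.
    sigaction(SIGILL, &previous, nullptr);
    return g_handler_ran == 1;
}

#else

// Assumption: on other platforms there is nothing to demonstrate, so the
// demo is skipped and reported as not failed.
bool ResumeAfterFaultDemo()
{
    return true;
}

#endif
```

If the modified context were not honored, the process would never get past the ud2, since returning from the handler would just re-execute it. Redirecting to a different function instead of skipping one instruction would additionally require setting up a valid stack pointer, which this sketch deliberately avoids.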

@slipher
Member Author

slipher commented Feb 19, 2025

The signal handler modifying ucontext to return to a different location is at least documented and tested in the Linux kernel tree. https://github.com/torvalds/linux/blob/master/tools/testing/selftests/x86/sigreturn.c

The problem was reported to Chromium's bug tracker: https://issuetracker.google.com/issues/40643627
And supposedly fixed here: https://chromium.googlesource.com/native_client/src/native_client/+/303fc9961cb4231aa9828218362914ee4e51d16a

So we just need to update our NaCl runtime. Not sure if there are Google builds or if we have to build it.

@illwieckz
Member

Does it work with this one?

@slipher
Member Author

slipher commented Feb 19, 2025

Yeah, I extracted nacl_loader and it appears to work.

]/cgame.injectFault segfault 
Wrote crash dump to /home/slipher/.local/share/unvanquished/crashdump/crash-nacl-CGame-1739996787002.dmp 
Warn: CGame VM: Crashed with NaCl exception 

@slipher slipher added the A-VirtualMachine virtualization of gamelogic label Feb 19, 2025
@illwieckz
Member

OK, that's one I just built on my system. Even though my system is recent (Ubuntu 24.04 Noble), the build also works on Debian 10 Buster, which is the distribution on which we build our engine releases, so that should be fine.

@DolceTriade
Contributor

If we have a conclusion to this, does this mean we can merge #1501?

@slipher
Member Author

slipher commented Feb 20, 2025

If we have a conclusion to this, does this mean we can merge #1501?

That's unrelated. The problem in this issue (#1504) is that NaCl exception handling, and therefore crash dump generation, is broken on the Linux amd64 platform. The toolchain (PNaCl/Saigo) used to build the binaries has no bearing on it.

The problem relevant to #1501 is that the Breakpad tooling does not work with binaries built by Saigo. As long as we can't symbolize a Saigo-built binary, it's useless to produce crash dumps.

@DolceTriade
Contributor

Ah.. I see. I thought they were related
