This isn’t just any old thread triggering SIGKILL, it’s the JIT thread privileged to write to executable pages that is performing illegal memory accesses. That’s typically a sign of a bug, and allowing a thread with write access to executable pages to continue executing after that is a security risk.
But I know of other language runtimes that take advantage of installing signal handlers for SIGBUS/SIGSEGV to detect when they overflow a page so they can allocate more memory, etc. This saves them from having to do an explicit overflow check on every allocation. Those threads aren’t given privilege to write to executable memory, so they’re not seeing this issue…
So this sounds like a narrow design problem the JVM is facing with their JIT thread. This blog doesn’t explain why their JIT thread needs to make illegal memory accesses instead of an explicit check.
Because explicit checks on every memory access (pointer dereference) make Java significantly slower, even with compiler optimisations to remove redundant checks[1]. Memory protection is a fundamental, very useful, hardware feature and it's perfectly reasonable for user space language runtimes to take advantage of it.
Or, to put it another way, SIGSEGV has been a part of Unix-family OSes for decades. It works perfectly fine on Linux and Windows and there's no reason it shouldn't work on macOS.
[1] (Many years ago I worked on a cross-platform implementation of the Java runtime and wrote much of the threads and signal handling code. We had an option to enable explicit memory checks, which got us up and running faster on new platforms where the SIGSEGV handlers hadn't been written yet. From memory this made everything something like 30-50% slower, so it was definitely worthwhile to implement SIGSEGV handling. In our case SIGSEGV handlers were used both as part of the garbage collector/memory management and to implement Java's NullPointerException)
This is like arguing to allow the guy who can't drive and just pin-balls his way down the freeway bouncing off other cars, because to prevent him from driving would be to take away his personal freedoms.
At least they could have provided a path back to the old behavior.
Or in Apple vernacular, it should just work.
Where by “protected memory access signal mechanism”, they mean SIGBUS/SIGSEGV, i.e., a segfault.
This is probably because the JVM is doing “zero cost access checks”, which is where you do the moral equivalent of:
try {
    writeToFile()
} catch(err) {
    if (err == SYSTEM_CRASH_IMMINENT) {
        changeFilePermissions()
        retry
    }
}
…because it’s faster than checking file permissions before every write. (It’s a common pattern in systems programming, so it’s not quite as crazy as it sounds.)
I guess my opinion on this is that if you write your program to intentionally trigger and ignore kill(10) / kill(11) from the host OS, for the sake of a speed boost, you can’t really get too mad when the host OS gets fed up and starts sending kill(9) instead.
I also wonder what happens in the (extremely rare) case where the signal the JVM is trapping is a real segfault, and not an operating system signal.
I believe the "truncation of memory mapped files" section is for when the Java process memory-maps a file (as Java provides memory-mapping operations in its standard library, and probably also uses them itself), and afterwards some other unrelated process truncates the file, resulting in the OS quietly making (parts of) the mappings inaccessible. Here the process couldn't even check the permissions before reading (never mind how utterly hilariously inefficient that would be, defeating the purpose of memory-mapping) as the mappings could change between the check and subsequent read anyway.
[0]: https://bugs.java.com/bugdatabase/view_bug?bug_id=8327860, "I've managed to narrow this down to this small reproducer:" section
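For anyone who wants to see the truncation case outside the JVM, here is a minimal C sketch. It is not the reproducer from the bug report: the function name `read_after_truncate`, the temp file path and the use of `sigsetjmp`/`siglongjmp` for recovery are all my own assumptions for illustration. It maps a one-page file, truncates it, and reads the mapping again, with a handler installed for both SIGBUS and SIGSEGV since the signal can differ by platform.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;

/* Jump back to the last sigsetjmp() instead of letting the process die. */
static void on_fault(int sig) {
    siglongjmp(recover_point, sig);
}

/* Maps a one-page temp file, truncates it to zero length, then reads the
 * mapping again. Returns 1 if that read faulted, 0 if it succeeded, and
 * -1 if setup failed. */
int read_after_truncate(void) {
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    size_t len = (size_t)ps;

    char path[] = "/tmp/mmap-truncate-XXXXXX";  /* illustrative temp name */
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);                               /* file lives until close() */

    void *mem = MAP_FAILED;
    if (ftruncate(fd, (off_t)len) == 0)
        mem = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) { close(fd); return -1; }
    volatile const char *map = mem;

    struct sigaction sa, old_bus, old_segv;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old_bus);
    sigaction(SIGSEGV, &sa, &old_segv);

    int faulted;
    if (sigsetjmp(recover_point, 1) == 0) {
        (void)map[0];                 /* backed by the file: reads fine */
        if (ftruncate(fd, 0) != 0) {
            faulted = -1;
        } else {
            (void)map[0];             /* page is past end of file now */
            faulted = 0;              /* only reached if no signal came */
        }
    } else {
        faulted = 1;                  /* arrived here via siglongjmp() */
    }

    sigaction(SIGBUS, &old_bus, NULL);
    sigaction(SIGSEGV, &old_segv, NULL);
    munmap(mem, len);
    close(fd);
    return faulted;
}
```

Note there is nothing to check before the second read: the mapping looks exactly the same to the process, and the only notification it gets is the signal.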
> When a program violates the protections of a page, it gets a SIGBUS or SIGSEGV signal.
(The Linux man pages for mmap and mprotect indicate SIGSEGV would be signaled.)
So the past use and assumption (SIGSEGV or SIGBUS) are consistent with the expectations of mmap and mprotect given the documentation provided.
However, I still stand by my pseudocode - I claim that it will give a fairly accurate impression of the basic concept of zero-cost access checks to a reader who isn’t familiar with low-level systems programming. (That said, I have updated my comment to make it clear it’s more of a metaphor than a literal description.)
[0]: https://mostlynerdless.de/blog/2023/07/31/the-inner-workings...
Just an educated guess, but the JVM knows if a thread may expect a segfault at a given point or not. If no thread expects one, then I assume the segfault handler just writes out that a segfault happened with some useful info, and terminates the program. I mean, I’m sure about the effect as I have caused a JVM to segfault a couple of times with native memory, so it handles it as expected.
I wonder if Oracle really didn't know beforehand.
Apple has long been telling people (writing JITs) that to write to executable memory, they need the correct entitlements (com.apple.security.cs.allow-jit, allow-unsigned-executable-memory, and/or disable-executable-page-protection). I wonder if Oracle has been ignoring them, satisfied with the signal-handler workaround, and Apple finally enforced their policy.
Apple also expects that developers deploying apps on macOS that use Java have these entitlements configured on a per-app basis. Oracle likely objects that this is not really for the application developer to certify, since it's pretty much out of their control.
In any case, I'm doubting Oracle's release is the whole truth.
As far as I understand, that’s not the issue; the JIT itself works just fine. The JVM just uses the (quite common) trick where it doesn’t actually bounds-check everything, but lets the hardware trigger an interrupt, expecting that to “bubble up” to the program at hand, so it can handle certain cases “for free”. This behavior was changed by Apple, which causes issues.
The whole truth is that the Apple kernel team broke user space.
Also amazing it wasn't caught during the beta period.
One can have any sort of semantic versioning that is not SemVer 2.0 compliant and still be useful, see e.g. Rails or Ruby.
Even .NET assemblies are not SemVer 2.0 compliant: their pattern is maj.min.patch.build, but SemVer 2.0 specifies that there can only be three components and build info must come after a plus, like maj.min.patch+build
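To make the difference concrete, here is a small C sketch that rewrites a four-part version string into the plus-separated form. The function name `dotnet_to_semver` is made up for this comment, and it is deliberately simplified: it only handles purely numeric components and does not enforce every SemVer 2.0 rule (for example, leading zeros are silently normalized rather than rejected).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Rewrites "maj.min.patch.build" as "maj.min.patch+build".
 * Returns 0 on success, -1 if the input is not exactly four
 * dot-separated decimal numbers or the output buffer is too small. */
int dotnet_to_semver(const char *version, char *out, size_t out_size) {
    assert(version != NULL && out != NULL);

    /* Reject signs, whitespace and anything else sscanf would tolerate. */
    if (strspn(version, "0123456789.") != strlen(version))
        return -1;

    unsigned maj, min, patch, build;
    char trailing;
    /* Exactly four conversions: a fifth means there was trailing input. */
    if (sscanf(version, "%u.%u.%u.%u%c",
               &maj, &min, &patch, &build, &trailing) != 4)
        return -1;

    int n = snprintf(out, out_size, "%u.%u.%u+%u", maj, min, patch, build);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}
```

So an input like 4.8.1.9037 comes out as 4.8.1+9037, while a plain three-part 1.2.3 is rejected by this helper because there is no build component to move.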
I'm just a lowly JavaScript/TypeScript/PHP programmer, but what is the Very Good Reason that Java is trying to access other processes' memory?
In a typical modern operating system, a memory page can be non-writable and non-executable, writable and non-executable, or non-writable and executable, but not simultaneously writable AND executable.
If you generate executable code at runtime, then you need write access to a page to write the executable code into that page. Then you need to tell the operating system to change the page from writable to executable.
If you then try to write to the page, you’ll get a signal (SIGSEGV or SIGBUS, according to the article).
Oracle’s JVM apparently relies on this behavior: a Java process sometimes tries to write to a page (in its own memory space) that is not marked writable. The JVM then catches the SIGSEGV and recovers (perhaps by asking the operating system to change the page back from executable to writable, or by arranging to write to a different page, or to abort the write operation altogether).
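A rough C sketch of that catch-and-recover pattern, for readers who haven't seen it. This is not the JVM's code: the names `write_to_protected_page` and `on_fault` are mine, and it uses a read-only data page rather than an executable one to keep things simple. Calling mprotect from a signal handler is also an assumption that holds in practice on Linux and macOS, but it is not on the POSIX list of async-signal-safe functions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older macOS/BSD spelling */
#endif

static char *page;                          /* starts out read-only */
static size_t page_size;
static volatile sig_atomic_t faults_handled;

/* If the fault hit our page, make it writable and return: the kernel then
 * re-runs the faulting store. Anything else is treated as a real bug. */
static void on_fault(int sig, siginfo_t *info, void *context) {
    (void)context;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)page;
    if (addr >= base && addr < base + page_size &&
        mprotect(page, page_size, PROT_READ | PROT_WRITE) == 0) {
        faults_handled++;
        return;
    }
    signal(sig, SIG_DFL);   /* unexpected fault: the retried access kills us */
}

/* Writes twice to a read-only page. Returns how many faults the handler
 * absorbed, or -1 if setup failed or the writes did not land. */
int write_to_protected_page(void) {
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    page_size = (size_t)ps;
    void *mem = mmap(NULL, page_size, PROT_READ,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    page = mem;

    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);   /* some platforms report SIGBUS */

    faults_handled = 0;
    volatile char *p = page;
    p[0] = 'x';              /* slow path: fault, handler, retried store */
    p[1] = 'y';              /* fast path: page is writable now */
    int ok = (p[0] == 'x' && p[1] == 'y');

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(mem, page_size);
    return ok ? (int)faults_handled : -1;
}
```

The point is that only the first store pays for the fault; every later store runs at full speed with no check in the code at all. If the OS answers the first store with SIGKILL instead of a catchable signal, the handler never gets a chance to run.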
Basically what it's used for is to implement an 'if' that's super fast on the most likely path but super slow on the less likely path.
It's not super clear what it's being used for (this is often used for the GC but the fact that Graal isn't affected means that likely still works). Possibly they are using this to detect attempts to use inline-cache entries that have been deleted.
It's also pretty common to use memory protection to autoextend stacks... Allocate the stack size you need, ask the OS to mark the page(s) after the stack as protected, catch the signal when you hit the protection, allocate some more stack and a new protected page unless the stack is too big. Works for heaps too.
Let the MMU hardware check accesses, so you don't have to check everything in software all the time.
A fairly common idiom is to use memory protection to provide zero cost access checks, as you can generally catch the signals produced by most memory faults, and then work out where things went wrong and convert the memory access error into a catchable exception, or to lazily construct data structures or code.
So you want the trap, but the trap itself can be handled. It sounds like there’s been a semantic change when the trap occurs for execution of an address or an access to an executable page.
There are also a bunch of poorly documented Mac APIs to inform the memory manager and linker about JIT regions and I wonder if it’s related to those. It really depends on exactly what Oracle’s JVM is trying to do, and what the subsequent cause of the fault is.
Certainly it’s a less than optimal failure though :-/
Accessing other processes' memory is not the concern since virtual memory provides each process the illusion of having the entire address space for itself.
Do not update until Apple fixes the issue.
Btw what sort of problems are you facing? I have had problems with closing figures, but figured it out eventually with a workaround [0].
[0] https://se.mathworks.com/matlabcentral/answers/2027964-matla...
Can you tell from this or any other Oracle bug whether Apple is bending its rules for Java? I can't tell either way.
I would be surprised if they do to be honest (Apple doesn't even catch obvious bugs in the new macOS settings panel, which really makes me wonder if there is a software QA process at all). For 3rd party apps they seem to rely on the software vendors to holler if a macOS update breaks their app. That's why the macOS prerelease versions exist. But since the bug wasn't present in the prerelease, affected vendors couldn't catch it. It's still a fuckup in Apple's release process of course (which tbh also isn't surprising).
I really don't know what Apple would be 'warning' against. Don't use Java? There are tens of thousands of business and development tools depending on the JVM. Blocking Java would diminish the value of macOS tremendously and doing so without warning would open Apple up to lots of lawsuits.
Do you know how long it takes to reproduce? The OP was light on details here. I assume that a memory access issue with the JIT would pop up pretty quickly, though.
This also bothers me on Android. Sometimes, an app update may break something and prevent me from using it. But Google doesn't allow me to reinstall a previously published version from the Play Store. If I don't have to (or can't easily) do without that application until a fix might be released, my only option is to find an older release on some shady mirror site.
That said, I’m curious what exact scenario leads to this. I’m assuming it’s not common, as you would expect it to have come up during betas and pre-release seeds.
The article specifically says that the issue was not present in early access releases, so it was not possible to discover it before the actual release.
:tripplefacepalm:
Somebody hire some engineers at Oracle.
I wonder if we’re about to enter 4-5 years of macOS “dark ages”, due to Apple grappling with EU/DMA.
Much like Microsoft in the early 2000s, between IE/lawsuit and grappling with internet security/viruses. Windows XP, launched in 2001, was considered by most a great OS, but it didn’t get a good successor until eight years later (Windows 7).
I think we already saw some of this, in particular with the recent bullshit they tried to pull with PWAs in iOS 17.4, where they were hoping to just let things break and shift the blame and anger towards the EU instead.
There was an HN post about a HashiCorp founder using Linux within a VM on their MBP. Might adopt that same approach, if I can find the original post.
Worked great for years, until I changed to a job that finally let me bring my own hardware.
It is the client-facing side of Windows that really sucks (warning: strong language), such as having really shitty software known as Office, like god why Word and not LaTeX, and why spreadsheets when we have databases that we can query efficiently? Or not being able to have multi-user RDP sessions due to Microsoft having a licensing dispute with Citrix about 20-ish years ago (fuck you Citrix, you asshole!). Or why do I have to jump through a lot of hoops and install a lot of "C++ redistributable" packages to run some antique software? Or why do I have to jump through a lot of group policy simply to enable WinRM and get remote PowerShell management?
Either way, I'm typing this on a Windows 11 desktop with WSL2 on. The hybrid experience is incredible, unless you need some performance critical app (WSL2 is in general slower than bare metal Windows and bare metal Linux itself, of course, except in machine learning).
Things like using 9P to cross into the Windows file system also introduced a lot of pain, such as permission control, because Windows does not have a POSIX-like permission system. Instead of a simple 2 bytes split into 3 octal digits (there is a reason it maxes out at 777), you have an incredibly sophisticated, capability- and token-based access control system dating back almost 30 years that Linux didn't even have back in the day! But that pile of shit is now full of bugs and exploits such as token/handle duplication. (Oh yes, I'm talking about black hat territory, as I also do some red team CTF work on this stuff.)
An issue introduced by macOS 14.4, which causes Java process to terminate unexpectedly, is affecting all Java versions from Java 8 to the early access builds of JDK 22
If this affects so many versions of Java and nobody notices, is anyone even using Java on macOS?
IntelliJ IDEA, the product itself, is JVM based.
And there's a known issue with an interaction between Minecraft, Java, and the video drivers that crashes out and it can be traced back all the way to here: https://github.com/glfw/glfw/issues/1997
It's not fixed.
The JetBrains team has already figured it out as well.
Did you check the Console app for crash reports?
These behaviours have been historically well documented.