> The model was powerful, but also mentally heavy
No it isn't! It is this interpretation that kills off the null-safety debate entirely. Saying you have a variable that cannot be null is not a mentally taxing distinction, especially since everything is labelled thoroughly.
> The team, faithful to the lesson “simplify the model for the user, even at the cost of the performance ceiling,” ultimately dismantled this dualism.
but it would have simplified it for the user.
The whole attitude and process around this and the other topics gives me very little faith that Java can be steered in a sensible direction here. The type system of a programming language is supposed to give convenient guarantees to the developer on a CPU that can only do numbers. There is no reason to reduce the optional(!) safety guarantees you can offer with the excuse of "too mentally taxing".
Hell, they even get there half way by recognising:
> the language model and the JVM model don’t have to overlap one hundred percent
I agree. The stewardship of Java seems rather lacking - particularly when compared to that of .net, where MS etc. mostly seemed to make the correct decisions from the start.
Does Java even have any value or mindshare at Oracle nowadays? The company seems to be a datacentre/compute business at this point, with appendages for its legacy activities and a vast overhang of debt.
I sometimes wonder if the only parts of Oracle that are still profitable are the Legal and Lawnmower divisions.
Now, as a member of the Java team (although I'm not directly involved in Valhalla), I'm obviously biased so let me just say that both designers and fans of programming language features would do well to remember two things:
1. Opinions about features are almost never universal, even among experts, and almost each of them is about a tradeoff where different people prefer different sides. It is rare that some scientific study settles the issue.
2. These preferences are often not evenly split. Even when both sides are equally confident that their preference is the right one, sometimes 80% or 90% of programmers share a preference. The people with the strongest opinions are more often than not in the minority, because most programmers don't think so much about the programming language (nor, I would say, should they).
All of the language differences between .NET and Java fall in this "non-consensus" zone, and in at least one area I was deeply involved with, virtual threads, I can say that we thought that whatever we do we mustn't do what .NET did, and that what they chose didn't work out well for them at all.
"All differences are opinions....except what .net did. Those are wrong!"
I personally think Java continues to waste a lot of time and come to a slightly more verbose and worse solution again and again.
Structured concurrency is like c# async/await with less sugar. Streams are just LINQ but worse.
It's probably exactly because of the "not like .NET! We must be different to explain being late" mentality.
Curious if you think fibers vs async/await is still in this zone (amongst experts). It seems fibers are objectively better. But I'm no expert*
I was at a conference on scientific programming in Java very early on that Geoff Fox put on up at Syracuse and we had a list of requests from Sun that they didn't give us but Microsoft gave many of them right away.
On the other hand I really like Java's all-virtual approach to inheritance because the .NET model gives programmers more ways to screw up and get confused.
Both languages slipped in generics after 1.0. Java used type erasure in a way that made it so a List<String> is really a List so generics could be retrofitted easily to existing code. .NET's implementation of generics let you do more but caused a rift in the ecosystem between generic and non-generic collections.
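To make the erasure point concrete, here is a minimal sketch (the class and method names are made up for illustration). It only shows that a `List<String>` and a `List<Integer>` share one runtime class, which is what lets pre-generics code keep working:

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // True when both lists have the same runtime class. Generic type
    // arguments are erased, so only the raw class is left to compare.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        System.out.println(sameRuntimeClass(strings, numbers));
    }
}
```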
I'd say long term Oracle's stewardship of Java has been very good. JDK 8 puts lambdas on your fingertips with a very fluent syntax that belies the idea that Java is terribly verbose. Since then Java has gotten steadily better release after release while maintaining great compatibility.
I work with people who are conservative about updates because they are worried about breaking things but for the last few LTS releases I've said "it ought to be really easy, let's give it a try" and it is really easy and we get performance improvements we can feel.
In what way? If anything Java's main developers (employed by Oracle for the most part, working on the completely open source and free OpenJDK) are extremely knowledgeable and are responsible for a big jump in how fast the platform evolves. They have added proper algebraic data types to the language, delivered virtual threads and garbage collectors that decouple pause times from heap size. Like if anything, Java is at the best place it has ever been.
No they haven't. E.g. they added a class that superficially looks like Option but subtly breaks the rules that Option is meant to follow, ensuring that no-one can ever manage to migrate existing codebases away from using `null`.
It is all about having AI on the framework, Aspire, multiple Web and Desktop frameworks all over the landscape.
Those interceptors and inline arrays via attributes instead of proper language grammar aren't that great either.
Yeah. Even when they add new grammar nowadays, it's always just something that trivially sugars away into previous grammar (see: records, `with` clones, extension properties, required, etc).
The moment they need something that it's slightly more complex... Out of scope. Even when it's completely necessary for the thing to be useful in practice.
For example, they added `required`, `record`s and property initializers, giving us good reasons to write `new Foo { A = a, B = b }` instead of `new Foo(a, b)`. A and B must be positive, so you'd write:
public required int A { get; init => field = value > 0 ? value : throw ... ; }
public required int B { get; init => field = value > 0 ? value : throw ... ; }
This is pretty standard C# code that you might see in an example for records. But then the requirements change: A and B must be positive, or they must both be zero at the same time.
This cannot be expressed at all with initializers. You simply cannot add code that runs after all initializers are called. You're stuck chasing every single initialization of Foo and using a constructor or factory method instead. Shipped it as a public API? Too bad. Should have seen it coming!
The new features are filled with this sort of thing. As if Microsoft never used them beyond the most basic examples. Or maybe they did, and explicitly chose not to fix it and to solve it later.
Part of the reason for that is that Java is older. https://en.wikipedia.org/wiki/C_Sharp_(programming_language)...:
“In interviews and technical papers, he has stated that flaws in most major programming languages (e.g. C++, Java, Delphi, and Smalltalk) drove the fundamentals of the Common Language Runtime (CLR), which, in turn, drove the design of the C# language.”
Also, some of Java’s design warts may be there because Java was initially envisioned for much smaller devices.
Second mover advantage.
Wut? I did work on .NET projects and all it achieved was making me like Java a lot more than previously.
To me it felt a bit less like a religion and more like a language. It didn't force me to do things a particular way, quite as much. (Still more than I would have liked, though! After all, it's called that[0] for a reason :)
[0] https://www.reddit.com/r/ProgrammerHumor/comments/ddc4b0/mic...
-Java always has an API, .NET is about extending an existing application (Servlet API vs IIS)
-Java has a nicer IO as .NET has bidirectional streams (You can't wrap streams in .NET).
-Linq is nice but has a huge caveat: if a Linq provider does not implement it fully, it falls back to the .NET collections. So trying to 'Skip' and 'Take' on an Active Directory will fall back to collections in memory and cause a crash on a huge AD in production (Yes, had the pleasure).
-Java's Eco-system is way bigger.
Also, null markers are coming too: https://openjdk.org/jeps/8303099
It's just that they have to deliver things incrementally. This PR that introduces value classes/objects is already 200k lines long.
I think you've missed what this is referring to. It isn't about null safety (which is orthogonal) but about having reference/value projections analogous to Integer/int.
What the Valhalla team ended up doing is, instead of having two projections for each type, one with identity and one without, value types never have identity and so Integer and int are synonymous, and the memory layout is determined automatically based on context and optimisation decisions. This is why the semantics of == for the primitive wrappers (like Integer) were changed, as they now don't depend on whether the "reference projection" or the "value projection" is used.
> There is no reason to reduce the optional(!) safety guarantees you can offer with the excuse of "too mentally taxing".
This is not what happened here.
Except they're not, as I can do Integer x = null, but not int x = null. So an Integer is forced to occupy more memory, for very very unclear reasons. And this is also deeply weird - there is no other (mainstream?) language that allows null value types.
That goal is an ideal and can't be reached perfectly. Converting a type to a value type will break clients that synchronize on them, or rely on identity for some reason. But such cases are rare, and can be weighed up on an individual basis when making the decision about whether to do it. Storing things in a nullable variable on the other hand is very common and changing the rules to prevent it would make every such change a source incompatible breaking change.
If you have language-wars about a concept going in and out of existence, that is a hint that there is demand and the language does not properly handle the demand or when it handles it, it creates mental overload.
> Value
> Errorstates
> Null
> IoExceptions
> WeirdOsStatesNeededToHandleUpstairs
https://fsharpforfunandprofit.com/rop/
As the pythons said: Get on with it!
That said, we've been gnawing on this limb for a while...
This takes longer than game of thrones books
But a huge mistake (IMHO) was not having nullability part of the type system. You can still do this with type erasure.
Anyway, I read your comment as "nullability isn't complex" (paraphrased) but that's not the author's point. What's complex is having a value class and a regular class of every class, where you don't necessarily know which one you're dealing with at the language level.
C++ is a great example of this. You can create an object on the stack or the heap and that's really what we're talking about with that proposal. And that's a nightmare. Combined with pointers it meant you never knew if you could free something or not and that ownership had to be passed around with vague comments like "// retains ownership".
Anyway, the whole article is a great tale of how difficult it is to retrofit things later and how difficult it can be to fix mistakes later (eg java.util.Date).
This is often given as the defensible reason, but it's not even that true. Java 1.5/5 had several "breaking" changes in it regardless including the newly reserved 'enum' and a whole freaking memory model update.
And besides if any of your dependencies updated for all practical purposes you did, too, since you had to use a newer runtime to run their code regardless, it never really made sense to keep using an older javac out of spite
Now, one can argue that this is just smoke and mirrors with type erasure and it is but you can already put a Date into a List<Point> if you're so inclined because the JVM doesn't know the difference, hence type erasure. So this is no different.
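A small sketch of the "Date in a List<Point>" trick, for anyone who hasn't seen it. The `Point` record and the method names are invented here; the only thing it relies on is a raw-type assignment, which the compiler warns about but the JVM accepts:

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

final class HeapPollutionDemo {
    record Point(int x, int y) {}

    // Adds a Date to a List<Point> by going through the raw type.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void smuggleDate(List<Point> points) {
        List raw = points;   // raw view of the same list
        raw.add(new Date()); // succeeds: the JVM only sees a List of objects
    }

    // True if using element 0 as a Point fails with ClassCastException.
    static boolean readFails(List<Point> points) {
        try {
            int x = points.get(0).x(); // compiler-inserted cast to Point
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Point> points = new ArrayList<>();
        smuggleDate(points);
        System.out.println(points.size());
        System.out.println(readFails(points));
    }
}
```

The insert itself goes through; the failure only shows up later, at the point where the element is actually used as a `Point`.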
I'm no JVM expert but from reading the article it seems like the chosen solution for value classes is to treat them all as a single L-type in the JVM where each primitive type is its own L-type. If I read it correctly, it means that if you have a Point value class then on the JVM level you'll be able to stuff any value class into there if you're so inclined, just like with List<Point>.
Obviously we need to be concerned with fuzzing (moreso in C++) but here really we're just trying to have sensible defaults that aren't guaranteed because we can't design the language how we want from the ground up without making a new language.
Oh and there is a proposal for this [2]. Personally, I prefer the Hack version.
This seems heavier? Having two representations and manually having to refer to .val or .ref?
You can argue that the extra flexibility lets you write safer (non-nullable) code but naively it seems more complex at the language level.
What? It’s been getting better with each release. Valhalla brings features that address key problems, and they didn’t rush to it either.
Saying the mental model is too hard is basically saying your userbase is stupid. This stuff is not tricky.
How much was this article proof-read? Didn't they just get finished talking about how heap flattening won't work for objects with > 64-bit representations? Their `Point` is at least 65 bits (two 32-bit ints plus the null flag). The "plus a possible null flag" and oddly short following statements seem to suggest this was some AI that got sidetracked by trying to make emphatic statements... oh and also the "[IMAGE: the same Point[] array in two variants..." block halfway down the page is unfortunate.
that smells of AI [1], and thus lazy writing. I'm all in for using AI to help you write, but if you don't put your voice to it then there's no reason to read it.
[1] https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing#...
Don't be all-in. It's important for humans to be able to write for themselves, and also to stand by what's been written in their name, which is much less likely if someone/something else has done the writing.
(proofreading is another matter though.)
As a proportion of all easily crawled text on the internet, a lot of it will be random marketing copy. That influenced the writing style of early AIs, and since then everyone has trained at least partially on transcripts from every other AI chatbot
It stopped being infuriatingly sloppy and took time to ensure the article had integrity.
It did. Having said that, I did burn through a lot of tokens trying to do a deep analysis cross data pipeline debug.
Please, if you write a technical blog, or anything really: Stop. Stop letting the AI write for you. Nobody wants to read this.
TIL that Rust has NonZeroU64 which you can combine with Option to get the required behaviour with only 64 bits per entry. [1]
Also, apparently, shifting a negative number to the left is UB in C.
Luckily we have Valhalla, which is an admission that Gosling was partially wrong, and programmers who want to have an unsigned nullable non-zero 64-bit integral value type can just make one, and not have to pay outsized memory costs to do so.
If we're done paying homage to Gosling, can we get operator overloading for our fancy value types please? I have no idea if this is on the radar for Valhalla.
> On June 15, Oracle engineer Lois Foltan confirmed what a good chunk of the industry had stopped believing: JEP 401: Value Classes and Objects will be integrated into the main OpenJDK repository and is targeting JDK 28.
> The change is so large that the remaining committers were asked to hold off on bigger commits during the integration. The pull request alone adds over 197 thousand lines of code across 1,816 files.
What in those paragraphs is obviously AI?
« The pull request alone adds over 197 thousand lines of code across 1,816 files. »
I noticed that both Claude and GPT are fond of that kind of stupid accounting statement that doesn't mean a lot in and of itself, but looks impressive in a « wow numbers » way. Which is kind of ironic since counting remains one of their weak points
People care about provenance a lot.
Whether it’s a drawing my daughter did of her mother, a Picasso napkin sketch, a worn 1960s Stratocaster, or a blog essay, the provenance is value on top of the correctness of the item.
But you don't know which parts of it are true.
In this initial commit. As was made clear in the JEP, this is just the first deliverable of a huge feature that, like all Java features in recent years, is being delivered piecemeal. Obviously, the point is to flatten larger values (the mechanism is already in the JVM; what remains is exposing the intent of "I allow tearing" in the language).
Looks like they just missed the `!`. It should be `Point![]`.
Is there a way we can request a "flag as AI garbage" downvote for articles? Or should we just flag them?
It adds a fair bit to that "did anyone proof-read this? pretty obviously no?" vibe.
So == for value classes will basically be like memcmp(). That is a bit unfortunate, as it breaks encapsulation, exposing implementation details. Client code can use this to do case distinctions based on how a given value is internally represented. In a way, it’s worse than identity comparison, because identity comparison at least doesn’t expose internal state.
To see why, consider that to do any useful work, data from different objects (also from different types) has to be combined. To be able to do that in the OOP framework, the encapsulation has to be unwrapped. That's why Java code is littered with getters and setters that don't do any useful work at all, they just make it too painful to get any real work done.
Again, there is a place for objects and implementation hiding, but it's at the highest levels of an architecture where different components get integrated.
It should work even for strings: They will surely continue to be heap-allocated, and memcmp-ing pointers (inside the new "structs") is exactly an identity comparison.
For example, you might have a value class for representing (limited-precision) fractions using two longs internally, for the numerator and denominator. For efficiency trade-off reasons, you don’t want to always shorten the fraction. But now client code can distinguish 2/3 from 4/6 using ==.
Scenarios of that sort are conceivable where this actually leaks sensitive information. In any case, it creates dependencies on implementation details where you don’t want to have them.
When designing a value class, you are now in the dilemma of either always having to normalize the representation, costing performance, or having your class be a funnel for leaking implementation details.
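Here is a rough sketch of the fraction case in today's Java. Running real value classes needs a preview build, so this uses a record and its generated `equals()`, which compares the components, as a stand-in for the field-wise `==` being discussed. `Fraction` and `sameValue` are hypothetical names:

```java
final class FractionDemo {
    // The record's generated equals() compares num and den directly,
    // standing in for the field-wise == of a value class.
    record Fraction(long num, long den) {
        // Mathematical equality by cross-multiplication.
        // Ignores overflow and zero denominators for brevity.
        boolean sameValue(Fraction other) {
            return num * other.den == other.num * den;
        }
    }

    public static void main(String[] args) {
        Fraction a = new Fraction(2, 3);
        Fraction b = new Fraction(4, 6);
        System.out.println(a.equals(b));    // representation-level comparison
        System.out.println(a.sameValue(b)); // value-level comparison
    }
}
```

The two comparisons disagree for 2/3 and 4/6, which is exactly the leak: the representation-level check tells the client whether the fraction was reduced.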
There is a lot wrong with that: complexity, bloat, and slowness.
> But now client code can distinguish 2/3 from 4/6 using ==
That's a great way to obfuscate code. Not a good idea. The right way to do the comparison is, just make a function called CompareRational().
Java separates checking identity and equality for objects. == basically checks if two pointers are the same. Equality is a subjective concept based on an interface (ie equals/hashCode). So this means:
new Integer(1000) == new Integer(1000) // true, used to be false
new Integer(1000).equals(new Integer(1000)) // true
new Integer(10) == new Long(10) // compiler error, used to be false
new Integer(10) == new Integer(10) // true
There's a lot going on here. The complication is that in previous versions of Java (and I'm not sure when this changed), integers below a certain value would be replaced with canonical instances. I think the limit was 128 but it's been a while. This led to the difference between 10 and 1000. That's now changed, I suspect because the above comparisons are being implicitly unboxed. That didn't used to happen either. I say this because the Integer/Long comparison used to return false and it's now a compiler error so there must be unboxing going on. You may still be able to get the old behavior through variables too.
Anyway, if value classes lose identity then == changes from pointer equality to bitwise equality. That will hopefully resolve a bunch of corner cases like this but it is a breaking change, technically.
new Integer(10) == new Integer(10) // true
Before value classes this would always be false. The only time comparing Integer objects with == could be true is if the Integer object was created by going through Integer.valueOf (or obviously if they were the same object reference). By default the cached values were -127 to 127, but that is tuneable at runtime. https://github.com/openjdk/jdk/blob/jdk-27%2B27/src/java.bas...
> By default, Java maintains a cache of Integer objects for values between -128 and +127.
[1]: https://stackoverflow.com/questions/3130311/weird-integer-bo...
[2]: https://dev.to/marzuk16/understanding-integer-caching-in-jav...
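For reference, a minimal sketch of the cache behaviour on a current, pre-Valhalla JDK. It goes through `Integer.valueOf` rather than `new Integer(...)`, since the constructor is deprecated; the class and helper names are just for illustration:

```java
final class IntegerCacheDemo {
    // Reference comparison of two boxed values on a pre-Valhalla JDK.
    static boolean sameReference(Integer a, Integer b) {
        return a == b;
    }

    public static void main(String[] args) {
        // Integer.valueOf is documented to cache -128..127 inclusive,
        // so both calls return the same object.
        System.out.println(sameReference(Integer.valueOf(127), Integer.valueOf(127)));

        // Outside that range the result is not specified and depends
        // on the JVM and its cache settings.
        System.out.println(sameReference(Integer.valueOf(1000), Integer.valueOf(1000)));

        // equals() compares the int values and does not depend on caching.
        System.out.println(Integer.valueOf(1000).equals(Integer.valueOf(1000)));
    }
}
```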
wait, really? I thought introducing _such_ incompatibility was not allowed
If Java was a child, imagine it being brought up by loving parents for the first few years (Sun) then it was thrown in a garage with some other children and neglected by its evil guardian (Oracle)
Neglected and unloved till JDK 8, it's basically been playing catch up.
So when people say "oh so it's now got structs or value types of X", yes it has but that's because it has been stunted in its development due to big bureaucratic and hostile corporate processes, but it's free now and is getting love through the OpenJDK family.
I will continue to enjoy writing once and deploying anywhere!
Whether you like Oracle or not, this is simply not a correct description of Java's history. It was brought up by loving parents, who due to financial problems had to put Java into a foster home where she was neglected.
But later she was adopted by new, loving parents (Oracle) and she bloomed and became a healthy and stable adult.
Like, it was Oracle that completed the open-sourcing of the platform, making OpenJDK the reference implementation. They also open-sourced the previously proprietary jfr, mission control etc tools.
They also managed to keep many of the original members of the language team, which is quite rare during these acquisitions, and Java has seen a huge improvement both on the language and runtime front.
The Java team has been delivering nice language and environment improvements regularly since Java 10.
Same with MySQL, btw. "Dead" according to this site, risen from the dead under Oracle for those who actually know it.
In fact, much of the software industry, which writes the software that matters to our lives the most and holds most of the value delivered by software in general - the software that processes your credit-card transactions, runs your bank, sorts your mail, routes your phone calls, manages the manufacturing of your car and the shipping of your packages, holds your healthcare information, schedules and tracks your flights, and manages your law enforcement and your government - is barely represented here because the organisations that write most software aren't software companies, and they don't tend to publish technical blogs.
> Neglected and unloved till JDK 8, it's basically been playing catch up.
These two statements are contradictory. The last Java version under Sun was in 2006. Oracle bought Sun in 2010. JDK 7 came out in 2011 and JDK 8 in 2014.
The team largely remained the same, and the main difference was that Oracle ended the neglect and funded us more, which is why Java picked up the pace after the acquisition.
> it's basically been playing catch up.
Catch up with who or what? There are only two languages in the world as popular as Java or more: JS/TS, and Python. People who are saying Java is "playing catch up" usually compare it to languages that are doing far, far worse than Java. It's just that people who like certain features think that the language that has them is doing poorly despite them and not because of them. Many times I see people insist that other languages are "doing it right" (or better than Java) even though it is clear that the people who say this are in the minority when it comes to preferred features.
> So when people say "oh so it's now got structs or value types of X", yes it has but that's because it has been stunted in its development due to big bureaucratic and hostile corporate processes, but it's free now and is getting love through the OpenJDK family.
If anything, the opposite is the case. Managers love to see things ship quickly. It is our technical leadership - all people who were there in the Sun days - who insist we have to move deliberately and carefully and get things right. You can agree or disagree with the decisions, but comparing Java unfavourably to languages that are doing far worse is unconvincing.
Rather, I think the vibe comes from Java not being as popular as it was in, say, 2003. And it certainly isn't. But guess what? No other language is, either, because that time was anomalous not only for Java, but for the entire software ecosystem, which had never been as consolidated and unfragmented before or since.
Except to the browser, iOS, embedded systems...
WebAssembly is the real write once deploy anywhere tech now. JVM had its turn and lost.
Serious question: I remember the old installer, six billion devices or whatever. I’ve heard about Java ME, old set-top boxes and DVD players, etc.
But how much of that is active today. I can’t say I’ve ever seen a job listing for an embedded Java developer or even Java ME in my entire career. Are people actually still using it?
Anyway, I wouldn't even call Java "stunted". It made choices, some reasonable, some not, and those are incredibly hard to fix later. Heck, just look at C++. Semi-compatibility with C is (IMHO) an unfixable 150 foot albatross around its neck and so many versions from C++11 onwards have simply been about making that 150 foot albatross more bearable.
I personally think treating all value classes as a single L-type in the JVM (like primitive types, basically) is a fairly neat solution to a difficult problem. But all this comes down to the original Java 5 decision to implement generics as type erasure to maintain backwards-compatibility, something that C# NOPEd out of as a result.
Our work uses modern Java (26 w/ preview features - mainly for StructuredConcurrency), and it's fantastic. Do not regret it one bit, and that's coming from using both Haskell and Python at previous companies.
There's something else amiss here. Compared to other platforms, upgrading Java, even on complex codebases, has never been a nightmare for me.
Good news is, Oracle extended extended extended support for Java 8 will not last forever, and eventually - if you work in a regulated industry - the company WILL have to pull the trigger.
On the other hand, "where there is muck, there is brass", so a little bit of legacy can be beneficial for some.
> In 1995, a memory access cost roughly the same as a CPU operation
Uhm... no?!
Here's a CS paper from 1993(!) about prefetching from cache(!!) because the cache was slower than the ALU. https://www.eecs.umich.edu/techreports/cse/93/CSE-TR-152-93....
It would perhaps make Java look a little bad to say that, in 1995, the prevailing attitude in certain circles was "If it's too slow, just wait for faster hardware - Moore's Law forever baby!" (Of course, Sun was selling, at the time, relatively fast hardware - the slower the software, the faster the required hardware)
The Z80 took 3 cycles to load from memory. A register to register transfer took 4 cycles (including fetching the instruction). Only one of those cycles was instruction execution.
I think the only reasonably mainstream scenario where the CPU would be significantly slower than memory would be the serial CPU designs such as the PDP-8/s.
That said, at the time people were doing cool stuff with 8-bit CPUs, they weren't running software remotely like what we're discussing here. That would have been done on a VAX, which had instruction and data caches.
What really happened, that the article is alluding to is that memory didn't get much faster in absolute terms since the 1980s. CPUs on the other hand did.
E.g. in the 1980s we had 60ns DRAM. Today DDR5 I believe allows about 10ns random access reads best case (6X). Over the same period CPU clock speeds have increased from about 8MHz to 5GHz (600X).
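Plugging those figures into a quick calculation (the numbers are the ones quoted above, not measurements, and the names are made up for the example):

```java
final class MemoryGapDemo {
    // Plain ratio helper: how many times larger a is than b.
    static double ratio(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        // DRAM random access: about 60 ns then, about 10 ns now.
        double dramSpeedup = ratio(60.0, 10.0);
        // CPU clock: about 8 MHz then, about 5 GHz now.
        double clockSpeedup = ratio(5.0e9, 8.0e6);
        // How much further the CPU pulled ahead of memory.
        double gap = clockSpeedup / dramSpeedup;

        System.out.println(dramSpeedup);
        System.out.println(clockSpeedup);
        System.out.println(gap);
    }
}
```

With these inputs the clock ratio comes out at 625, in line with the rough 600X above, so the CPU side has grown about a hundred times more than the memory side.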
> "The defining trait: no identity"
I get that this makes objects behave like primitive types. Maybe that's reason enough. But is it necessary for the performance boost and de-fluffing the objects? Seems like an orthogonal objective
> There’s a catch worth knowing about here, though: flattened data has to be readable and writable atomically (otherwise it risks “tearing” under concurrent access).
Isn't this a race condition and "undefined behavior"..? Having to limit yourself to atomic sizes seems like a huge limitation, to accommodate what is most likely buggy code. Is all the effort only gunna help lil toy ColorRGB examples?
> The points array is a million pointers. Each pointer leads to a separate Point object lying somewhere on the heap.
Does this happen in actuality? One would assume the allocator tries to put stuff sequentially on the heap? It's not a guarantee as with these Value Types, but I'd think you could get similar-ish perf with prefetching in cache. I dunno what's happening under the hood.. But when writing Clojure apps the JVM always reserves absurd amounts of heapspace on my machine (to my annoyance). I'd assume it can find some place to do contiguous allocations..
Which I guess gets me to my last question... where are the benchmarks broski? It all sounds great, but does it actually yield the insane speedups promised?
Great article, well written. But a benchmark would have been a nice "punchline"
Yes. The one part of the JVM GC that can't run concurrently is heap compaction; objects that can be moved by copying and then deleting would be a huge help for that. And it would be awkward to say the object has an identity but can't be wait/notify'd, at which point you need somewhere for the monitor to go.
> Does this happen in actuality? One would assume the allocator tries to put stuff sequentially on the heap?
Yes. Of course it tries, but semantically the pointers are just pointers and the prefetcher can guess but the system still has to chase them.
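To illustrate the difference in shape (not in speed, this is no benchmark), here is a sketch that flattens a `Point[]` by hand into one `int[]`. All names are invented, and the interleaved layout is only an approximation of what a flattened array of value objects might look like:

```java
final class LayoutDemo {
    record Point(int x, int y) {}

    // Array of references: every element is a pointer that has to be
    // followed to reach the separately allocated Point.
    static long sumBoxed(Point[] points) {
        long sum = 0;
        for (Point p : points) {
            sum += (long) p.x() + p.y();
        }
        return sum;
    }

    // Copies the coordinates into one contiguous int[] as x0, y0, x1, y1, ...
    static int[] flatten(Point[] points) {
        int[] xy = new int[points.length * 2];
        for (int i = 0; i < points.length; i++) {
            xy[2 * i] = points[i].x();
            xy[2 * i + 1] = points[i].y();
        }
        return xy;
    }

    // Same sum over the flat layout: a linear walk, no pointers to chase.
    static long sumFlat(int[] xy) {
        long sum = 0;
        for (int i = 0; i < xy.length; i += 2) {
            sum += (long) xy[i] + xy[i + 1];
        }
        return sum;
    }

    public static void main(String[] args) {
        Point[] points = { new Point(1, 2), new Point(3, 4), new Point(5, 6) };
        System.out.println(sumBoxed(points));
        System.out.println(sumFlat(flatten(points)));
    }
}
```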
It feels like an orthogonal objective and honestly arbitrary distinction, yes.
> Isn't this a race condition and "undefined behavior"..? Having to limit yourself to atomic sizes seems like a huge limitation, to accommodate what is most likely buggy code.
I think they meant it like the appearance of atomic behavior from a java multithreading view.
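One way to picture the atomic-size constraint: two 32-bit ints fit in a single 64-bit word that can be read and written in one step, so another thread never sees a half-updated pair. The sketch below does this by hand with `AtomicLong`. It illustrates the constraint only and is not a claim about how HotSpot implements flattening; `PackedPoint` and its methods are hypothetical:

```java
import java.util.concurrent.atomic.AtomicLong;

final class PackedPoint {
    // Packs x into the high 32 bits and y into the low 32 bits.
    static long pack(int x, int y) {
        return ((long) x << 32) | (y & 0xFFFFFFFFL);
    }

    static int unpackX(long bits) {
        return (int) (bits >> 32);
    }

    static int unpackY(long bits) {
        return (int) bits;
    }

    public static void main(String[] args) {
        AtomicLong cell = new AtomicLong(pack(3, -4));
        cell.set(pack(10, 20));     // both fields change in one atomic write
        long snapshot = cell.get(); // both fields are read together
        System.out.println(unpackX(snapshot) + "," + unpackY(snapshot));
    }
}
```

All 64 bits are taken by the two ints here, which is why adding a null flag pushes the pair past what one atomic word can hold.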
> Does this happen in actuality?
Yes, it does happen. Having guarantees on this front leads to better performance.
> But when writing Clojure apps the JVM always reserves absurd amounts of heapspace on my machine (to my annoyance)
Might be a configuration problem?
Arguably flattening mostly makes sense for these only.
And yeah, you are right that allocations happen on something called a thread local allocation buffer, which is basically just a pointer bump in cost and objects allocated one after the other should be physically close in memory for the most part (though an object's creation may require a bunch of other object's creation that would sit in-between). But these have headers, so not as dense as they could be (though due to GCs being generational, they may end up actually closer in the next gen? The in-between temporary objects wouldn't survive for the most part)
The current code will help with `Integer[]`, `Character[]`, etc, as well as combinations of `byte`, `char`, and `int`. Past that it doesn't really help much.
It would be fantastic if we could also flatten something like `Pair` or `Tuple`. However, even with compressed pointers, that is 64 bits, so that, plus the `null` bit, means it can't be flattened, which is a real shame. For various reasons, I have `List<Long>` in numerous places in my code. It would be great if that could also be flattened. However, since a Long is 64 bits, it _also_ can't be flattened. https://openjdk.org/jeps/8316779 would go a long way to helping here, since then at least the null bit could be thrown away, which would allow more things to be flattened.
And then, if you want to go Wishlist land, something that would allow SSO (Small String Optimisation) would also be awesome, but that would require something akin to unions in Java, which we can _kind_ of do with sealed classes, but, since String is a final class, can't be retrofitted back into the language.
Does anyone know if Valhalla will flatten "simple" sealed classes, where every sealed class is small enough to be flattened? Since that would also be a powerful example to share.
In the current setup will a Pair Value Type be a compiler error, or will it silently just have bad perf?
Sad. Hope they can do this by the next LTS JDK.
Given that the JVM could already do escape analysis and allocate regular classes on the stack in certain scenarios, it's very unclear what benefit, if any, this will bring for normal processors for anything except the base wrapper types - even after implementing generic support and nullability for value types in a future JVM.
It will shine fully when they land the next part (nullability), particularly at the intersection of non-null and value. Alternatively, if they introduce tearable semantics it will also shine: it would still be possible to optimize arrays of value classes even if they are nullable (for example, by keeping a corresponding nullability mask).
So they are taking the right step in the right direction. They are just trying to land this incrementally.
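The nullability-mask idea can be emulated by hand today. A rough sketch, assuming a dense `long[]` for the payload and a `java.util.BitSet` as the side mask; `NullableLongArray` is a hypothetical name, and this is not a claim about how the JVM would lay things out:

```java
import java.util.BitSet;

final class NullableLongArray {
    private final long[] payload; // dense storage, no per-element header or pointer
    private final BitSet present; // side mask: bit i set means element i is non-null

    NullableLongArray(int length) {
        payload = new long[length];
        present = new BitSet(length);
    }

    // Returns null for slots whose mask bit is clear.
    Long get(int i) {
        return present.get(i) ? Long.valueOf(payload[i]) : null;
    }

    // Stores the payload and the mask bit as two separate writes.
    void set(int i, Long value) {
        if (value == null) {
            present.clear(i);
            payload[i] = 0L;
        } else {
            payload[i] = value;
            present.set(i);
        }
    }

    int length() {
        return payload.length;
    }
}
```

Note that `set` performs two separate writes, so a concurrent reader could see a mismatched payload and mask. That is exactly the tearing trade-off being discussed.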
I’ve been reading the mailing lists and watched all the videos on the topic, and it is truly inspiring how much they managed to consolidate the design into something that always looked like Java,
while also going far deeper in granularity and in understanding what it even means to be a value type and which optimizations can be done where.
Let's take a stroll down memory lane. First of all, .NET literally started as a Java copy. On top of it, a non-cross-platform one for almost two decades! After having shamed Linux for so long Microsoft finally started porting .NET to other platforms in a non-backward compatible way. A lot of .NET proponents will tell you porting from legacy .NET to .NET Core (which was renamed once again to .NET) would be a quick fix, but it isn't. For example, the shop I used to work in had some important cryptographic libraries which were very painful to port. And then, there's .NET's simplistic garbage collector, which can be quite annoying because it tries to be a one-fit-all solution that basically cannot be tweaked at all, often resulting in unresolvable latency problems. There’s a lot of other stuff, like its ghetto-like ecosystem and the insane fragmentation of GUI libraries.
I also don't get the C# praise. Over the years, it has become quite the bloated language. It feels like Microsoft tries to implement every feature possible without realizing that an enterprise language is supposed to be streamlined. Async/await? Very ugly, very annoying. Java has solved this a lot better with virtual threads and structured concurrency.
I could go on, but these "language wars" are silly and pointless. Both platforms have their pros and cons. Besides, I have a lot of bad things to say about the JVM as well, but it's nice to see Valhalla finally becoming reality. Too late for me personally though.
Like what?
> legacy .NET to .NET Core (which was renamed once again to .NET)
It was always .NET; only versions 1 to 4 of the new one carried the additional "Core", to avoid the confusion that could come from having the same version numbers as the old one.
> there's .NET's simplistic garbage collector ... it tries to be a one-fit-all solution that basically cannot be tweaked at all
Tweaking the GC is definitely not much of a thing in .NET land, but it is far from "cannot be tweaked at all".
Also, there is no exact analogue; C++ is just wholly different, in that you can specify copy/move/destruct semantics on a gradual basis.
Unless your company forces you to use Java for new projects, consider a change
I really hope they give an escape hatch for this. It will make it really hard to extract a lot of the benefit of Valhalla if you can't make a thread-unsafe value class. It's also one of those problems that will be quite hard to run into. You basically need something like this:
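class Bar {
    // Shared array slot that is written without any synchronization.
    static Foo[] value = new Foo[10];

    // Called concurrently from several threads.
    static void setFooFromManyThreads(Foo foo) {
        value[0] = foo;
    }

    // Value record (preview syntax); three ints are wider than a 64-bit write.
    value record Foo(int x, int y, int z) {}
}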
Not something you typically run into, and generally already a thread-safety problem.
The solution is also simple: a `synchronized` block will fix it if you need a tearable class that's written from multiple threads.
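Since tearable value classes don't exist yet, here is a hand-flattened stand-in for the `synchronized` fix: the three fields of one element live in an `int[3]`, and both writers and readers take the same lock. All names (`FlatFooSlot`, `countTornReads`) are invented for this sketch:

```java
final class FlatFooSlot {
    private final Object lock = new Object();
    private final int[] slot = new int[3]; // hand-flattened x, y, z of one element

    // All three fields are written under the lock, so they change together.
    void write(int x, int y, int z) {
        synchronized (lock) {
            slot[0] = x;
            slot[1] = y;
            slot[2] = z;
        }
    }

    // Takes a consistent snapshot under the same lock.
    int[] read() {
        synchronized (lock) {
            return slot.clone();
        }
    }

    // Two writers store (1,1,1) and (2,2,2); returns how many mixed snapshots were seen.
    static int countTornReads(int iterations) {
        FlatFooSlot s = new FlatFooSlot();
        Runnable ones = () -> { for (int i = 0; i < iterations; i++) s.write(1, 1, 1); };
        Runnable twos = () -> { for (int i = 0; i < iterations; i++) s.write(2, 2, 2); };
        Thread w1 = new Thread(ones);
        Thread w2 = new Thread(twos);
        w1.start();
        w2.start();
        int torn = 0;
        for (int i = 0; i < iterations; i++) {
            int[] v = s.read();
            if (v[0] != v[1] || v[1] != v[2]) {
                torn++;
            }
        }
        try {
            w1.join();
            w2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return torn;
    }

    public static void main(String[] args) {
        System.out.println("torn reads: " + countTornReads(100_000));
    }
}
```

With the lock in place the count stays at 0. Removing the two `synchronized` blocks is what would make a mixed snapshot such as (1, 2, 2) possible, although it may still be rare in practice.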
But the other thing is that for SIMD operations, you really need flattening, and that really does typically mean having something like `Foo(double x, double y, double z)` in play. It'd be a shame if the way we have to do this is a struct of arrays.
You can assign the object again to overwrite it 'in place'.
> And a simple write-lock bit for fat Value Types would solve everything while maintaining most of the performance benefits (both on read and write)
They even already have an extra 'null' bit tacked on to the value object.
If I have a function that has a value `x` that erases to `java.lang.Object` (e.g. a parametric function with no lower bound); then it used to be safe to check for nullity and then synchronize on the object.
This is no longer safe: This can now throw `IdentityException` into your face. (it was _never_ a good idea)
In other words, a lot of old code must be reviewed.
I suspect that `-XX:DiagnoseSyncOnValueBasedClasses=2` will need to stay (with the semantics: if user tries to synchronize on identity-less object, then log a JFR event and make it a NOP, don't throw an exception)!
The current JEP text is a little too ambiguous to figure out whether that is the plan, anyways.
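For code under review, the usual way out is to stop synchronizing on the erased value at all and lock on a private object that is guaranteed to have identity. A small sketch of that idiom in plain Java; `Guarded` and `hammer` are hypothetical names, not anything from the JEP:

```java
import java.util.function.UnaryOperator;

final class Guarded<T> {
    private final Object lock = new Object(); // always an identity object
    private T value;                          // may be any T, identity or not

    Guarded(T initial) {
        this.value = initial;
    }

    T get() {
        synchronized (lock) {
            return value;
        }
    }

    // Applies the update while holding the private lock, never a lock on the value.
    void update(UnaryOperator<T> step) {
        synchronized (lock) {
            value = step.apply(value);
        }
    }

    // Increments a shared Integer from several threads and returns the final count.
    static int hammer(int threads, int perThread) {
        Guarded<Integer> counter = new Guarded<>(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.update(n -> n + 1);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

This works the same whether `T` ends up being `Integer`, a value class, or an ordinary identity class, so it does not depend on how the JEP resolves the ambiguity.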
if you really want a fun drawing get a human artist to do it. it doesn't need to be complicated, for example https://www.code-cartoons.com/ is mostly just stick figures and does an excellent job
but you don't even need any of that, a mermaid diagram would have worked perfectly fine too. instead you chose to use a technology that is known to be harmful
If you don't have the time or put in the effort to make your article, I'm not going to spend time and effort reading it. You really don't need some generic cartoon guy hovering over your graphs, draw them in MS paint or something.
They basically want to fix Java's main design flaw, the "(almost) everything is a reference" paradigm. C++ and Rust have had value types from day one.
> 64 bits, including the null flag
So, this basically makes every value object optional, adds extra overhead and makes code more prone to null pointer dereference errors.
> but a class with, say, two int fields or one double may not fit in an atomic write and end up as an ordinary object on the heap anyway
So, the whole optimization is applied only to very small structs with no more than two scalars (or so). Was it worth spending 10+ years of development to achieve this?
Java JEPs are piecemeal; there are plenty of other JEPs building on top.
top-level page: https://openjdk.org/projects/jdk/28/spec/
JEP status: https://bugs.openjdk.org/secure/Dashboard.jspa?selectPageId=...
I'd really like to see someone trace related developments in C#, Swift, Java, and Rust, since they all have been racing to catch up to hardware, and I believe they are cross-pollinating.
(My concern is how all this will affect the FFI memory shares.)
Not sure if it covers exactly the same terrain, but perusing the article, it seems to be the case, with a single instance being the degenerate case.
I've made something like this in the past. And I did it exactly because `List<Foo>` was too expensive and slow.
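import java.util.AbstractList;

class FooSOA extends AbstractList<FooSOA.Foo> {
    final double[] x, y, z; // one array per field (struct of arrays)

    FooSOA(int size) { x = new double[size]; y = new double[size]; z = new double[size]; }

    @Override public Foo get(int index) { return new Foo(index); }
    @Override public int size() { return x.length; }

    // Inner class, not a record: records are implicitly static and cannot reach FooSOA.this.
    final class Foo {
        final int index;
        Foo(int index) { this.index = index; }
        double x() { return FooSOA.this.x[index]; }
        double y() { return FooSOA.this.y[index]; }
        double z() { return FooSOA.this.z[index]; }
    }
}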
Probably not. -XX:MaxRAMPercentage=70
But they are working on removing that: https://openjdk.org/jeps/8377305
Value types, generic specialization, boxing - a quick skim makes it look like they made the same choices.
The `Point[]` in the image tag of your LLM output crashed your image generation post processing.
If sane generics had been introduced, the non-generic collection types would be a historical oddity by now and we wouldn’t be talking about how a future enhancement is going to undo the erasure decision.
What will this code print:
Point a = new Point(10, 10);
Point b = a;
a.x = 100;
System.out.println(b.x);
Until now the answer was obvious. Now, with the addition of value classes, the answer depends on whether Point is a value class or a reference class. So readability suffers with this design.
This is a violation of the principle of uniformity. In The Psychology of Computer Programming, Weinberg explains that uniformity is a psychological principle which says that users/programmers expect that things that look similar should do similar things, and conversely that things that look different should do different things.
If a programming language lets two constructs look nearly identical at the use site while having meaningfully different semantics, it increases the cognitive burden on the reader. Programmers must inspect the type declaration or rely on tooling to understand whether assignment, equality, identity, and mutation behave like ordinary reference objects or like values. That can make code harder to reason about and maintain.
This could have been fixed by requiring the use of the "value" keyword not just at declaration time but also at use time like this:
value Point a = new Point(10, 10);
Point b = a;
a.x = 100;
System.out.println(b.x);
If you want to change an element of such an array you need to create a new immutable struct, which in practice is quite fast but a bit verbose to write.
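For the Java side, the "create a new immutable value" step looks roughly like this. The sketch uses an ordinary record as a stand-in, on the assumption that value classes behave the same way here (as I read the current JEP 401 text, their fields are implicitly final); `withX` and `PointArrayDemo` are names invented for the example:

```java
record Point(int x, int y) {
    // "Wither": returns a copy with x replaced; the original is untouched.
    Point withX(int newX) {
        return new Point(newX, y);
    }
}

final class PointArrayDemo {
    // Returns { updated array element, previously taken alias }.
    static Point[] demo() {
        Point[] points = { new Point(10, 10), new Point(20, 20) };
        Point alias = points[0];
        // No in-place mutation: the array slot is overwritten with a new value.
        points[0] = points[0].withX(100);
        return new Point[] { points[0], alias };
    }

    public static void main(String[] args) {
        Point[] result = demo();
        System.out.println(result[0]); // Point[x=100, y=10]
        System.out.println(result[1]); // Point[x=10, y=10]
    }
}
```

Because there is no field assignment, a statement like `a.x = 100` would simply not compile for such a type, and the alias never observes the change.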
Scalarization can fail in surprising ways just due to what a maximal atomic write can be on the target platform, and then it falls back to heap-allocated objects.
Even if there's type erasure.
I'd much rather have the compiler balk at me than let me write something that may or may not work as expected.
That seems off. They're still objects, the new thing is that they can give up identity.
1. Can someone remind me why it was so important/intentional at the start of the language that every object has identity? 2. Why is it important that we not synchronize on these value objects?
Since when? I’m pretty sure structs didn’t have identity last time I used C#, and that would be a very surprising thing to add.
What is unclear to me is why the decision to use a Point instance as a value or as a reference is made in the class definition rather than by the caller.
> Point[] point = new Point[10];
For the same class, I might need an array of values in one place and an array of references elsewhere within the same codebase.
And that across 2819 commits.
Wow, that’s insane.
I do not think you can do stack allocation in Java.
Fun read.
I don’t know if this is a fair way to try to disarm your critics. The only thing that’s remained after this decade is the slogan, so it’s a real ship of Theseus question whether Valhalla has shipped, since what’s delivered doesn’t achieve it. Congrats on the accomplishment, but looking at what ended up being delivered, I’m not sure it’s a huge improvement.
> The trouble is that this optimization is unpredictable and fragile.
Is this describing escape analysis or value classes? Because the list of exclusions where this does anything is so large and the conversion to a heap type under the hood is so transparent and opaque, I think it can describe this technique as well.
Also, the whole “works like an int” motto is violated: an int is never null, and int -> Integer boxing is explicit and well understood.
> In the new model, the wrapper classes themselves become value classes (when preview is on, Integer, Long, Double, and company lose their identity
Oh neat, they sidestep that by changing the definition of an int. I’m sure it’ll be trivial to turn this on in the wild on code that may be relying on identity for boxed numerics. I think this alone shows this project can’t ever be turned on by default and now we’ll have a decade of two Java languages (one with value types and one without) as they try to convince everyone to migrate and then just turn it on (ie python3).
So much opportunity squandered and dismissing critics as always having something to complain about is a neat way to sidestep legitimate criticism that this approach is not going to work out for Java.
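For anyone wondering what "relying on identity for boxed numerics" looks like in existing code, it is usually an accidental `==` on wrappers. A short sketch in plain Java; `BoxedIdentity` and its two helpers are made-up names:

```java
final class BoxedIdentity {
    // Identity comparison: asks whether both references point at the same box.
    static boolean sameObject(Integer a, Integer b) {
        return a == b;
    }

    // Value comparison: independent of how or where the boxes were created.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Integer small1 = 127, small2 = 127;  // inside the range the JLS requires to be cached
        Integer large1 = 1000, large2 = 1000; // outside that range

        System.out.println("small ==     : " + sameObject(small1, small2));
        // The next result depends on JVM caching settings, which is the fragile part.
        System.out.println("large ==     : " + sameObject(large1, large2));
        System.out.println("large equals : " + sameValue(large1, large2));
    }
}
```

Code that only ever uses `equals` on wrappers should not care whether `Integer` has identity; code that uses `==`, `synchronized`, or identity-based maps on wrappers is what would need auditing.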