Hoping Java will get this one day, but probably not...
Combine those annotations with a linter + pipeline that marks nullability warnings as errors and you've come pretty close to Kotlin's advantages. Of course, Kotlin also has some more advanced mutability controls and other advantages that Java doesn't get for free.
When it comes to simple values, null vs non-null can be solved by using primitives (long) instead of objects (Long), as primitives can never be null.
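To make the primitive-versus-boxed point concrete, here is a minimal sketch. The class and method names are made up for illustration; the behaviour shown (auto-unboxing a null Long throws, primitives default to 0) is standard Java.

```java
class PrimitiveVsBoxed {
    // A boxed Long is a reference and may be null; unboxing it then fails at runtime.
    static boolean unboxingNullThrows() {
        Long boxed = null;
        try {
            long value = boxed; // auto-unboxing calls boxed.longValue()
            return false;       // not reached
        } catch (NullPointerException e) {
            return true;
        }
    }

    // A primitive long always holds a value; array elements and fields default to 0.
    static long defaultPrimitive() {
        long[] values = new long[1];
        return values[0];
    }

    public static void main(String[] args) {
        System.out.println(unboxingNullThrows());
        System.out.println(defaultPrimitive());
    }
}
```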
You can mark the default state. I like to mark everything as NotNull, unless specified otherwise. That way only Nullable annotations are needed at the rare occasion null is a valid value.
And I believe it gives you the exact same guarantees as Kotlin, minus the syntactic sugar — nullability is one of the few things that can be statically analyzed.
Most linters also know the standard library’s nullability information, so it’s quite good.
[0] https://kotlinlang.org/docs/java-interop.html#null-safety-an...
Disagree. The Kotlin way of doing it leads to really subtle bugs in generic code, because T? is usually different from T but sometimes it's not. (For example, if you write a generic cache that caches the result of f(x) in a map, it's really easy to accidentally write code that doesn't cache the result if it's null, and not notice).
Also a lot of the time you don't actually want Optional, you want Either, because you want to know why the value wasn't present. Either is really limited in Kotlin.
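The caching pitfall mentioned above is not specific to Kotlin. A rough Java sketch of the same trap, assuming a hypothetical lookup function that legitimately returns null: Map.computeIfAbsent records no mapping when the function returns null, so the null result is recomputed on every call unless it is wrapped.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class NullCacheDemo {
    static int calls = 0;

    // Hypothetical expensive function; null is a valid result for the empty key.
    static String lookup(String key) {
        calls++;
        return key.isEmpty() ? null : key.toUpperCase();
    }

    // Naive cache: a null result is never stored, so lookup runs again each time.
    static int naiveCalls() {
        calls = 0;
        Map<String, String> cache = new HashMap<>();
        cache.computeIfAbsent("", NullCacheDemo::lookup);
        cache.computeIfAbsent("", NullCacheDemo::lookup);
        return calls;
    }

    // Wrapping the result keeps "computed, but absent" distinct from "not computed yet".
    static int wrappedCalls() {
        calls = 0;
        Map<String, Optional<String>> cache = new HashMap<>();
        cache.computeIfAbsent("", k -> Optional.ofNullable(lookup(k)));
        cache.computeIfAbsent("", k -> Optional.ofNullable(lookup(k)));
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("naive: " + naiveCalls() + " calls, wrapped: " + wrappedCalls() + " calls");
    }
}
```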
1. They are not composable (can't map or flatmap or fold/reduce them).
2. They can only represent one extra value, if you need more, you are back to square one (eg. you can't return an error value, only the fact that there is no value).
If we make another step, one could argue that even optionals are lacking, one should model the possible domain values with sums and products in such a way that no nulls or optionals are required. Do not try this in a language with such a basic type system as Java or even Kotlin though, you will run into the limits of the type system almost immediately.
sealed interface Option<T> permits Some, None {}
record Some<T>(T value) implements Option<T> {}
record None<T>() implements Option<T> {
static <T> None<T> none() { return new None<>(); // can also be a single instance
}
}
The only less than ideal part is that None needs the generic type, but that can be easily circumvented by adding a generic helper method. You can add all the Monad goodies to the Option interface and you will even get exhaustive switch cases with pattern matching. The only thing Java’s type system can’t express is abstracting those Monad goodies, but it can absolutely implement them on a case-by-case basis.
The fundamental problem in Java is, however, what you stated in your last sentence: you are limited in abstraction, in most cases you have to implement the specifics.
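As a sketch of what "implementing the Monad goodies on a case-by-case basis" could look like for the sealed Option above: the map, flatMap and orElse methods and the parse/half helpers are illustrative assumptions, not anyone's actual library. It uses instanceof patterns so it compiles on Java 17; on Java 21 the same bodies could be written as an exhaustive switch.

```java
import java.util.function.Function;

class OptionSketch {
    sealed interface Option<T> permits Some, None {
        // Apply f to the wrapped value, if there is one.
        default <R> Option<R> map(Function<? super T, ? extends R> f) {
            if (this instanceof Some<T> s) {
                return new Some<>(f.apply(s.value()));
            }
            return new None<>();
        }

        // Like map, but f itself may report absence.
        default <R> Option<R> flatMap(Function<? super T, Option<R>> f) {
            if (this instanceof Some<T> s) {
                return f.apply(s.value());
            }
            return new None<>();
        }

        default T orElse(T fallback) {
            return this instanceof Some<T> s ? s.value() : fallback;
        }
    }

    record Some<T>(T value) implements Option<T> {}
    record None<T>() implements Option<T> {}

    // Hypothetical helper: parse an integer or report absence.
    static Option<Integer> parse(String text) {
        try {
            return new Some<>(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return new None<>();
        }
    }

    // Hypothetical helper: halve even numbers only.
    static Option<Integer> half(int n) {
        if (n % 2 == 0) {
            return new Some<>(n / 2);
        }
        return new None<>();
    }

    public static void main(String[] args) {
        System.out.println(parse("8").flatMap(OptionSketch::half).map(x -> x + 1).orElse(-1));
    }
}
```

Note that map and flatMap have to be written again for every such type (Option, Either, ...), which is the abstraction limit discussed above.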
I never found the talk online, but here's the speaker deck: https://speakerdeck.com/mariofusco/monadic-java
1. They are not composable (can't map or flatmap or fold/reduce them).
Kotlin helpfully added mapNotNull() and similar methods.
2. you can't return an error value, only the fact that there is no value
Yes I much prefer Rust-like return values, the non-local control flow of exceptions leads to convoluted code and improper error handling.
Is Kotlin better because it works out of the box or are there differences in the feature set?
But regarding the feature I imagine it is the same. Or are there cases where the Java Null Analysis fails?
Disagree. If you don't PUT the nulls into the language, you don't need a brigade of PhDs to develop the static analysis to tell you whether you have nulls.
I'm sick of worshipping at the altar of backward compatibility. Just because we used to choose to include nulls doesn't mean we need to keep choosing to include them.
Instead of having to wrap an optional value, you just annotate the type as being the union of something _or_ null. You get the same guarantees, but it actually composes openly instead of having to create a closed, specific construct that enumerates variants.
No.
(The reason it works is because Java doesn't actually allocate new Longs for small numbers, it fetches them from a table of constants; it's always the same 256 objects that are being dereferenced. I don't know their memory layout, but I'd half expect them to be sequential in memory, as that would be a low-hanging-fruit optimization. Optional<Long>'s performance is what you'd expect without these optimizations. Also in this scenario you really should use OptionalLong instead of Optional<Long> but that's beside the point ;-)
Normal usage of this stuff is not going to cause any more issues than trigger the OCD of people that obsess about this stuff. And in the rare case that you do have an issue, you can do something more optimal indeed.
What this means is that in practice, pointer chasing is less of an issue than you’d expect. Even a linked list will end up with decent cache locality.
Obviously this won’t always work, but it generally works a lot better than the same structure in a systems language.
https://shipilev.net/jvm/anatomy-quarks/11-moving-gc-localit...
You can get away with referencing the data through (mutable and reusable) pointer objects that reference memory mapped areas yet provide a relatively comfortable higher level interface. This gets rid of object churn while keeping a relatively sane interface.
There are plenty of fast performing databases and other middleware written in Java. The JVM is a popular platform for that kind of thing for a good reason. Writing good software of course is a bit of a skill. Benchmarks like this are kind of pointless. Doing an expensive thing in a loop is slow. Well duh. Don't do that.
System.out.println(Integer.valueOf(22) == Integer.valueOf(22)); // true
System.out.println(Integer.valueOf(2200) == Integer.valueOf(2200)); // false
Which is a bit confusing to say the least. I realize one should never be using == with objects, but still.
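For completeness, a small sketch of the usual fix: compare boxed values with equals rather than ==. The class and method names are made up; the guaranteed cache range of -128 to 127 for Integer.valueOf comes from the Java documentation, and identity outside that range is not guaranteed either way.

```java
class BoxedEquality {
    // Identity comparison: only reliable inside the guaranteed cache range (-128..127).
    static boolean sameIdentity(int v) {
        Integer a = Integer.valueOf(v);
        Integer b = Integer.valueOf(v);
        return a == b;
    }

    // Value comparison: works for every int.
    static boolean sameValue(int v) {
        return Integer.valueOf(v).equals(Integer.valueOf(v));
    }

    public static void main(String[] args) {
        System.out.println(sameIdentity(22));
        System.out.println(sameValue(2200));
    }
}
```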
x = 2
print(x is 2)  # True
x = 2000
print(x is 2000)  # can be False in CPython (e.g. when entered line by line in the REPL)
In Python the cases where you’d be using `is` are a lot more restricted, and a bit of an optimisation (although most style guides will require it in the cases where it makes sense).
There’s basically 3 cases where you’d use `is` in Python:
- the standard singletons None, True, and False (and for the latter two you’d usually use truthiness anyway)
- placeholder singletons (e.g. for default values when you need something other than None)
- actual identity check between arbitrary objects (which very rarely happens IME)
In a nutshell, on my old i5 2500k:
ints 0.6ns/op
cached Integer 0.6ns/op
boxed Integer 1.3ns/op
OptionalInt 3.5ns/op
Optional<Integer> 4.2ns/op (time includes boxing the int)
Where an op is getting the number, checking it, then an addition.
For hot loops inside a jmh benchmark, you can use @OperationsPerInvocation(MAX) and it will spit out the results in this more readable format for the time just inside the loop.
Note that "Long" in Java can be null because it is boxed, "long" (lowercase) however cannot be null, but it also can't be Optional<long>. Java sucks :)
EDIT: I'd love to see a C# version of this.
Rust has optionals built into the language. Rust's philosophy is to be a super powerful tool, language complexity be damned.
I find it hard to say java sucks in this context. Each language is making trade offs that align with their vision.
enum Option<T> {
Some(T),
None,
}
What makes Rust fast here is that it has value types and can optimise them.
I think using primitive types as generics is something that makes Java less ergonomic than C# (where they’re called unmanaged types), whether it is considered justified or necessary.
To say Java sucks because of this is a bit much. To say Java sucks because you can’t avoid null is definitely warranted. (You can say good things about Java, and not being able to opt out of nulls is not one of them.)
This is not Rust's philosophy at all.
GNU Trove is a collection library that focuses on optimizing for primitive types and is significantly faster than Java collections which require boxing.
Except primitive types like long in this case, which are not objects. This was a performance-consistency tradeoff made in the early 90s. It made sense at the time and now doesn't make sense to some people, but that's ok. I wouldn't say Java sucks because of that either. Now type erasure, that's a different topic.
Everything except primitive types, functions, and arrays (of any type). The different status of arrays can be a real pain.
Ruby says the same thing, and they're even worse about functions behaving differently than objects do.
> The task was to compute a sum of all the numbers, skipping the number whenever it is equal to a magic constant. The variants differ by the way how skipping is realized:
> 1. We return primitive longs and check if we need to skip by performing a comparison with the magic value directly in the summing loop.
> 2. We return boxed Longs and we return null whenever we need to skip a number.
> 3. We return boxed Longs wrapped in Optional and we return Optional.empty() whenever we need to skip a number.
Seems pretty reasonable to me.
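For readers who haven't opened the article, the quoted variants could look roughly like this. It is a sketch, not the article's benchmark code: the MAGIC value, the `n & 0xFF` getter and all names are assumptions, and OptionalLong is added as a fourth variant since it comes up elsewhere in the thread.

```java
import java.util.Optional;
import java.util.OptionalLong;

class SumVariants {
    static final long MAGIC = 7; // hypothetical magic constant

    static long getSimple(long n) {
        return n & 0xFF;
    }

    static Long getNullable(long n) {
        long i = n & 0xFF;
        if (i == MAGIC) {
            return null;
        }
        return Long.valueOf(i);
    }

    static Optional<Long> getOptional(long n) {
        long i = n & 0xFF;
        return i == MAGIC ? Optional.empty() : Optional.of(i);
    }

    static OptionalLong getOptionalLong(long n) {
        long i = n & 0xFF;
        return i == MAGIC ? OptionalLong.empty() : OptionalLong.of(i);
    }

    // 1. Primitive longs, compared against the magic value in the loop.
    static long sumSimple(long count) {
        long sum = 0;
        for (long n = 0; n < count; n++) {
            long v = getSimple(n);
            if (v != MAGIC) sum += v;
        }
        return sum;
    }

    // 2. Boxed Longs, null means "skip".
    static long sumNulls(long count) {
        long sum = 0;
        for (long n = 0; n < count; n++) {
            Long v = getNullable(n);
            if (v != null) sum += v;
        }
        return sum;
    }

    // 3. Boxed Longs wrapped in Optional, empty means "skip".
    static long sumOptional(long count) {
        long sum = 0;
        for (long n = 0; n < count; n++) {
            Optional<Long> v = getOptional(n);
            if (v.isPresent()) sum += v.get();
        }
        return sum;
    }

    // 4. OptionalLong avoids the inner Long box but is still an object.
    static long sumOptionalLong(long count) {
        long sum = 0;
        for (long n = 0; n < count; n++) {
            OptionalLong v = getOptionalLong(n);
            if (v.isPresent()) sum += v.getAsLong();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumSimple(1_000_000) == sumOptional(1_000_000));
    }
}
```

All four return the same sum; they differ only in how much allocation the JIT has to eliminate.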
First having to declare the value in the one type of four that makes least sense, then praying that the compiler optimizes the allocation of not one but TWO(!) objects(!) in order to represent "maybe a number" is basically why I ragequit Java almost 20 years ago.
Java’s tradeoffs are maintainability in huge teams over multiple years with relatively fast performance even if you write your code very naively, with top notch tooling, observability, etc. In the rare case you have to optimize in the hot loops you can allow to have less readable code like I mentioned.
Without Valhalla
- OptionBenchmark.sumSimple avgt 5 328,110 us/op
- OptionBenchmark.sumNulls avgt 5 570,800 us/op
- OptionBenchmark.sumOptional avgt 5 2223,887 us/op
- OptionBenchmark.sumOptionalLong avgt 5 1201,987 us/op
With Valhalla
- OptionBenchmark.sumSimpleValhalla avgt 5 327,927 us/op
- OptionBenchmark.sumNullsValhalla avgt 5 584,967 us/op
- OptionBenchmark.sumOptionalValhalla avgt 5 572,833 us/op
- OptionBenchmark.sumOptionalLongValhalla avgt 5 326,949 us/op
OptionalLong is now as fast as simple sum. And SumOptional is now as fast as SumNulls. So the overhead of using OptionalLong and Optional<Long> seems to have gone away with Valhalla.
It would be great if boxing could be eliminated as well. But few people write code like what is being benchmarked (in hot loops) in practice.
                 Pre Valhalla   | Valhalla
sumSimple         328,110 us/op |  327,927 us/op
sumOptionalLong  1201,987 us/op |  326,949 us/op
sumNulls          570,800 us/op |  584,967 us/op
sumOptional      2223,887 us/op |  572,833 us/op
sumNullsValhalla and sumOptionalValhalla return 584,967 us/op and 572,833 us/op respectively
sumSimpleValhalla and sumOptionalLongValhalla return 327,927 us/op and 326,949 us/op respectively
- OptionBenchmark.sumOptional avgt 5 2223,887 us/op
vs
- OptionBenchmark.sumOptionalValhalla avgt 5 572,833 us/op
At least to me, on the first read it looked like a comparison of two similar 5k-ish values.
Sum Optional long is about 4x faster: roughly 1.2 milliseconds per operation (pre Valhalla) vs roughly 0.33 milliseconds per operation (with Valhalla). The comma in the figures above is a decimal separator.
Before the recent generics everyone wrote Golang like Java without boxing and didn't complain, so why not :).
Why would Rust be cheating here? Java cannot make these types of optimizations yet (though they are likely coming with Project Valhalla) but that doesn't mean Rust should be similarly handicapped in benchmarks.
Java has many smart optimizations and advantages over Rust (being garbage collected for one, making it much easier to write code in, and runtime reflection, a blessing and a curse) and with tricks like rearranging objects to make more effective use of CPU caches you can end up writing Java that's very close in performance to native, precompiled code.
However, when it comes to raw performance, you shouldn't expect the standard JVM to come close to Rust. There is inherent overhead in the way the language and the runtime are designed. There is no "cheating" here, the algorithms are the same and some languages just produce more efficient code in these scenarios. You wouldn't slow down the JVM to make the benchmark fair for a Python implementation either!
A more interesting comparison may be compiling Java to native assembly (through Graal for example) so Java too can take advantage of not having to deal with reflection and using SIMD instructions.
Alternatively, a Java vs C# rundown would also be more interesting, as both languages serve similar purposes and solve similar problems. C#'s language-based approach to optional values has the potential to be a lot faster than Java's OOP-based approach but by how much remains to be seen.
Java vs Kotlin may also be interesting to benchmark to see if the Kotlin compiler can produce faster code than Java's Optional; both run inside the same JVM so the comparison may be even better.
In fact it is mostly the opposite, all the Kotlin concepts that don't exist in Java (the language), need additional bytecodes to fake their semantics on top of JVM bytecodes optimized for Java semantics.
Like functions, lazy initializations, delegation, or co-routines.
Do you mean, "javac can also implement them if it is modified to do so"? Because you are also making the case that Kotlin is syntax sugar on top of Java, when it is actually a bytecode-generating compiler in its own right, so I'm not sure how to understand this comment.
Anyone else has to generate boilerplate code to emulate the semantics expected by those bytecodes, which is easily shown by running javap on the .class files.
So this test is as “unfair” as benchmarking Rust’s allocation performance against Java, for example
Either[null, T]
but an Either[null, boxed T, boxed null]
‘Boxed null’ is what the documentation (https://docs.oracle.com/javase/8/docs/api/java/util/Optional...) calls “an empty Optional”.
That means that, for example, an Optional[Byte] can have 258 different values and cannot, in general, be compiled to a “pointer to byte” because that has only 257 different values.
Edit: reading https://news.ycombinator.com/item?id=35133241, the plan is to change that. I fear that, by the time they get around to that, lots of code will handle the cases null and Optional containing null differently, making that a breaking change.
The post your link links to explains exactly how they intend to avoid this problem.
IMO, that’s not solving the problem, but doing the best you can once you’ve decided to implement Optional as a reference class now and as a value class at some future time.
I think I would have waited for the proper implementation.
For what it's worth, Java's also got this class: https://docs.oracle.com/javase/8/docs/api/java/util/Optional...
Although in practice there isn't much performance difference in my experience.
Yes, I was working with code once which wrapped string IDs into a FooId object (a good idea in principle) and all of the following had different meanings:
FooId x = null;
FooId x = new FooId(null);
FooId x = new FooId("");
FooId x = new FooId(...an actual ID...)
I think one was for not showing any content at all, one was for showing default content, another was that content was there but the user wasn't allowed to see it, etc.
I'm so glad I left that company...
Optional.ofNullable(null) == Optional.empty()
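To spell out the states being discussed here, a short sketch (class and method names are made up; the Optional calls are the standard java.util.Optional API). An Optional variable can be a null reference, empty, or present, but it can never hold null as its value.

```java
import java.util.Optional;

class OptionalStates {
    // An Optional variable is itself a reference, so it has three possible states.
    static String describe(Optional<String> value) {
        if (value == null) {
            return "null reference";
        }
        return value.isPresent() ? "present" : "empty";
    }

    // Optional.of rejects null outright.
    static boolean ofNullThrows() {
        try {
            Optional.of(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // A mapping function that returns null yields an empty Optional.
    static boolean mapToNullIsEmpty() {
        Optional<String> mapped = Optional.of("x").map(s -> (String) null);
        return !mapped.isPresent();
    }

    public static void main(String[] args) {
        System.out.println(describe(null));
        System.out.println(describe(Optional.ofNullable(null)));
        System.out.println(describe(Optional.of("id-1")));
    }
}
```

The API documentation advises against relying on == for the empty instance, so equals or isPresent is the safer check.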
Wasn't Java supposed to get support for Value types some time ago?
Nullable<T> itself is not exactly the same as Option<T> since it does not cover types that already allow nulls. It adds nulls rather than removing them.
Many people roll their own Option<T> or Result<T,E> type, since it's easy enough to start, and it's usually a class type.
1) https://learn.microsoft.com/en-us/dotnet/api/system.nullable...
2) https://github.com/mcintyre321/OneOf/blob/master/OneOf/OneOf...
https://sharplab.io/#v2:EYLgtghglgdgNAFxBAzmAPgAgLACgACATAMx...
sum += things[i] ?? 0;
e.g. Option<NonZeroU64> is effectively encoded and operated on as u64, but it gives the type system a way to make sure you correctly handle the case where "0" means something special for you
NonMinI32 could also be interesting as a symmetrical number type, representing [−2³¹ + 1, 2³¹ − 1] and leaving the bit pattern 0x80000000 for niche optimisations.
But back then we didn't have const generics, and it was time-wise too far off to wait for const generics in any form (including unstable rustc-internal-only usage of it).
So now that we have const generics, if someone sits down, discusses the technical details on Zulip, then writes an RFC and then writes an implementation, we theoretically could have it soon.
Though I'm not sure how easy/hard the implementation part would be.
Some problems to discuss for standardization would be:
- is there any in-progress work, overlapping RFC etc. (Idk, there should be older in-progress work, but someone might be working on it right now.) There could also be work on more generic niche-handling code which would happen to also cover this.
- should multiple niches be handled and if so how with which limitations (there are no variadic generics, and ways to emulate them, like through type nesting, likely wouldn't have perf and complexity problems)
- can it be useful outside of optimizations to have e.g. a range-limited integer
- if the gap is big enough (i.e. u32 limited to a hypothetical u24), should it interact with packed representation
- is there any risk of it being confusing/unexpected (should not be the case, but still needs to be evaluated)
EDIT: There seem to be the following unstable attributes:
#[rustc_nonnull_optimization_guaranteed] #[rustc_layout_scalar_valid_range_start(...)] #[rustc_layout_scalar_valid_range_end(..)]
Today you can do this in nightly Rust, using a deliberately permanently unstable attribute, that's what my nook crate does to produce e.g. BalancedI8 which is a signed byte from -127 to 127. It will be nice when some day Pattern Types, or an equivalent are stabilized.
“Our intention was to provide a limited mechanism for library method return types where there needed to be a clear way to represent "no result", and using null for such was overwhelmingly likely to cause errors.
For example, you probably should never use it for something that returns an array of results, or a list of results; instead return an empty array or list. You should almost never use it as a field of something or a method parameter.
I think routinely using it as a return value for getters would definitely be over-use.”
This also applies to null/Maybe as well: both would violate the principle of least surprise (e.g. the AWS DynamoDB SDK has queries return an 'Array<Item>'; but this is 'null' if there are no matches!). It also complicates the domain model, making two distinct forms of empty value ('None' versus 'Some(List())'; or '[]' versus 'null'), which may not have any semantic difference.
> You should almost never use it as a field of something
I agree, although it's often preferable to expose methods rather than fields anyway; in which case it's a return value, which seems OK.
> or a method parameter
Sure, that's what polymorphism/overloading is good for, e.g. instead of `foo(int arg1, Optional<String> arg2)` we can have separate `foo(int arg1, String arg2)` and `foo(int arg1)` definitions (where the latter will probably call the former with some default).
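A minimal sketch of the overload approach described here. The method bodies and the default value are invented for illustration; only the two `foo` signatures come from the comment.

```java
class Overloads {
    static final String DEFAULT_LABEL = "n/a"; // hypothetical default

    // Full form: the caller supplies both arguments.
    static String foo(int arg1, String arg2) {
        return arg1 + ":" + arg2;
    }

    // Short form: delegates to the full form with a default instead of taking Optional<String>.
    static String foo(int arg1) {
        return foo(arg1, DEFAULT_LABEL);
    }

    public static void main(String[] args) {
        System.out.println(foo(1, "x"));
        System.out.println(foo(1));
    }
}
```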
> I think routinely using it as a return value for getters would definitely be over-use
I agree, since that would indicate our model is too weak, and missing some domain-relevant information. For example, if many of our 'Order' methods return optional results, there's probably a finer-grained distinction to be made, like 'PendingOrder', 'FulfilledOrder', etc. which don't need the optional qualifiers.
(Personally I try to avoid the term "getter": APIs should make sense without reference to their underlying implementation; whether that happens to be "getting" a field, or calling out to some other methods/objects, etc. That's the point of encapsulation :) )
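The 'PendingOrder' / 'FulfilledOrder' split suggested above might be sketched like this. The fields and the describe helper are assumptions made up for the example; the point is only that the shipping date exists on the type where it is always present, so no Optional return value is needed.

```java
import java.time.LocalDate;

class OrderModel {
    sealed interface Order permits PendingOrder, FulfilledOrder {}

    // A pending order has no shipping date at all, rather than an optional one.
    record PendingOrder(String id) implements Order {}

    // A fulfilled order always has one.
    record FulfilledOrder(String id, LocalDate shippedOn) implements Order {}

    static String describe(Order order) {
        if (order instanceof FulfilledOrder f) {
            return f.id() + " shipped on " + f.shippedOn();
        }
        if (order instanceof PendingOrder p) {
            return p.id() + " is pending";
        }
        throw new IllegalStateException("unreachable: Order is sealed");
    }

    public static void main(String[] args) {
        System.out.println(describe(new PendingOrder("A1")));
        System.out.println(describe(new FulfilledOrder("A2", LocalDate.of(2023, 3, 14))));
    }
}
```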
Say how a general-purpose Maybe type in Java would be different than Optional.
"We made a vehicle with an engine and four wheels but never intended it to be a car."
Optional<T> a = null;
Optional<T> b = Optional.empty();
In other words, Optional would have to be a non-nullable type. Which of course means that, to have it be a reference type, Java would have to support non-nullable reference types. But if Java did support those, you wouldn’t really need Optional in the first place, because then the current nullable types would fulfill that purpose.
https://cr.openjdk.org/~briangoetz/valhalla/sov/02-object-mo...
Some languages like TS, PHP or Kotlin have proper unions that you just handle with branching.
Rust lets you pattern match against a construct that holds a value or doesn’t. Option is an actual thing there that you need to unpack in order to look inside.
In Clojure nils are everywhere. They tell you that “you’re done” in an eager recursion, or that a map doesn’t have something etc. Many functions return something or nil, and depending on what you’re doing you care about the value vs the logical implication.
nils flow naturally through your program and it’s not something you are worried about, as many functions do nil punning. Well as long as you don’t directly deal with Java - then you have to be more careful.
foo?.Bar(baz);
and
var result = foo?.Bar(baz);
both do what's expected: skip the method call and return null (or void) if the receiver is null, and the compiler complains if you don't do that when foo is inferred to be nullable.
The function
fn get_optional_non_zero(n: u64) -> Option<NonZeroU64> {
let i = n & 0xFF;
if i == MAGIC { None } else { NonZeroU64::new(i) }
}
Actually returns None for n = 0 or when n is any multiple of 256.
The resulting usage in the sum still yields the same result, because skipping zeros in an addition doesn't matter, but it is a subtle difference between this get-function and all of the others. It also doubles the number of None cases the code needs to handle.
Either way, glad to see that Rust is doing a good job eliminating the overhead. I'm not sure if arithmetic is the right kind of benchmark here, but it'd probably be difficult to measure the performance overhead across "real" codebases, so focusing on a tight loop microbenchmark is probably fine.
Discussion at publication time: https://news.ycombinator.com/item?id=28887908
This article resonates with me alongside the recent articles of Casey Muratori about non-pessimistic code:
Within the realm of the module, don't use pessimistic code (avoid boxing) _but_ that doesn't prevent you from providing a safe API. E.g. the result of the loop could be wrapped if that made sense.