0.1 + 0.2 != 0.3
You can check it in the JavaScript console.

This actually makes me wonder if anyone's ever attempted a floating-point representation that builds in an error range and correctly propagates/amplifies error over operations.
E.g. a simple operation like "1 / 10" (to generate 0.1) would be stored not as a single floating-point value, but really as the range between the closest representation greater than and less than it. The same with "2 / 10", and then when asking if 0.1 + 0.2 == 0.3, it would find an overlap in ranges between the left-hand and right-hand sides and return true. Every floating-point operation would then take and return these ranges.
Then floating point arithmetic could be used to actually reliably test equality without ever generating false negatives. And if you examined the result of calculation of 10,000 operations, you'd also be able to get a sense of how off it might maximally be.
I've searched online and can't find anything like it, though maybe I'm missing an important keyword.
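What's being described here is essentially interval arithmetic. A minimal sketch in Python (the class and method names are my own, and `math.nextafter` needs Python 3.9+):

```python
import math

class Interval:
    """Minimal interval-arithmetic sketch: a value is stored as a
    [lo, hi] range that brackets the true real number."""

    def __init__(self, lo, hi):
        self.lo = lo
        self.hi = hi

    @classmethod
    def from_float(cls, x):
        # Widen to the neighboring representable floats so the
        # intended decimal value (e.g. 0.1) lies inside the range.
        return cls(math.nextafter(x, -math.inf), math.nextafter(x, math.inf))

    def __add__(self, other):
        # Round the endpoints outward so error only ever widens.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def overlaps(self, other):
        # "Equal to within the accuracy of this system."
        return self.lo <= other.hi and other.lo <= self.hi

a = Interval.from_float(0.1)
b = Interval.from_float(0.2)
c = Interval.from_float(0.3)
print((a + b).overlaps(c))  # True: the ranges intersect
```

Repeated operations keep widening the ranges, which is exactly the false-positive problem discussed below.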
Yes, that turns out to be exactly it [1]. Looks like there's even at least one JavaScript library for it [2].
It seems like such a useful and intuitive idea I have to wonder why it isn't a primitive in any of the common programming languages.
The problem is that the user wants to write 1/10 and 2/10 and 3/10, but those numbers aren't really in the binary system.
The user gets some numbers (let's call them A, B and C) that aren't the same, but they fool people at first because they not only deserialize from 0.1 but also serialize back to 0.1. Trouble is that A + B != C; the sum is some other number.
Excel tries to hide it, but the real answer is to keep the exponent in base 10 if you plan to read and write numbers like 137.036 or 9.1E-31. What base the mantissa is in doesn't matter, it could be base 7 for all I care -- it is just an integer.
Interval math is for much tougher problems. For example, the recursion
k * x * (1-x)
is easily proven to have periodic orbits of infinitely long period, but if you are using 32-bit floats you can't have a period longer than 4 billion. That kind of qualitative difference means that there's no scientific value in iterating that function with floats, although you can do accurate grid samples with interval arithmetic.

The flip side is that you generate plenty of false positives once your error ranges get large enough. This happens pretty readily if you e.g. perform iterations that are supposed to keep the numbers at roughly the same scale:
x---0.1 + 0.2 ---x
x---0.3---x
That is, the range of 0.1 + 0.2 would be wider than the range of 0.3. And now what do you do? There is overlap, so are they equal? But there are parts that don't overlap, so are they different?

For me, floating-point equality would hold if any parts overlap. Basically "=" would mean "to the extent of the floating-point accuracy of this system, these values could be equal".
If you're doing a reasonably limited number of operations with values reasonably larger than the error range, then it would meet a lot of purposes -- you can add 0.5 somewhere in your code, subtract 0.5 elsewhere, and still rely on the value being equal to the original.
The equality test for floating-point numbers is comparison against an epsilon.
Math.abs(0.3 - (0.1 + 0.2)) < Number.EPSILON
Which is the same in other languages.

Using the epsilon for comparison is not mentioned in the article. Floating-point absorption is also not mentioned in the article.
This entire discussion and the fact this is on the front page of HN is pretty disappointing and sad.
Is this really a surprise for you? if it is... have you ever implemented any logic involving currency? You may want to take another look at it.
Math.abs(1.8 - (0.1 + 0.2 + 0.9 + 0.6)) < Number.EPSILON
returns false.

Also, you generally really shouldn't be implementing any currency logic using floating point numbers, yikes. Stick to integers that represent the value in cents, or tenths of cents, or similar. Or, even better, a DECIMAL data type if your platform supports it.
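The integer-cents approach is simple to sketch in Python (the helper name is mine; a real implementation would also handle signs and validation):

```python
# Sketch of currency arithmetic with integer cents instead of floats.
# Amounts are plain ints, so addition is exact.

def dollars_to_cents(s: str) -> int:
    """Parse a non-negative decimal string like '0.10' into integer cents."""
    dollars, _, frac = s.partition('.')
    return int(dollars) * 100 + int(frac.ljust(2, '0')[:2])

total = (dollars_to_cents('0.10') + dollars_to_cents('0.20')
         + dollars_to_cents('0.90') + dollars_to_cents('0.60'))
print(total)                              # 180 cents
print(total == dollars_to_cents('1.80'))  # True, unlike the float version
```

The float version of the same sum fails the epsilon test above precisely because the error of four additions can exceed one epsilon.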
I genuinely hope you've never written financial software that judges if the results of two calculations are equal via the method you've described.
Floating point arithmetic is good enough for science, should be good enough for commerce too, no? Why is commerce special?
0.30000000000000004 - https://news.ycombinator.com/item?id=21686264 - Dec 2019 (402 comments)
0.30000000000000004 - https://news.ycombinator.com/item?id=14018450 - April 2017 (130 comments)
0.30000000000000004 - https://news.ycombinator.com/item?id=10558871 - Nov 2015 (240 comments)
0.30000000000000004 - https://news.ycombinator.com/item?id=1846926 - Oct 2010 (128 comments)
Resisting temptation to list floating-point math threads because there are so many:
https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
This is the TXR Lisp interactive listener of TXR 256.
Quit with :quit or Ctrl-D on an empty line. Ctrl-X ? for cheatsheet.
TXR works even if the application surface is not free of dirt and grease.
1> (+ 0.1 0.2)
0.3
OK, so then: 2> (set *print-flo-precision* 17)
17
3> (+ 0.1 0.2)
0.30000000000000004
But: 4> 0.1
0.10000000000000001
5> 0.2
0.20000000000000001
6> 0.3
0.29999999999999999
I.e. 0.1 isn't exactly 0.1 and 0.2 isn't exactly 0.2 in the first place! The misleading action is to compare the input notation of 0.1 and 0.2 to the printed output of the sum, rather than consistently compare nothing but values printed using the same precision.

The IEEE double format can store 15 decimal digits of precision such that all those decimal digits are recoverable. If we print values to no more than 15 digits, then things look "artificially clean" for situations like (+ 0.1 0.2).
I made *print-flo-precision* have an initial value of 15 for this reason.
The 64 bit double gives us 0.1, 0.2 and 0.3 to 15 digits of precision. If we round at that many digits, we don't see the trailing junk of representational error.
Unfortunately, to 15 digits of precision, the data type gives us two different 0.3's: the 0.299999... one and the 0.3.....04 one. Thus:
7> (= (+ 0.1 0.2) 0.3)
nil
That's the real kicker; not so much the printing. This representational issue bites you regardless of what precision you print with, and is the reason why there are situations in which you cannot compare floating-point values exactly.

I think the problem is the act of caring about the least significant bits.
If you care about the least significant bits of a floating-point number, it means you are doing something wrong. FP numbers should be treated as approximations.
More specifically, the problem above is assuming that floating point addition is associative to the point of giving you results that you can compare. In floating point order of operations matters for the least significant bits.
FP operations should be treated as incurring inherent error on each operation.
IEEE standard is there to make it easier to do repeatable calculations (for example be able to find regression in your code, compare against another implementation) and for you to be able to reason about the magnitude of the error.
Pencil-and-paper floating-point numbers like 1.23 x 10^5 may be approximations of measurements (if we are doing science or engineering), but as notations they are inherently exact. Calculators bear that out, because calculators use base-10 floating point, like pencil-and-paper calculations.
0.3 being inexact is only an artifact of the floating-point system being in a different base. No matter how many digits we throw at it, we cannot represent 0.3 in binary floating point. Not 64 bits, not 1024 bits, not 65535 bits.
If we use binary notation for floating-point numbers, they likewise become exact, in terms of representation. The inexactness we deal with then is the familiar type that we know from pencil-and-paper calculations: truncation to a certain number of digits after performing an operation like addition or multiplication.
But that truncation will not happen in a calculation in which both input operands are exactly represented, and the result is also exactly representable!!!
If base ten were used, 0.1 + 0.2 would be 0.3, exactly.
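Python's stdlib decimal module demonstrates the base-10 case directly: with a decimal significand, 0.1, 0.2 and 0.3 are all exactly representable, so the addition incurs no rounding at all.

```python
from decimal import Decimal

# Base-10 floating point: the addition is exact.
print(Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))  # True

# The binary problem reappears if you construct the Decimal from a
# float, which captures the binary representation error exactly:
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625
```

Note the construction from strings: `Decimal(0.1)` inherits the error already baked into the binary double.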
If we use power-of-two values, and combinations thereof, we don't have the problem:
1> (= 0.625 (+ 0.125 0.5))
t
No problem. 2> (set *print-flo-precision* 17)
17
3> 0.625
0.625
4> 0.5
0.5
5> 0.125
0.125
6> (+ 0.125 0.5)
0.625
No junk digits. 7> (= 0.25 (sqrt 0.0625))
t
Wee ...

From what I’ve seen with most recent grads, the education is shifting more and more towards algorithms, with experience mostly involving the use of existing libraries/frameworks, rather than the lower-level implementations that us “old timers” were forced to write ourselves, thanks to lack of accessibility to freely usable code. I think GitHub, StackOverflow, and Google have changed the mental model of software development, significantly. I don’t think that’s a bad thing at all since it should free up some beans, especially for someone new to the field.
Not knowing this will bite you eventually, but it’s fairly trivial to work out.
"SELECT .1 + .2;" does return 0.3
However,
CREATE TABLE t1 (f FLOAT);
INSERT INTO t1 VALUES(0.1),(0.2);
SELECT SUM(f) FROM t1;
-- returns 0.30000000447034836
Which feels odd to me.

https://en.wikipedia.org/wiki/Numerical_tower
One change I'd consider making to Scheme, and to most high-level general-purpose languages (that aren't specialized for number-crunching or systems programming), is to have the reader default to reading numeric literals as exact.
For example, the current behavior in Racket and Guile:
Welcome to Racket v7.3.
> (+ 0.1 0.2)
0.30000000000000004
> (+ #e0.1 #e0.2)
3/10
> (exact->inexact (+ #e0.1 #e0.2))
0.3
So, I'd lean towards getting the `#e` behavior without needing the `#e` in the source.

By default, that would give the programmer in this high-level language the expected behavior.
And systems programmers, people writing number-crunching code, would be able to add annotations when they want an imprecise float or an overflowable int.
(I'd also default to displaying exact fractional rational numbers using familiar decimal point conventions, not the fractional form in the example above.)
As an example of a Scheme-ish `#lang`, here's a Racket `#lang sicp` that I made to mimic MIT Scheme, as well as add a few things needed for SICP: https://github.com/sicp-lang/sicp/blob/master/sicp/main.rkt
It would be even easier to make a `#lang better-scheme`, by defining just a few changes relative to `racket-base`, such as how numbers are read.
For HN, I'd like to point out that it was a historical accident that Java looked like it did, as far as the Web was concerned.
IIRC, Java looked like it did to appeal to technical and shrinkwrap developers, who were using C++ or C. (When I was lucky to first see Java, then called Oak, they said it was for embedded systems development for TV set-top boxes. I didn't see Java applets until a little later.)
But the Web at the time was intended to be democratizing/inclusive (like BASIC, HyperCard, and Python). And the majority of the professional side was closer to what used to be called "MIS" development (such as 4GLs, but not C/C++). And in practice, HTML-generating application backends at the time were mostly written in languages other than C/C++.
I'm sympathetic to the rebranding of the glue language for Java applets (and for small bits of dynamic behavior), to be named like, and look like, Java. That made sense at the time, when we thought Java was going to be big for Web frontend (and I liked the HotJava story for a thin-client browser extended on-demand with multimedia content handlers). And before the browser changed from hypertext navigator to GUI toolkit.
But it's funny that we're all using C-descendant syntax only through a series of historical accidents, when that wasn't even what the programmers at punctuated points in its adoption actually used (we only thought it would be, at the time the decisions were made).
For example: just treat numbers as strings and write code that adds the digits one by one and does the right carries
Now that I think about it, is this the whole point of the Java BigDecimal class?
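The digit-by-digit idea can be sketched for non-negative decimal numerals (this is the spirit of arbitrary-precision decimal types like Java's BigDecimal, though they use a far more compact representation):

```python
def add_decimal_strings(a: str, b: str) -> str:
    """Add two non-negative decimal numerals digit by digit with carries.
    Sketch only: no signs, no exponents, no input validation."""
    ai, _, af = a.partition('.')
    bi, _, bf = b.partition('.')
    frac = max(len(af), len(bf))      # align fractional digits
    whole = max(len(ai), len(bi))     # align integer digits
    x = ai.rjust(whole, '0') + af.ljust(frac, '0')
    y = bi.rjust(whole, '0') + bf.ljust(frac, '0')
    out, carry = [], 0
    for da, db in zip(reversed(x), reversed(y)):   # rightmost digit first
        carry, d = divmod(int(da) + int(db) + carry, 10)
        out.append(str(d))
    if carry:
        out.append('1')
    s = ''.join(reversed(out))
    return (s[:-frac] + '.' + s[-frac:]) if frac else s

print(add_decimal_strings('0.1', '0.2'))   # 0.3
print(add_decimal_strings('9.95', '0.05')) # 10.00
```

Since every digit stays decimal, there is no base conversion and therefore no representation error, only (slow) string manipulation.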
> all.equal(0.1+0.2,0.3)
[1] TRUE
and functions for actual equality, e.g.

> identical(0.1+0.2,0.3)
[1] FALSE

and see me in the morning.
Coq < Compute 0.1.
Toplevel input, characters 8-11:
> Compute 0.1.
> ^^^
Warning: The constant 0.1 is not a binary64 floating-point value. A closest
value 0x1.999999999999ap-4 will be used and unambiguously printed
0.10000000000000001. [inexact-float,parsing]
= 0.10000000000000001
: float

With proper rounding and I/O these are not generally an issue.
Specifically I'm thinking about Python: a literal like x.x should mean Decimal, and float should have to be imported, to be used as an optimization if you need it.
Complexity-wise that actually seems to give an equally simple "shortest answer" method: nextafter up and down, then using text processing find the first digit that changes, see if it can be zero; if not, choose the lowest value it can be, incremented by one, remove the rest of the string accordingly, and right-trim any 0s from the result.
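A brute-force variant of that "shortest answer" search is easy to sketch in Python: instead of the nextafter bracketing described above, just test increasing precisions until the string round-trips (in practice `repr` already implements a proper shortest-round-trip algorithm):

```python
def shortest_roundtrip(x: float) -> str:
    """Return the decimal string with the fewest significant digits
    that still parses back to exactly x. Sketch of the idea only;
    repr() does this properly (and faster)."""
    for p in range(1, 18):            # 17 significant digits always suffice
        s = f'{x:.{p}g}'
        if float(s) == x:
            return s
    return f'{x:.17g}'

print(shortest_roundtrip(0.1))        # 0.1
print(shortest_roundtrip(0.1 + 0.2))  # 0.30000000000000004
```

Any string this returns is guaranteed to select the same double when parsed back, which is the round-trip property the comment is after.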
>>> 0.1 + 0.2
0.30000000000000004
That's the expected behavior of floating-point numbers, more specifically, IEEE 754.

If you don't want this to happen, use fixed-point numbers, if they're supported by your language, or integers with a shifted decimal point.
Personally, I think if you don't know this, it's not safe for you to write computer programs professionally, because this can have real consequences when dealing with currency.
http://pages.cs.wisc.edu/~david/courses/cs552/S12/handouts/g...
What controls this rounding?
e.g., in an interactive python prompt i get:
>>> b = 0.299999999999999988897769753748434595763683319091796875
>>> b
0.3

[1] See my past comment for the overview: https://news.ycombinator.com/item?id=26054079
>>> f'{b:.54f}'
0.299999999999999988897769753748434595763683319091796875
>>> f'{b:.16g}'
0.3
>>> f'{b:.17g}'
0.29999999999999999

> 1.1 + 2.2
3.3
> 1.1 + 2.2 == 3.3
True
EDIT: to be clear: this is not because Raku is magic, it's because Raku defaults to a rational number type for decimal literals, which is arguably a much better choice for a language like Raku.

* (= (+ 0.1 0.2) 0.3)
T
In Common Lisp, there is a small epsilon used in floating-point equality: single-float-epsilon. When two numbers are within that delta, they are considered equal.

Meanwhile, in Rakudo, 0.1 is a Rat: a rational number where the numerator and denominator are computed.
You can actually get the same underlying behavior in Common Lisp:
(= (+ 1/10 2/10) 3/10)
Sadly, not many recent languages have defaults as nice as those. Another example is Julia:

julia> 1//10 + 2//10 == 3//10
true
IMO, numerical computations should be correct by default, and fast on an opt-in basis.

Edit: it seems that Raku uses rationals as a default [1], so it doesn't suffer from the same problem by default.
Yeah, exactly, Raku defaults to a rational number type for these kinds of numbers. I honestly think that is a perfectly fine way to do it, you're not using Raku for high performance stuff anyway. It's not so different from how Python will start to use arbitrarily sized integers if it feels it needs to.
Raku by default will convert it to a float if the denominator gets larger than a 64-bit int, but there's actually a current pull request active that lets you customize that behavior to always keep it as a Rat.
Really interesting language, Raku!
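Python's stdlib fractions module gives the same exact-rational behavior as Raku's Rat or Lisp/Julia ratios, though only when you opt in:

```python
from fractions import Fraction

# Exact rational arithmetic: construct from strings (or from an
# integer numerator/denominator pair), not from floats.
print(Fraction('0.1') + Fraction('0.2') == Fraction('0.3'))  # True
print(Fraction('0.1') + Fraction('0.2'))                     # 3/10

# Constructing from a float instead captures the binary error;
# limit_denominator recovers the intended rational.
print(Fraction(0.1) == Fraction(1, 10))          # False
print(Fraction(0.1).limit_denominator(10))       # 1/10
```

As with Raku's Rat, the trade-off is performance: numerators and denominators grow under repeated arithmetic.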
0.1e0 + 0.2e0
yields 0.30000000000000004. Your example also fails:

1.1e0 + 2.2e0 == 3.3e0

returns false.

1.99999999.... == 2.0
There are limits to computer representation of floating point numbers. Computers are finite state, floating point numbers are not.
sigh
No, floating point numbers are finite state. That’s the whole point behind this discussion. There are only so many possible floating point numbers representable in so many bits.
I never understand this confusion - you have finite memory - with this you can only represent a finite set of real numbers. So of course all the real numbers can’t be mapped directly.
This confusion is also helped along by the fact that the input and output of such numbers is generally still done in decimal, often rounded; that both decimal and binary can exactly represent the integers with a finite number of digits; and that the set of numbers exactly representable with a finite decimal expansion is a superset of those exactly representable with a finite binary expansion (since 2 is a factor of 10).
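Because floats are finite state, every finite double is an exact dyadic rational p / 2^q, and Python's Fraction can recover it without any decimal rounding:

```python
from fractions import Fraction

# The exact rational that the double written as "0.1" actually stores:
print(Fraction(0.1))   # 3602879701896397/36028797018963968
print((0.1).hex())     # 0x1.999999999999ap-4
```

The hex form matches the value the Coq warning above prints: the significand 0x1.999999999999a times 2^-4, which is close to, but not equal to, one tenth.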
bc <<< "0.1 + 0.2"
.3
bc <<< "1.0E4096 +1 -1.0E4096"
1.000000
node -e "console.log(1.0E128 +1 -1.0E128)"
0
python -c "print(1.0E128 +1 -1.0E128)"
0.0
The focus should be on _rational_ numbers. This particular example is all about representation error - precision is implicated, but not the cause.
Ignore precision for a second: The inputs 0.1 and 0.2 are intended to be _rational_. This means they can be accurately represented finitely (unlike an irrational number like PI). Now when using fractions they can _always_ be accurately represented finitely in any base:
1/10=
base 10: 1/10
base 2: 1/1010
2/10=
base 10: 2/10
base 2: 10/1010
The neat thing about rationals is that, under the four basic arithmetic operations, two rational inputs always produce one rational output :) this is relevant: 1/10 and 2/10 are both rationals, so there is no fundamental reason that addition cannot produce 3/10. When using a format that has no representation error (i.e. fractions), the output will be rational for all rational inputs (given enough precision, which is not a realistic issue in this case). When we add these particular numbers in our heads, however, almost everyone uses decimals (base-10 floating point), and in this particular case that doesn't cause a problem, but what about 1/3?

This is the key: rationals cannot always be represented finitely in floating-point formats, but this is merely an artifact of the format and the base. Different bases have different capabilities:
1/10=
base 10: 0.1
base 2: 0.00011001100110011r
2/10=
base 10: 0.2
base 2: 0.00110011001100110r
1/3=
base 10: 0.33333333333333333r
base 2: 0.01010101010101010r
IEEE754 format is a bit more complicated than above, but this is sufficient to make the point.

If you can grok that key point (representation error), here's the real understanding of this problem:
Deception 1: The parser has to convert '0.1' decimal into base 2, which will cause the periodic significand '1001100110011...' (not accurately stored at any precision)... yet when you ask for it back, the formatter magically converts it to '0.1'. Why? Because the parser and formatter have symmetrical error :) This is kinda deceptive, because it makes it look like storage is accurate if you don't know what's going on under the hood.
Deception 2: Many combinations of arithmetic on simple rational decimal inputs also have rational outputs from the formatter, which furthers the illusion. For example, neither 0.1 nor 0.3 is representable in base 2, yet 0.1 + 0.3 will be formatted to '0.4'. Why? It just happens that the arithmetic on those inaccurate representations added up to the same error that the parser produces when parsing '0.4', and since the parser and formatter produce symmetric error, the output is a rational decimal.
Deception 3: Most of us grew up with calculators, or even software calculator programs. All of these usually round display values to 10 significant decimals by default, which is quite a bit less than the max decimal output of a double. This always conceals any small representation errors output by the formatter after arithmetic on rational decimal inputs - which makes calculators look infallible when doing simple math.
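The parse/format symmetry of Deception 1 (and the error cancellation of Deception 2) is easy to observe in Python, whose repr picks the shortest decimal string that maps back to the same double:

```python
x = 0.1
print(repr(x))               # 0.1  (shortest string that round-trips)
print(f'{x:.25f}')           # 0.1000000000000000055511151  (stored value)
print(float(repr(x)) == x)   # True: parser and formatter are symmetric

# Deception 2: errors can cancel so the result formats "cleanly",
# even though none of 0.1, 0.3, 0.4 is exactly representable.
print(0.1 + 0.3)             # 0.4
```

Printing with enough digits, as in the `.25f` format above, is what breaks the illusion: it exposes the stored binary value rather than the round-tripped decimal.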
Edit: this is not a blanket statement. It was meant in the context.
Floating point is extremely useful. Too bad so many people have no idea how and when to use it. Including some people that design programming languages.
Please, tell me, mister, how would you perform complex numerical calculations efficiently?
I guess we should just forget about drones and a bunch of other stuff because 90% of developers have no clue how to use FP?
var n = get_n() // valid to .5g
n = transform(n)
...
<input value={n.toFixed(5)}> // 5 carried over here
You can’t even infer the precision from a FP number alone, especially if it is close to log10(53). /edit

In a proper-numbers lang, if someone needed FP numbers, they could just write 0.1f. Otherwise 0.1 would mean just that, and counting by 0.1+rand(100) from 1000000 down to 0 would not make you scratch your head at the end of the loop, worrying whether the rest is just a FP error or an algorithmic error which must be fixed.
90% of developers who know how to use FP still hate it in non-FP tasks, because there is no 0.1nobs literal, how about that.
If your calculation turned out to be incorrect it doesn't matter if it's efficient. Correct FP calculation requires error analysis, which is a concrete definition of "how to use it". If you mostly use packaged routines like LAPACK, then you don't exactly need FP; you need routines that internally use FP.