We should be a bit careful with what one means by "modulo arithmetic" here. If we are talking about arithmetic in Z/nZ (read "integers modulo n"), then the objects being acted upon are no longer integers, but equivalence classes of integers. That is, the set of all integers that satisfy the equivalence relation "~" where "a ~ b" means a - b = k*n for some integer k. For example, one of the equivalence classes in Z/3Z would be the set {..., -4, -1, 2, 5, ...}.
Since every element of the set is equivalent under that relation, we typically will choose a representative of that equivalence class, e.g., [2] for the previous class. If we view the "mod" or "modulo" operator to be a mapping from the integers to a particular representative of its equivalence class, there's no reason that negative values should be excluded. [-1] refers to the same equivalence class as [2], [32], and [-7]. The details of how integers are mapped to a chosen representative seem to vary from language to language, but modular arithmetic works all the same between them.
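To make the equivalence-class picture concrete, here is a small Python sketch (the helper name `congruent` is my own) checking that the example class from Z/3Z really is one class, and that Python's % maps every member to the same representative:

```python
# All of -4, -1, 2, 5 lie in the same equivalence class of Z/3Z:
# a ~ b iff (a - b) is a multiple of 3.
def congruent(a, b, n):
    return (a - b) % n == 0

members = [-4, -1, 2, 5]
assert all(congruent(a, b, 3) for a in members for b in members)

# Python's % maps every member to the same canonical representative:
assert {a % 3 for a in members} == {2}
```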
C generally does return negative numbers when you use the % operator.
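For comparison from Python, C's truncating convention (remainder takes the sign of the dividend) can be reproduced with `math.fmod`, which follows the same rule. A sketch, exact only for integers small enough to be represented as floats; `c_mod` is my own name:

```python
import math

# C's % truncates toward zero, so the remainder takes the sign of the
# dividend. math.fmod uses the same convention.
def c_mod(a, b):
    return int(math.fmod(a, b))

assert c_mod(-10, 3) == -1   # C: -10 % 3 == -1
assert c_mod(10, -3) == 1    # C: 10 % -3 == 1
assert (-10) % 3 == 2        # Python floors instead
```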
You have to memorize the full definition. There's no mnemonic shortcut.
More precisely, regarding E1 << E2, it is written (in the April 2023 ISO C draft):
"If E1 has a signed type and nonnegative value, and E1 × 2ᴱ² is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined."
Thus if E1 is negative, or if the result overflows, UB.
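The rule can be restated as a predicate. A sketch in Python (the helper name and the default width are my own, not from the standard); it also rejects shift counts that are negative or at least the type width, which the standard makes undefined separately:

```python
# Is E1 << E2 defined for a signed type of the given bit width?
# Defined only when E1 is nonnegative, E2 is in range, and
# E1 * 2**E2 still fits in the result type.
def shift_is_defined(e1, e2, bits=32):
    if e1 < 0 or not (0 <= e2 < bits):
        return False
    return e1 * 2**e2 <= 2**(bits - 1) - 1

assert shift_is_defined(1, 30)        # 1 << 30 fits in int32
assert not shift_is_defined(1, 31)    # exceeds INT32_MAX -> UB
assert not shift_is_defined(-1, 1)    # negative E1 -> UB
```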
i haven't read the rationale, but presumably the committee did this because it's what virtually all cpus do (because it's what fortran does), so it's the only thing that can be implemented efficiently on virtually any cpu with a division instruction. consequently virtually all c implementations did it, and standardizing the behavior of virtually all existing implementations is better than leaving it implementation-defined, or than standardizing a behavior that conflicts with virtually all existing implementations and can't be implemented efficiently on most high-end hardware.
Too bad they made the wrong decision in not supporting the expected floating-point behavior of division by zero (Python throws an error rather than returning inf or nan).
Division by the two almost-zero numbers will also give "unexpected" results, if inf is unexpected.
people commonly opt out of python's behavior in this case by using numpy
python's decision is maybe suboptimal for efficient compilation, but it has a lot of decisions like that
edit: downthread https://news.ycombinator.com/item?id=39540416 pclmulqdq clarifies that in fact trapping on division by zero is not only ieee-754-compliant but (unlike flooring integer division!) efficiently implementable on common high-performance architectures
What hardware does with these exceptions is a separate question, though. Some CPUs will swallow them for performance.
No love for Euclidean division?
Granted, multiple value calls take getting used to. I think I like it, but I emphatically did not like it initially.
Basically, it seems as soon as you introduce a floating point number, it stays there. Which is roughly what I would expect.
In the floating-point case, you have to choose between negative remainders or potentially inexact results. And you definitely want integer division to work the same as float division.
>>> 1 / 0.1
10.0
>>> 1 // 0.1
9.0
This is because 0.1 is in actuality the floating-point value 0.1000000000000000055511151231257827021181583404541015625, and thus 1 divided by it is ever so slightly smaller than 10. Nevertheless, fpround(1 / fpround(1 / 10)) = 10 exactly.

I found out about this recently because in Polars I defined a // b for floats to be (a / b).floor(), which does return 10 for this computation. Since Python's correctly-rounded division is rather expensive, I chose to stick to this (more context: https://github.com/pola-rs/polars/issues/14596#issuecomment-...).
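A quick check of both claims from the transcript above, contrasting Python's correctly-rounded // with the floor-of-the-rounded-quotient approach:

```python
import math

# 0.1 cannot be represented exactly; the nearest double is slightly
# above 1/10, so the true quotient 1 / 0.1 is slightly below 10.
assert 1 / 0.1 == 10.0            # rounding brings it back to exactly 10
assert 1 // 0.1 == 9.0            # floor of the *true* quotient is 9
assert math.floor(1 / 0.1) == 10  # (a / b).floor() style gives 10
```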
If you use it without considering rounding modes carefully and then round to integer, you can get some funny results.
That said, I don't really see why you would necessarily want float and integer division to behave the same. They're completely different types used for completely different things. Pick the one that is appropriate for your use case.
(It seems like abs(a % b) <= abs(b / 2) might be the right choice for floats which is pretty clearly not what you want for integers. I also just learned that integer division // can be applied to floats in Python, but the result is not an integer, for some reason?)
i also didn't know python supported // on floats
def div_floor(a, b):
    # rounds toward -inf (Python's native //)
    return a // b

def div_ceil(a, b):
    # rounds toward +inf (assumes b > 0)
    return (a + b - 1) // b

def div_trunc(a, b):
    # rounds toward zero (C-style)
    return a // b if (a < 0) == (b < 0) else -(-a // b)

def div_round(a, b):
    # rounds to nearest, halves toward +inf (assumes b > 0)
    return (2*a + b) // (2*b)

Why Python's Integer Division Floors - https://news.ycombinator.com/item?id=1630394 - Aug 2010 (2 comments)
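The four helpers can be checked against the exact quotient using fractions; restated here so the snippet runs on its own (the ceil and round forms assume a positive divisor):

```python
import math
from fractions import Fraction

def div_floor(a, b): return a // b
def div_ceil(a, b): return (a + b - 1) // b
def div_trunc(a, b): return a // b if (a < 0) == (b < 0) else -(-a // b)
def div_round(a, b): return (2*a + b) // (2*b)

for a in range(-20, 21):
    for b in range(1, 8):   # the ceil/round forms assume b > 0
        q = Fraction(a, b)  # exact quotient, no rounding
        assert div_floor(a, b) == math.floor(q)
        assert div_ceil(a, b) == math.ceil(q)
        assert div_trunc(a, b) == math.trunc(q)
        assert div_round(a, b) == math.floor(q + Fraction(1, 2))
```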
edit: -3//2 == -2 for instance, since it's strictly integer division
Basically, there's no free lunch. Personally, I prefer truncating integer division in combination with a pair of remainder operators.
>Python, unlike C, has the mod operator always return a positive number (twitter.com/id_aa_carmack):
https://news.ycombinator.com/item?id=29729890
https://twitter.com/ID_AA_Carmack/status/1476294133975240712
tzs on Dec 30, 2021:
>The submission is a tweet which doesn't really have a title, so the submitter was forced to make up one. Unfortunately the assertion in the chosen title is not correct. In Python the mod operator returns a number with the same sign as the second argument:
>>> 10%3
1
>>> (-10)%3
2
>>> 10%(-3)
-2
>>> (-10)%(-3)
-1
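The pattern in that transcript can be stated as two invariants: the remainder's sign tracks the divisor (second argument), and the floored-division identity a == (a // b) * b + a % b always holds. A quick check:

```python
# Remainder takes the sign of the divisor; the division identity holds.
cases = [(10, 3), (-10, 3), (10, -3), (-10, -3)]
for a, b in cases:
    r = a % b
    assert r == 0 or (r > 0) == (b > 0)
    assert a == (a // b) * b + r
```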
https://news.ycombinator.com/item?id=29732335

DonHopkins on Dec 30, 2021, on: Python, unlike C, has the mod operator always retu...:
The FORTH-83 standard adopted floored division.
Signed Integer Division, by Robert L. Smith. Originally appearing in Dr. Dobb's Journal September 1983:
https://wiki.forth-ev.de/doku.php/projects:signed_integer_di...
Lots of other languages got it wrong:
https://en.wikipedia.org/wiki/Modulo_operation#In_programmin...
Symmetric division is the kind of thing that causes rocket ships to explode unexpectedly.
Symmetric division considered harmful:
https://www.nimblemachines.com/symmetric-division-considered...
>Since its 1983 standard (Forth-83), Forth has implemented floored division as standard. Interestingly, almost all processor architectures natively implement symmetric division.
>What is the difference between the two types? In floored division, the quotient is truncated toward minus infinity (remember, this is integer division we’re talking about). In symmetric division, the quotient is truncated toward zero, which means that depending on the sign of the dividend, the quotient can be truncated in different directions. This is the source of its evil.
>I’ve thought about this a lot and have come to the conclusion that symmetric division should be considered harmful.
>There are two reasons that I think this: symmetric division yields results different from arithmetic right shifts (which floor), and both the quotient and remainder have singularities around zero.
>If you’re interested in the (gory) details, read on. [...]
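The shift argument from the quote is easy to demonstrate: an arithmetic right shift floors, so it agrees with floored division but disagrees with C-style truncation.

```python
# Arithmetic right shift floors, matching floored division but not
# truncation toward zero:
assert -3 >> 1 == -2          # shift floors
assert -3 // 2 == -2          # floored division agrees
assert int(-3 / 2) == -1      # truncation toward zero disagrees
```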
3, not 4.
Not a hard question to answer.
- When using modulo to access an array cyclically. (You might get lucky that your language allows using negative numbers to index from the back. In that case, both conventions work.)

- When lowering the resolution of integers. If you round toward zero, you get strange artifacts around zero, because -b+1, ..., -1, 0, 1, ..., b-1 all go to zero when dividing by b. That's 2b-1 numbers. For every other integer k, there are only b numbers (namely bk, bk+1, ..., bk+b-1).
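The second point can be made visible by counting how many integers fall into each bucket under the two conventions, here with b = 4:

```python
import math
from collections import Counter

# Lowering resolution by b = 4: count how many integers land in each
# bucket under truncating vs. flooring division.
b = 4
xs = range(-12, 12)
trunc_counts = Counter(math.trunc(x / b) for x in xs)
floor_counts = Counter(x // b for x in xs)

assert trunc_counts[0] == 2 * b - 1            # bucket 0 is twice as wide
assert all(n == b for n in floor_counts.values())  # floor: uniform buckets
```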
I have never seen a case where truncation was the right thing to do. (When dealing with integers. Floats are different, of course, but they are not what this is about.)
Splitting a quantity into units of differing orders of magnitude. For example, −144 minutes is −2 hours and −24 minutes, not −3 hours and 36 minutes. This is about the only case I know of, though.
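The minutes example, worked both ways: truncation gives the natural mixed-unit split, while Python's floored divmod splits the other way.

```python
import math

t = -144  # minutes
# Truncating division: -2 hours and -24 minutes.
h_trunc = math.trunc(t / 60)
m_trunc = t - 60 * h_trunc
assert (h_trunc, m_trunc) == (-2, -24)

# Floored divmod: -3 hours and 36 minutes.
assert divmod(t, 60) == (-3, 36)
```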
[1] https://en.wikipedia.org/wiki/Modulo#In_programming_language...
On the contrary, I can't imagine when and why anybody would want truncation. That's just a side effect of the algorithm used, not something that actually makes much (any?) sense.
- given a number t of seconds after the epoch, what time of day does it represent? using python's definition, (t + tz) % 86400
- given an offset d between two pixels in a pixel buffer organized into sequential lines, what is the x component of the offset? using python's definition, d % width is right when the answer is positive, width - (d % width) when the answer is negative, so you could write d % width - d > p0x ? d % width : width - d % width. this gets more complicated with the fortran definition, not simpler
edit: the correct expression is (d % width if p0x + d % width < width else d % width - width) or in c-like syntax (p0x + d % width < width ? d % width : d % width - width). see http://canonical.org/~kragen/sw/dev3/modpix.py
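Wrapping the corrected expression in a function makes it easy to check exhaustively on a small buffer; the recovered x component should always equal the true difference p1x - p0x (function name is mine, logic as in the comment above):

```python
# Given the linear offset d between two pixels in a row-major buffer and
# the x coordinate p0x of the first pixel, recover the x component.
def x_component(d, width, p0x):
    return d % width if p0x + d % width < width else d % width - width

# Exhaustive check on a small buffer.
width = 7
for p0x in range(width):
    for p1x in range(width):
        for dy in range(-3, 4):
            d = dy * width + (p1x - p0x)
            assert x_component(d, width, p0x) == p1x - p0x
```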
It especially needs to hold, given how often negative numbers are overlooked when reasoning about programs.