Too much discussion of the XOR swap trick (opens in new tab)

(heather.cafe)

152 pointsCJefferson27d ago115 comments

115 comments

The fun bit is right at the start when the author notices that the compiler spots this and optimizes it away.

We didn't get into the deeper question of benchmarking it vs. a three-register swap, because I suspect the latter would be handled entirely by register renaming and end up being faster due to not requiring allocation of an ALU unit. Difficult to benchmark that because in order for it to make a difference, you'd need to surround it with other arithmetic instructions.

A meta question is why this persists. It has the right qualities for a "party trick": slightly esoteric piece of knowledge, not actually that hard to understand when you do know about it, but unintuitive enough that most people don't spontaneously reinvent it.

See also: https://en.wikipedia.org/wiki/Fast_inverse_square_root , which requires a bit more maths.

The other classic use of XOR - cursor overdrawing - has also long since gone away. It used to be possible to easily draw a cursor on a monochrome display by XORing it in, then to move it you simply XOR it again, restoring the original image.

chrisss39524d ago

Reminds me of the time developing the 747-8 avionics and the systems engineers started observing a bug where the entire box would just stop randomly, all output stopped, including HW traces...the thing would just halt.

About a month later an engineer decided to turn on all warnings for gcc...and behold a message stating something to the effect of "WARNING: execution will halt upon reaching this statement." The compiled code basically just halted and never returned from the function call (can't remember the specifics).

And that is how we learned not to hide compiler messages.

warmwaffles24d ago

`-Wall` is your friend and boy it is a major irritant using third party sloppy libraries when statically compiling it all.

yjftsjthsd-h24d ago

There are even more options you might also want to consider: https://stackoverflow.com/questions/73310403/whats-the-diffe...

1 more reply

fwip24d ago

I'm actually a little horrified to learn that warnings were suppressed while developing code to control a 747.

2 more replies

m46323d ago

This is one of those things that is hard to do, but great to have done. It's hard to do with a codebase in flight with lots of people working on it.

What I remember implementing this on projects was the messiness of:

- incrementally getting the Makefiles to turn on -Wall file-by-file as they were scrubbed. I think it was something similar to "<list-of-files>: CFLAGS+=-Wall" and then add to the list.

- suppressing warnings that were "ok" on a case-by-case basis. different languages had different ways of saying "ignore error 123 here" if at all.

- I remember lint had things like this too, like /NOTREACHED/

maybe things have gotten better/cleaner.

inetknght24d ago

Better to turn every warning into an error.

1 more reply

vidarh24d ago

The cursor overdrawing trick in part started going away before it's time thanks to patent enforcement (one of the lawsuits infamously exacerbated Commodores financial woes towards the end)... By the time the patent expired there was really no longer much value in going back to it.

zahlman24d ago

> A meta question is why this persists. It has the right qualities for a "party trick"

This is discussed in a section near the end. But I feel like the discussion of why people care about using XOR to swap two values is missing an underlying discussion of why people would care about swapping values. As shown earlier in the division example, at the point where you would do the swap, typically you can just write the rest of the code with the roles of the values swapped.

saulpw24d ago

Usually it's used like:

   if max < min:
       min, max = max, min

   [... algorithm that requires min < max]

So your suggestion would be to have two versions of the algorithm in the two branches of the 'if'. This is significantly more complicated and may even be slower depending on lots of factors.

addaon24d ago

> typically you can just write the rest of the code with the roles of the values swapped

Today, yes; most modern instruction sets are pretty orthogonal, and you can use a value in any register symmetrically -- although even today, division instructions (if they exist!) are among the most likely to violate that expectation, along with instructions that work with the stack pointer. But in the XOR heyday, this was less true -- instruction sets were less orthogonal, and registers were more scarced. It's not unreasonable for an OS scheduler tick to do some work to figure out the newly-scheduled task's stack pointer in one register, and need to swap it into an SPR or similar so that the return from interrupt returns to the new location, for example; and this is the exact type of place where the XOR trick occasionally has value.

jcalvinowens24d ago

The "party trick" is more general than just xor, for example:

    void swap(unsigned* a, unsigned* b)
    {
        *a = *a + *b;
        *b = *a - *b;
        *a = *a - *b;
    }

GCC still sees through it with restrict: https://godbolt.org/z/8n3bGha3e

warmwaffles24d ago

Yea this is operating on the pointer to the values, but if you use just an `unsigned a` it uses xor

clang: https://godbolt.org/z/5PEMTrxba

gcc: https://godbolt.org/z/7j8h4zo4v

zahlman24d ago

This is covered in TFA.

sifar24d ago

This has overflow issues unlike xor.

deathanatos24d ago

Depending on what you think the "issue" is, one of:

* wrapping is well-defined behavior for unsigned integers; signed integer wrapping is UB, but is not used here.

* the equations (that a & b cancel each other out, resulting in the swap) hold even when done in a mod N ring.

jimjag24d ago

You can even do it with complex images in a multi-color bitmap. I do it with my 6502 SBC setup: https://github.com/jimjag/JJ65c02/blob/main/Software/pico-co...

mizmar24d ago

Another similar trick - XOR doubly linked list: XOR the prev and next pointers (storaged size of node decreases) and you can recover the values when accessing the node from prev or next side and thus already know one of the addresses.

zahlman24d ago

This, too, is covered in TFA.

kbelder24d ago

>The other classic use of XOR - cursor overdrawing - has also long since gone away.

But that's a specific example of a general use that still is handy occasionally. A = B XOR C. Come back later with C and A and retrieve B.

mizmar24d ago

the "XOR in hash functions" mentioned at the end most certainly refers to Zobrist hashing, the actually useful XOR trick.

Requires fixed-length keys. Generate random table for each byte position. Then to hash a key, for each position lookup byte in table and XOR the results. This is used in chess/go/shogi/... engines because the board/position representation is fixed-length and you can undo modes easily - XOR-out byte from previous state and XOR-in from new state.

ahoka24d ago

When I opened the article, I have immediately searched for "register renaming"...

m46323d ago

> register renaming

is register renaming something done in hardware, or by the compiler?

denotational23d ago

Hardware, see https://en.wikipedia.org/wiki/Tomasulo%27s_algorithm.

danhau24d ago

> Are there other XOR tricks?

Yes, error correction.

You have some packets of data a, b, c. Add one additional packet z that is computed as z = a ^ b ^ c. Now whenever one of a, b or c gets corrupted or lost, it can be reconstructed by computing the XOR of all the others.

So if b is lost: b = a ^ c ^ z. This works for any packet, but only one. If multiple are lost, this will fail.

There are way better error correction algorithms, but I like the simplicity of this one.

throwaway202724d ago

https://en.wikipedia.org/wiki/XOR_linked_list

amelius24d ago

XOR is also great for storing copyrighted works without liability.

a = the bits of some song or movie

b = pure noise

Store c = a^b.

Give b to a friend. Throw away a.

Now both you and your friend have a bit vector of pure noise. Together you can produce the copyrighted work. But nobody is liable.

aleph_minus_one24d ago

This is the "What Colour are your bits?" argument:

> https://ansuz.sooke.bc.ca/entry/23

> https://ansuz.sooke.bc.ca/entry/24

As these article outline, the legal situation is much more complicated (and unintuitive to people who are used to "computer science thinking").

amelius24d ago

This is because legal people want something to exist that does not physically exist.

There is no trace back from pure noise to the original work.

Colour of bits is just magical thinking.

3 more replies

pjc5024d ago

Whether people are liable is a question for the courts, and I suspect they simply look through the tech and ask "do you end up with a copy of the work?"

(unless you're an AI company, in which case you can copy the whole internet just fine)

amelius24d ago

But that's an unsolvable question. Just like when A is caught on camera stealing a diamond, but A turns out to have an identical twin B. So the prosecutor can't do anything.

2 more replies

lelanthran24d ago

That's called encryption using a one time pad.

amelius24d ago

That's one way of looking at it. Note that you can easily extend it to multiple people holding multiple keys, so that a=b^c^d^e, etc.

1 more reply

nkrisc24d ago

Crimes aren’t legal just because you hide it. Intent matters.

Law is more akin to philosophy than computer science.

meindnoch24d ago

Pretty sure the law doesn't care about this XOR trick.

ChrisRR24d ago

What if I listen to pure noise anyway?

sph23d ago

You have invented one-time pads.

JTbane24d ago

I mean, this is just a one-time pad and I'm sure a court could compel your friend to give up the data.

amelius23d ago

What exactly does it prove if my friend gives up their bit vector?

xorl24d ago

hah! Love it.

direwolf2024d ago

Important note: Only if you already know which one was corrupted.

danhau21d ago

True! Some checksums / parity bits might be needed.

nsteel24d ago

Also to cheaply (area) create multi-port RAMs.

pjc5024d ago

How does that work?

nsteel24d ago

It's similar to RAID schemes but instead of drive failure it's port unavailability. There's a reference at [1] or an FPGA-centric one at [2], but it applies to anywhere where dual/single-port rams are readily available but anything more exotic isn't.

  [1] Achieving Multi-Port Memory Performance on Single-Port Memory with Coding Techniques - https://arxiv.org/abs/2001.09599
  [2] https://people.csail.mit.edu/ml/pubs/fpga12_xor.pdf

Terr_24d ago

See also: RAID levels that use one disk for parity. Three disks is simplest, but technically you can do more if you trust that only one will go bad at a time.

A few months ago, I had a rare occasion of trying to explain them to a relative who had just bought a fancy NAS and wanted help setting it up.

gobdovan24d ago

The XOR trick is only cool in its undefined-behavior form:

a^=b^=a^=b;

Which allegedly saves you 0.5 seconds of typing in competitive programming competitions from 20 years ago and is known to work reliably (on MinGW under Windows XP).

Bonus authenticity: use `a^=a` to zero a register in a single x86 instruction (and makes a real difference for compiler toolchains 30+ years old).

For real now, a very useful application of XOR is its relation to the Nim game [0], which comes in very handy if you need to save your village from an ancient disgruntled Chinese emperor.

[0] https://en.wikipedia.org/wiki/Nim

MathMonkeyMan24d ago

`xor eax, eax` is less code when assembled. That's why compilers generate it.

gobdovan24d ago

..and reliably generate it for decades, but you can still manually do it in C for bonus stylistic points!

ginko24d ago

>Bonus authenticity: use `a^=a` to zero a register in a single x86 instruction (and makes a real difference for compiler toolchains 30+ years old).

Modern compilers will still use xor for zeroing registers on their own.

For instance: https://godbolt.org/z/n5n35xjqx

Variable a(register esi) is first initialized to 42 with mov and then cleared to zero using xor.

gobdovan24d ago

Yes, the whole comment is tongue in cheek for pointless manual optimizations.

leni53624d ago

> The XOR trick is only cool in its undefined-behavior form:

> a^=b^=a^=b;

I believe this is defined in C++ since C++17, but still undefined in C.

mmozeiko24d ago

xor swap trick was useful in older simd (sse1/sse2) when based on some condition you want to swap values or not:

  tmp = (a ^ b) & mask
  a ^= tmp
  b ^= tmp

If mask = 0xfff...fff then a/b will be swapped, otherwise if mask = 0 then they'll remain the same.

CJeffersonOP24d ago

Oh, that is cool, I’ve never seen that. I might add that to an extended version of the post sometime, I’ll be sure to credit you.

jagged-chisel24d ago

So mask is marking the bits you want swapped and leaving the others in place.

koolala24d ago

Could this still be the ideal way for vectors of Ints in WebGL2?

jesse__24d ago

That's hella cute

fuglede_24d ago

The XOR swap trick also features in the compilation/synthesis of quantum algorithms, where the XOR instruction (in the form of a CNOT gate) is fundamental in many architectures, and where native swapping need not be available.

One extension that I ran into, and which I think forms a nice problem is the following:

Just like the XOR swap trick can be used to swap to variables (and let's just say that they're bools), it can be extended to implement any permutation of the variables: suppose that the permutation is written as a composition of n transpositions (i.e., swaps of pairs), and that is the minimal number of transpositions that let's you do that. Each transposition can be implemented by 3 XORs, by the XOR swap trick for pairs, and so the full permutation can be implemented by 3n XORs. Now here's the question: Is it possible to come up with a way of doing it with less than 3n, or can we find a permutation that has a shortcut through XOR-land (not allowing any other kinds of instructions)? In other words, is XOR-swapping XOR-optimal?

I'm not going to spoil it, but only last year a paper was published in the quantum information literature that contains an answer [0]. I ended up making a little game where you get to play around with XOR-optimizing not only permutations, but general linear reversible circuits. [1]

[0] https://link.springer.com/article/10.1007/s11128-025-04831-5

[1] https://swapple.fuglede.dk

JKCalhoun24d ago

My assumption has been that the XOR trick was closer to a thought experiment or hypothetical. That is to say, no one would ever use it in practice, but by asking an interviewee to walk through it shows they have a low-level understanding of how binary numbers work, bit-masking.

For me it falls in the obfuscated-C quadrant for code. Performance implications aside, it's just not the kind of "self-documenting" code I like lying around in my sources. (And I'll take clarity of purpose over performance every day.)

praptak24d ago

Xor swap trick has perfect profile for underhanded C contests. It generally works until a specific condition triggers its failure. The condition is "the arguments are aliases", so for example XOR_SWAP(a[i], a[j]) when i=j.

imploded_sub24d ago

TFA mentions this as an issue for the trick, whereas you rightly point out it's an evil feature instead.

stinkbeetle24d ago

You can use this same property of xor to make a double-linked list using just one pointer per item, which is xor of the previous and next item addresses!

Panzerschrek24d ago

Except such "optimization" may affect performance on CPUs with speculative execution/prediction. But it should be measured.

stinkbeetle23d ago

Well it is a memory optimization where the interest would be to reduce cache misses or memory consumption. I have never felt the need to use such a thing seriously but I find it a neat trick.

It can only be used when walking the list of course, which is quite likely why it is not more widely used -- it does not provide the benefit that a regular double linked list is mostly used for.

gblargg24d ago

Presumably the reason is to reduce memory overhead for nodes.

HarHarVeryFunny24d ago

> Are there other XOR tricks?

I'm not sure I'd call it a "trick", but since A ^ 0 = A, and B ^ B = 0, then ((A ^ B) ^ B) = A. i.e. XOR-ing any number by the same number twice gets you back the original number.

This used to be used back in the day for cheap and nasty computer graphics, since it means that if you draw to the screen by XOR-ing with the pixels already on the screen then you can undo it, restoring the background, by doing it a second time. The "nasty" part is that XORing with what's already on the screen isn't going to look great, but for something like a rotating wire-frame figure it might be OK.

pcherna24d ago

On the Amiga, accessing the pulldown menus caused the display to be temporarily frozen, so that the menu panel could be rendered off screen and then swapped in using the blitter and the XOR trick. It wasn’t as fast as a straight copy, but it saved on precious memory, and it rendered very cleanly.

When you released to the menu button on the mouse, it did a similar swap to restore the screen contents. In version 2.0 I optimized it to do a simple copy of the offscreen rectangle back onto the screen because before it was wasting time in order to preserve the menu pixels that we’re going to be thrown away immediately.

DeathArrow24d ago

30 years ago, when I wanted to 0 a register in assembly I used something like xor ah, ah because it was a bit more performant.

icelusxl24d ago

It still is. The CPU's register renamer can detect these instructions to not have data dependencies and can zero the register itself. It doesn't send the instruction to the execution engine meaning they use no execution resources and have zero latency.

pjc5024d ago

IIRC that's because that instruction is one byte while a "load immediate" would have to express 0 as one or more bytes.

(See also the wacky way in which ARM "load immediate" works)

IIAOPSW24d ago

I'll throw my hat in the ring of other "xor" tricks.

So we all know addition swap. One generalization that comes to mind is doing some other in-place transform on the two input variables. Lets keep it simple and suppose that its a linear transform. Thus the problem is to apply some matrix [[a,b],[c,d]] to two input variables [x,y] using entirely in-place operations.

We can do this by realizing that our basic operands can be expressed as matrices. x += ky is the same as the matrix [1, k] [0, 1]

likewise y += k

x is equiv to the lower triangular matrix [1, 0] [k, 1]

and lastly, the = operator is equiv to a matrix with an element on the diag. x = k [k ,0] [0, 1]

y *= k [1, 0] [0, k]

From this point on it becomes a challenge of if you can construct any desired matrix into some combination of these available ones (spoiler, yes you can).

The next generalization one could contemplates is doing operations in place on more than 2 variables. Well, if one has already solved arbitrary 2x2 matrix operations, then that can be rigged to implement larger matrices one submatrix at a time.

The final generalization that comes to mind is what can we do with non-arithmetic operators? We've already seen an example of this with using xor-swap rather than addition-swap. But is there anything out there vaguely like xor-2x2-matrix-multiply?

I legit don't know. I have some thought, but I won't meander out loud if its not going to lead anywhere.

ranger_danger24d ago

> given a list where every value appears exactly twice except one, XOR all the values together and the duplicates cancel out, leaving the unique element

For some reason this reminds me of the Fourier transform. I wonder if it can be performed with XOR tricks and no complicated arithmetic?

fluoridation24d ago

There's a more general formulation, which is that every value but one must appear even numbers of times, and the one must appear some odd number of times.

twiceaday24d ago

How about every value appears 0 mod n times except one which appears 1 mod n times? :)

Solution: xor is just addition mod 2. Write the numbers in base n and do digit-wise addition mod n (ie without carry). Very intuitive way to see the xor trick.

anematode23d ago

It's not quite the same as XOR swap, but a trick I've found handy is conditionally swapping values using XOR:

    int a, b;
    bool cond;
    
    int swap = cond ? a ^ b : 0;
    a ^= swap;
    b ^= swap;

If cond is highly unpredictable this can work rather nicely.

torginus24d ago

I would remark that modern CPUs don't use physical registers, so swapping should be just a register rename op, and this kind of bithacking only applies to old machines.

The article makes the same point as well at the end:

It is the kind of technique which might have been occasionally useful in the 1980s, but now is only useful for cute interview questions and as a curiosity.

robinsonb524d ago

This brings back memories of Chunky-to-Planar conversion on the Amiga, where the Xor trick was combined with bitshifts and masking to rearrange the bits in 8 32-bit words in reasonable time.

fch4224d ago

xor'ing the elements of lists together also allows a test of "unordered equality" because 1^2^3^4^5 == (xor of any permutation of [1,2,3,4,5]).

Yes there are false positives, and the false negative of all-zero/all-equal, but the test can be useful in a "bloom filter" type case.

Have used it in dynamic firewalling rules ... one can do something pretty close to a JA3/JA4 TLS fingerprint in eBPF with that (to match the cipher lists).

IIAOPSW24d ago

You could also do your weak equality test by taking the sum 1+2+3+4+5, or the product 12345, or any other commutative binary operator.

pjmlp24d ago

Another trick when SIMD is not available, is pseudo SIMD via SWAR, one example,

https://lemire.me/blog/2022/01/21/swar-explained-parsing-eig...

YasuoTanaka24d ago

Used to do this in assembly back when registers actually mattered. Today it mostly hurts readability more than anything.

Panzerschrek24d ago

Such tricks were maybe useful 40 years ago while writing assembly code manually or while using a dumb compiler with no optimizations. But nowadays such tricks are near to useless. All useful ones (like optimizing division by 2 via bitshift ) are already implemented as compiler optimizations. Others shouldn't be used in order to avoid making optimizer's job harder.

Tade024d ago

Back when I was in college in the late 00s we were advised to not attempt to optimise using assembly unless we really found a bottleneck the compiler missed.

pjc5024d ago

Correct. There are still some situations that benefit from it, but only using the extended instruction sets that compilers can't/won't generate. Even then you should at least try writing the C code and seeing if it will auto-vectorize. Even C# can auto-vectorize some cases now.

Things like "add with saturation" and the special AES instructions.

jesse__24d ago

In my experience, relying on the compiler to auto vectorize your code is a path fraught with peril.

It'll break eventually. If it matters, write the simd yourself. It'll probably be 2-50x better than the compiler anyways.

adonovan24d ago

Another situation to avoid the XOR trick, even when registers are tight, is when swapping pointers in a garbage-collected language, since the intermediate bit patterns are invalid pointers: if a GC mark phase occurs at that moment, you might lose some objects, or spuriously mark others as live.

j / k navigate · click thread line to collapse

115 comments

pjc5024d ago

The fun bit is right at the start when the author notices that the compiler spots this and optimizes it away.

See also: https://en.wikipedia.org/wiki/Fast_inverse_square_root , which requires a bit more maths.

chrisss39524d ago

And that is how we learned not to hide compiler messages.

warmwaffles24d ago

`-Wall` is your friend and boy it is a major irritant using third party sloppy libraries when statically compiling it all.

yjftsjthsd-h24d ago

There are even more options you might also want to consider: https://stackoverflow.com/questions/73310403/whats-the-diffe...

1 more reply

fwip24d ago

I'm actually a little horrified to learn that warnings were suppressed while developing code to control a 747.

2 more replies

m46323d ago

This is one of those things that is hard to do, but great to have done. It's hard to do with a codebase in flight with lots of people working on it.

What I remember implementing this on projects was the messiness of:

- incrementally getting the Makefiles to turn on -Wall file-by-file as they were scrubbed. I think it was something similar to "<list-of-files>: CFLAGS+=-Wall" and then add to the list.

- suppressing warnings that were "ok" on a case-by-case basis. different languages had different ways of saying "ignore error 123 here" if at all.

- I remember lint had things like this too, like /NOTREACHED/

maybe things have gotten better/cleaner.

inetknght24d ago

Better to turn every warning into an error.

1 more reply

vidarh24d ago

zahlman24d ago

> A meta question is why this persists. It has the right qualities for a "party trick"

saulpw24d ago

Usually it's used like:

   if max < min:
       min, max = max, min

   [... algorithm that requires min < max]

So your suggestion would be to have two versions of the algorithm in the two branches of the 'if'. This is significantly more complicated and may even be slower depending on lots of factors.

addaon24d ago

> typically you can just write the rest of the code with the roles of the values swapped

jcalvinowens24d ago

The "party trick" is more general than just xor, for example:

    void swap(unsigned* a, unsigned* b)
    {
        *a = *a + *b;
        *b = *a - *b;
        *a = *a - *b;
    }

GCC still sees through it with restrict: https://godbolt.org/z/8n3bGha3e

warmwaffles24d ago

Yea this is operating on the pointer to the values, but if you use just an `unsigned a` it uses xor

clang: https://godbolt.org/z/5PEMTrxba

gcc: https://godbolt.org/z/7j8h4zo4v

zahlman24d ago

This is covered in TFA.

sifar24d ago

This has overflow issues unlike xor.

deathanatos24d ago

Depending on what you think the "issue" is, one of:

* wrapping is well-defined behavior for unsigned integers; signed integer wrapping is UB, but is not used here.

* the equations (that a & b cancel each other out, resulting in the swap) hold even when done in a mod N ring.

jimjag24d ago

You can even do it with complex images in a multi-color bitmap. I do it with my 6502 SBC setup: https://github.com/jimjag/JJ65c02/blob/main/Software/pico-co...

mizmar24d ago

zahlman24d ago

This, too, is covered in TFA.

kbelder24d ago

>The other classic use of XOR - cursor overdrawing - has also long since gone away.

But that's a specific example of a general use that still is handy occasionally. A = B XOR C. Come back later with C and A and retrieve B.

mizmar24d ago

the "XOR in hash functions" mentioned at the end most certainly refers to Zobrist hashing, the actually useful XOR trick.

ahoka24d ago

When I opened the article, I have immediately searched for "register renaming"...

m46323d ago

> register renaming

is register renaming something done in hardware, or by the compiler?

denotational23d ago

Hardware, see https://en.wikipedia.org/wiki/Tomasulo%27s_algorithm.

danhau24d ago

> Are there other XOR tricks?

Yes, error correction.

So if b is lost: b = a ^ c ^ z. This works for any packet, but only one. If multiple are lost, this will fail.

There are way better error correction algorithms, but I like the simplicity of this one.

throwaway202724d ago

https://en.wikipedia.org/wiki/XOR_linked_list

amelius24d ago

XOR is also great for storing copyrighted works without liability.

a = the bits of some song or movie

b = pure noise

Store c = a^b.

Give b to a friend. Throw away a.

Now both you and your friend have a bit vector of pure noise. Together you can produce the copyrighted work. But nobody is liable.

aleph_minus_one24d ago

This is the "What Colour are your bits?" argument:

> https://ansuz.sooke.bc.ca/entry/23

> https://ansuz.sooke.bc.ca/entry/24

As these article outline, the legal situation is much more complicated (and unintuitive to people who are used to "computer science thinking").

amelius24d ago

This is because legal people want something to exist that does not physically exist.

There is no trace back from pure noise to the original work.

Colour of bits is just magical thinking.

3 more replies

pjc5024d ago

Whether people are liable is a question for the courts, and I suspect they simply look through the tech and ask "do you end up with a copy of the work?"

(unless you're an AI company, in which case you can copy the whole internet just fine)

amelius24d ago

But that's an unsolvable question. Just like when A is caught on camera stealing a diamond, but A turns out to have an identical twin B. So the prosecutor can't do anything.

2 more replies

lelanthran24d ago

That's called encryption using a one time pad.

amelius24d ago

That's one way of looking at it. Note that you can easily extend it to multiple people holding multiple keys, so that a=b^c^d^e, etc.

1 more reply

nkrisc24d ago

Crimes aren’t legal just because you hide it. Intent matters.

Law is more akin to philosophy than computer science.

meindnoch24d ago

Pretty sure the law doesn't care about this XOR trick.

ChrisRR24d ago

What if I listen to pure noise anyway?

sph23d ago

You have invented one-time pads.

JTbane24d ago

I mean, this is just a one-time pad and I'm sure a court could compel your friend to give up the data.

amelius23d ago

What exactly does it prove if my friend gives up their bit vector?

xorl24d ago

hah! Love it.

direwolf2024d ago

Important note: Only if you already know which one was corrupted.

danhau21d ago

True! Some checksums / parity bits might be needed.

nsteel24d ago

Also to cheaply (area) create multi-port RAMs.

pjc5024d ago

How does that work?

nsteel24d ago

  [1] Achieving Multi-Port Memory Performance on Single-Port Memory with Coding Techniques - https://arxiv.org/abs/2001.09599
  [2] https://people.csail.mit.edu/ml/pubs/fpga12_xor.pdf

Terr_24d ago

See also: RAID levels that use one disk for parity. Three disks is simplest, but technically you can do more if you trust that only one will go bad at a time.

A few months ago, I had a rare occasion of trying to explain them to a relative who had just bought a fancy NAS and wanted help setting it up.

gobdovan24d ago

The XOR trick is only cool in its undefined-behavior form:

a^=b^=a^=b;

Which allegedly saves you 0.5 seconds of typing in competitive programming competitions from 20 years ago and is known to work reliably (on MinGW under Windows XP).

Bonus authenticity: use `a^=a` to zero a register in a single x86 instruction (and makes a real difference for compiler toolchains 30+ years old).

For real now, a very useful application of XOR is its relation to the Nim game [0], which comes in very handy if you need to save your village from an ancient disgruntled Chinese emperor.

[0] https://en.wikipedia.org/wiki/Nim

MathMonkeyMan24d ago

`xor eax, eax` is less code when assembled. That's why compilers generate it.

gobdovan24d ago

..and reliably generate it for decades, but you can still manually do it in C for bonus stylistic points!

ginko24d ago

>Bonus authenticity: use `a^=a` to zero a register in a single x86 instruction (and makes a real difference for compiler toolchains 30+ years old).

Modern compilers will still use xor for zeroing registers on their own.

For instance: https://godbolt.org/z/n5n35xjqx

Variable a(register esi) is first initialized to 42 with mov and then cleared to zero using xor.

gobdovan24d ago

Yes, the whole comment is tongue in cheek for pointless manual optimizations.

leni53624d ago

> The XOR trick is only cool in its undefined-behavior form:

> a^=b^=a^=b;

I believe this is defined in C++ since C++17, but still undefined in C.

mmozeiko24d ago

xor swap trick was useful in older simd (sse1/sse2) when based on some condition you want to swap values or not:

  tmp = (a ^ b) & mask
  a ^= tmp
  b ^= tmp

If mask = 0xfff...fff then a/b will be swapped, otherwise if mask = 0 then they'll remain the same.

CJeffersonOP24d ago

Oh, that is cool, I’ve never seen that. I might add that to an extended version of the post sometime, I’ll be sure to credit you.

jagged-chisel24d ago

So mask is marking the bits you want swapped and leaving the others in place.

koolala24d ago

Could this still be the ideal way for vectors of Ints in WebGL2?

jesse__24d ago

That's hella cute

fuglede_24d ago

One extension that I ran into, and which I think forms a nice problem is the following:

[0] https://link.springer.com/article/10.1007/s11128-025-04831-5

[1] https://swapple.fuglede.dk

JKCalhoun24d ago

praptak24d ago

imploded_sub24d ago

TFA mentions this as an issue for the trick, whereas you rightly point out it's an evil feature instead.

stinkbeetle24d ago

You can use this same property of xor to make a double-linked list using just one pointer per item, which is xor of the previous and next item addresses!

Panzerschrek24d ago

Except such "optimization" may affect performance on CPUs with speculative execution/prediction. But it should be measured.

stinkbeetle23d ago

Well it is a memory optimization where the interest would be to reduce cache misses or memory consumption. I have never felt the need to use such a thing seriously but I find it a neat trick.

It can only be used when walking the list of course, which is quite likely why it is not more widely used -- it does not provide the benefit that a regular double linked list is mostly used for.

gblargg24d ago

Presumably the reason is to reduce memory overhead for nodes.

HarHarVeryFunny24d ago

> Are there other XOR tricks?

I'm not sure I'd call it a "trick", but since A ^ 0 = A, and B ^ B = 0, then ((A ^ B) ^ B) = A. i.e. XOR-ing any number by the same number twice gets you back the original number.

pcherna24d ago

DeathArrow24d ago

30 years ago, when I wanted to 0 a register in assembly I used something like xor ah, ah because it was a bit more performant.

icelusxl24d ago

pjc5024d ago

IIRC that's because that instruction is one byte while a "load immediate" would have to express 0 as one or more bytes.

(See also the wacky way in which ARM "load immediate" works)

IIAOPSW24d ago

I'll throw my hat in the ring of other "xor" tricks.

We can do this by realizing that our basic operands can be expressed as matrices. x += ky is the same as the matrix [1, k] [0, 1]

likewise y += k

x is equiv to the lower triangular matrix [1, 0] [k, 1]

and lastly, the = operator is equiv to a matrix with an element on the diag. x = k [k ,0] [0, 1]

y *= k [1, 0] [0, k]

From this point on it becomes a challenge of if you can construct any desired matrix into some combination of these available ones (spoiler, yes you can).

I legit don't know. I have some thought, but I won't meander out loud if its not going to lead anywhere.

ranger_danger24d ago

> given a list where every value appears exactly twice except one, XOR all the values together and the duplicates cancel out, leaving the unique element

For some reason this reminds me of the Fourier transform. I wonder if it can be performed with XOR tricks and no complicated arithmetic?

fluoridation24d ago

There's a more general formulation, which is that every value but one must appear even numbers of times, and the one must appear some odd number of times.

twiceaday24d ago

How about every value appears 0 mod n times except one which appears 1 mod n times? :)

Solution: xor is just addition mod 2. Write the numbers in base n and do digit-wise addition mod n (ie without carry). Very intuitive way to see the xor trick.

anematode23d ago

It's not quite the same as XOR swap, but a trick I've found handy is conditionally swapping values using XOR:

    int a, b;
    bool cond;
    
    int swap = cond ? a ^ b : 0;
    a ^= swap;
    b ^= swap;

If cond is highly unpredictable this can work rather nicely.

torginus24d ago

I would remark that modern CPUs don't use physical registers, so swapping should be just a register rename op, and this kind of bithacking only applies to old machines.

The article makes the same point as well at the end:

It is the kind of technique which might have been occasionally useful in the 1980s, but now is only useful for cute interview questions and as a curiosity.

robinsonb524d ago

This brings back memories of Chunky-to-Planar conversion on the Amiga, where the Xor trick was combined with bitshifts and masking to rearrange the bits in 8 32-bit words in reasonable time.

fch4224d ago

xor'ing the elements of lists together also allows a test of "unordered equality" because 1^2^3^4^5 == (xor of any permutation of [1,2,3,4,5]).

Yes there are false positives, and the false negative of all-zero/all-equal, but the test can be useful in a "bloom filter" type case.

Have used it in dynamic firewalling rules ... one can do something pretty close to a JA3/JA4 TLS fingerprint in eBPF with that (to match the cipher lists).

IIAOPSW24d ago

You could also do your weak equality test by taking the sum 1+2+3+4+5, or the product 12345, or any other commutative binary operator.

pjmlp24d ago

Another trick when SIMD is not available, is pseudo SIMD via SWAR, one example,

https://lemire.me/blog/2022/01/21/swar-explained-parsing-eig...

YasuoTanaka24d ago

Used to do this in assembly back when registers actually mattered. Today it mostly hurts readability more than anything.

Panzerschrek24d ago

Tade024d ago

Back when I was in college in the late 00s we were advised to not attempt to optimise using assembly unless we really found a bottleneck the compiler missed.

pjc5024d ago

Things like "add with saturation" and the special AES instructions.

jesse__24d ago

In my experience, relying on the compiler to auto vectorize your code is a path fraught with peril.

It'll break eventually. If it matters, write the simd yourself. It'll probably be 2-50x better than the compiler anyways.

adonovan24d ago

j / k navigate · click thread line to collapse