Cosmologically Unique IDs

(jasonfantl.com)

467 points · jfantl · 4mo ago · 151 comments

102 comments · 32 top-level

lisper4mo ago· 27 in thread

This analysis is not quite fair. It takes into account locality (i.e. the speed of light) when designing UUID schemes but not when computing the odds of a collision. Collisions only matter if the colliding UUIDs actually come into causal contact with each other after being generated. So just as you have to take locality into account when designing UUID trees, you also have to take it into account when computing the odds of an actual local collision. So a naive application of the birthday paradox is not applicable because that ignores locality. So an actual fair calculation of the required size of a random UUID is going to be a lot smaller than the ~800 bits the article comes up with. I haven't done the math, but I'd be surprised if the actual answer is more than 256 bits.

(Gotta say here that I love HN. It's one of the very few places where a comment that geeky and pedantic can nonetheless be on point. :-)

k_roy4mo ago

Reminds me of a time many years ago when I received a whole case of Intel NICs all with the same MAC address.

It was an interesting couple of days before we figured it out.

imglorp4mo ago

How does that happen? Was it an OEM bulk kind of deal where you were expected to write a new MAC for each NIC when deploying them?

exfalso4mo ago

There's a fun hypothesis I've read about somewhere, goes something like this:

As the universe expands the gap between galaxies widens until they start "disappearing" as no information can travel anymore between them. Therefore, if we assume that intelligent lifeforms exist out there, it is likely that these will slowly converge to the place in the universe with the highest mass density for survival. IIRC we even know approximately where this is.

This means a sort of "grand meeting of alien advanced cultures" before the heat death. Which in turn also means that previously uncollided UUIDs may start to collide.

Those damned Vogons thrashing all our stats with their gazillion documents. Why do they have a UUID for each xml tag??

jobigoud4mo ago

It is counter intuitive but information can still travel between places that are so distant that expansion between them is faster than the speed of light. It's just extremely slow (so I still vote for going to the party at the highest density place).

We do see light from galaxies that are receding away from us faster than c. At first the photons going in our direction are moving away from us but as the universe expands over time at some point they find themselves in a region of space that is no longer receding faster than c, and they start approaching.

zimzam4mo ago

I think I missed something: how do galaxies getting further away (divergence) imply that intelligent species will converge anywhere? It isn’t like one galaxy getting out of range of another on the other side of the universe is going to affect things in a meaningful way…

A galaxy has enough resources to be self-reliant, there’s no need for a species to escape one that is getting too far away from another one.

chamomeal4mo ago

I think I sense a strange Battle Royale type game…

kbelder4mo ago

Assuming these are advanced enough aliens, they'll also be bringing with them all the mass they can, to accentuate the effect? I'm imagining things like Niven's ringworld star propulsion.

u1hcw9nx4mo ago

You must consider both time and locality.

From now until protons decay and matter does not exist anymore is only 10^56 nanoseconds.

Sharlin4mo ago

If protons decay. There isn't really any reason to believe they're not stable.

dheera4mo ago

Protons (and mass and energy) could also potentially be created. If this happens, the heat death could be avoided.

Conservation of mass and energy is an empirical observation, there is no theoretical basis for it. We just don't know any process we can implement that violates it, but that doesn't mean it doesn't exist.

adrianN4mo ago

All of physics is „just“ based on empirical observation. It’s still a pretty good tool for prediction.

dinosaurdynasty4mo ago

Conservation laws result from continuous symmetries in the laws of physics, as proven by Noether's theorem.

Etheryte4mo ago

That's such an odd way to use units. Why would you do 10^56 * 10^-9 seconds?

rubyn00bie4mo ago

I got a big laugh at the “only” part of that. I do have a sincere question about that number though, isn’t time relative? How would we know that number to be true or consistent? My incredibly naive assumption would be that with less matter time moves faster sort of accelerating; so, as matter “evaporates” the process accelerates and converges on that number (or close to it)?

rbanffy4mo ago

If we think of the many worlds interpretation, how many universes will we be making every time we assign a CCUID to something?

scotty794mo ago

Proton decay is hypothetical.

jl64mo ago

Ah but if we are considering near-infinitesimal probabilities, we should metagame and consider the very low probability that our understanding of cosmology is flawed and light cones aren’t actually a limiting factor on causal contact.

missingdays4mo ago

Sorry, your laptop was produced before FTL was invented, so its MAC address is only recognized in the Milky Way sector.

SkyBelow4mo ago

If we allow FTL information exchange, don't we run into the possibility that the FTL accessible universe is infinite, so unique IDs are fundamentally not possible? Physics doesn't really do much with this because the observable universe is all that 'exists' in a Russell's Teapot sense.

RobotToaster4mo ago

Would this take into account IDs generated by objects moving at relativistic speeds? It would be a right pain to travel for a year to another planet, arrive 10,000 years late, and have a bunch of id collisions.

lisper4mo ago

I have to confess I have not actually done the math.

9dev4mo ago

Oh no! We should immediately commence work on a new UUID version that addresses this use case.

svnt4mo ago

Maybe the definitions are shifting, but in my experience “on point” is typically an endorsement in the area of “really/precisely good” — so I think what you mean is “on topic” or similar.

Pedantry ftw.

lisper4mo ago

:-)

zeckalpha4mo ago

Don't forget that today's observable universe includes places that will never be able to see us because of the expansion of the universe being faster than the speed of light. There's a smaller sphere for the portion of the universe that we can influence.

ctoth4mo ago

Hanson's Grabby Aliens actually fits really well here if you're looking for some math to base off of.

quijoteuniv4mo ago

The answer is 42. Have it from good source!

kmoser4mo ago· 10 in thread

> A more reasonable upper limit might be to assume that every atom in the observable universe will get one ID (we assume atoms won’t be assigned multiple IDs throughout time, which is a concession). There are an estimated 10^80 atoms in the universe. Using the same equation as above, we find that we need 532 bits to avoid (probabilistically) a collision up to that point.

This doesn't take into account that you will inevitably want to assign unique IDs to various groups of atoms (e.g. this microchip, that car, etc.). And don't even get me started on assigning unique IDs to each subatomic particle.
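A rough check of the quoted 532-bit figure, assuming the article's equation is the usual birthday rule of thumb (collisions become likely once the number of IDs nears the square root of the ID space) and taking the 10^80 atom estimate used elsewhere in this thread. The article's exact formula is not reproduced here, and `bits_for_ids` is a name made up for this sketch.

```python
import math

def bits_for_ids(n_ids: int) -> int:
    """Smallest bit length b such that sqrt(2**b) >= n_ids (birthday rule of thumb)."""
    return math.ceil(2 * math.log2(n_ids))

# One ID per atom, assuming roughly 10**80 atoms in the observable universe.
atom_bits = bits_for_ids(10 ** 80)
print(atom_bits)
```

Under that assumption the result matches the number in the quote.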

wavemode4mo ago

> unique IDs to each subatomic particle

You only need one ID for each type of particle. Since the laws of physics dictate that the particles themselves are indistinguishable from each other.

kmoser4mo ago

Just because a given particle is indistinguishable from another of the same type doesn't mean they are the same actual thing. If you are assigning IDs to each item in the universe for accounting/inventory purposes, you'll still want a separate ID for each particle.

Drakim4mo ago

Don't they have different x,y,z positions?

kbelder4mo ago

Wouldn't the minimum discrete unit be something that is capable of recording the ID? Or, if not, space somewhere would need to identify that that atom has this ID, which would take at least as much space.

In other words, the act of 'assignment' presupposes some mechanism of assignment, and at a certain level of granularity the information needed for that mechanism to function is greater than the information the system can store.

It would be like assigning each byte in a stick of ram a 32 bit random access ID, and trying to store the assignments in the same memory space. Memory addressing only works because we assume a linear, unchanging order.

kmoser4mo ago

If we're talking reality, sure. But you can also consider it a Gedankenexperiment.

Dylan168074mo ago

> This doesn't take into account that you will inevitably want to assign unique IDs to various groups of atoms (e.g. this microchip, that car, etc.).

Sure it does. Those are not going to add up to a single extra bit.

kmoser4mo ago

Even every possible permutation of every single subatomic element in the universe? Even if we just consider atoms, at 10^80 atoms in the entire universe, there are (10^80)! possible permutations, which is many, many, many orders of magnitude larger.

And this isn't even counting sets that include multiples of the same item; once you get into that territory, there really is no upper bound.
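For a sense of scale, here is a sketch that compares the bits needed to label one atom with the bits needed to label one ordering of all atoms. It uses Stirling's approximation for log2(n!), since (10^80)! is far too large to compute directly; `log2_factorial` is a helper invented for this example.

```python
import math

def log2_factorial(n: float) -> float:
    """Stirling's approximation of log2(n!) for large n."""
    ln_fact = n * (math.log(n) - 1) + 0.5 * math.log(2 * math.pi * n)
    return ln_fact / math.log(2)

atoms = 1e80
bits_per_atom_id = math.log2(atoms)           # bits to tell one atom from another
bits_per_ordering_id = log2_factorial(atoms)  # bits to tell one permutation from another

print(f"{bits_per_atom_id:.1f} bits per atom ID")
print(f"{bits_per_ordering_id:.3e} bits per permutation ID")
```

An ID able to name any single permutation would need more bits than there are atoms to store them in, which is a different regime from adding a few group IDs on top of per-atom IDs.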

NoMoreNicksLeft4mo ago

>And don't even get me started on assigning unique IDs to each subatomic particle.

If a neutrino oscillates between flavors, does it get 3 IDs? Or does it get a new ID with each oscillation?

Thankfully, we only need one electron ID at all.

liamwire4mo ago

For the uninitiated: https://en.wikipedia.org/wiki/One-electron_universe

nivertech4mo ago

So UUIDv∞ will be at least 536 bits long?

And with group IDs, timestamp, etc. - 1024 bits long?

ekipan4mo ago· 8 in thread

I forget the context but the other day I also learned about Snowflake IDs [1] that are apparently used by Twitter, Discord, Instagram, and Mastodon.

Timestamp + random seems like it could be a good tradeoff to reduce the ID sizes and still get reasonable characteristics, I'm surprised the article didn't explore there (but then again "timestamps" are a lot more nebulous at universal scale I suppose). Just spitballing here but I wonder if it would be worthwhile to reclaim ten bits of the Snowflake timestamp and use the low 32 bits for a random number. Four billion IDs for each second.

There's a Tom Scott video [2] that describes Youtube video IDs as 11-digit base-64 random numbers, but I don't see any official documentation about that. At the end he says how many IDs are available but I don't think he considers collisions via the birthday paradox.

[1]: https://en.wikipedia.org/wiki/Snowflake_ID

[2]: https://youtu.be/gocwRvLhDf8
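A minimal sketch of the timestamp + random idea above, not Snowflake itself. It assumes Snowflake's usual 41-bit millisecond timestamp, drops ten bits of it (approximated here as whole seconds), and fills the low 32 bits with random data. The names `make_id` and `timestamp_of` are hypothetical.

```python
import secrets
import time
from typing import Optional

RANDOM_BITS = 32
TIMESTAMP_BITS = 31  # assumption: 41 Snowflake timestamp bits minus the ten reclaimed

def make_id(now_seconds: Optional[int] = None) -> int:
    """Pack a coarse timestamp above 32 random bits into one 63-bit integer."""
    if now_seconds is None:
        now_seconds = int(time.time())
    timestamp = now_seconds & ((1 << TIMESTAMP_BITS) - 1)
    return (timestamp << RANDOM_BITS) | secrets.randbits(RANDOM_BITS)

def timestamp_of(id_value: int) -> int:
    """Recover the timestamp field from an ID built by make_id."""
    return id_value >> RANDOM_BITS

example = make_id(1_700_000_000)
print(hex(example), timestamp_of(example))
```

IDs built this way still sort by creation second, but ordering within one second is random.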

buzzerbetrayed4mo ago

Getting the entire universe to agree on a single clock for creating timestamps sounds absurdly difficult. Probably impossible?

ekipan4mo ago

"Agreement" of time is probably nonsense, yeah. I realized after posting so I edited in the parenthetical, but as [3] notes, locality probably makes this less of a real issue.

Apparently with the birthday paradox 32 bit random IDs only allow some tens of thousands per second before collision chance passes 50%. Maybe that's acceptable?

[3]: https://news.ycombinator.com/item?id=47065241
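The "tens of thousands" figure can be checked with the standard birthday approximation n ≈ sqrt(2 · 2^bits · ln(1/(1-p))). This is a sketch; the function name is made up here.

```python
import math

def ids_for_collision_probability(bits: int, p: float) -> float:
    """Approximate count of random IDs at which the collision probability reaches p."""
    return math.sqrt(2 * (2 ** bits) * math.log(1 / (1 - p)))

n_half = ids_for_collision_probability(32, 0.5)
print(round(n_half))
```

With 32 random bits per second-sized bucket, the 50% point lands a little under 80,000 IDs.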

Zambyte4mo ago

You don't need the universe to agree. You need your ID system to agree within a reasonable margin of error.

speakeron4mo ago

The temperature of the cosmic microwave background can be used as a universal clock.

UltraSane4mo ago

Neutron star spins collectively can be used as a pretty accurate clock.

swiftcoder4mo ago

> [1]: https://en.wikipedia.org/wiki/Snowflake_ID

Isn't this just the same scheme as version 1 UUID, except with half the bits? I guess they didn't want to dedicate 128 bits to their IDs.

bricss4mo ago

There's also DRUUID: https://gist.github.com/bricss/53e5babedf44bf3ab71334c15102e...

drchickensalad4mo ago

That also looks like the widely used BSON ids, to anyone else interested

adityaathalye4mo ago· 4 in thread

Just past page 281 of Becky Chambers's delightful "the galaxy, and the ground within".

  Received Message
  Encryption: 0
  From: GC Transit Authority --- Gora System (path: 487-45411-479-4)
  To: Ooli Oht Ouloo (path: 5787-598-66)
  Subject: URGENT UPDATE

Man I love the series.

Looks like this multispecies universe has a centrally agreed-upon path addressing system.

pavel_lishin4mo ago

You should check out Vernor Vinge's A Fire Upon The Deep for more fun examples of how intra-galactic communication would be labeled, with routes & such.

adityaathalye4mo ago

In fact, it is right here in my stack of to-reads! Picking it up now due to your recommendation. Cheers!

Octoth0rpe4mo ago

From this book in particular, I love the scene with everyone sitting around talking about how horrifying the concept of cheese is. The rest of the quartet is wonderful, with the second book (A Closed and Common Orbit) being the MVP IMO.

adityaathalye4mo ago

Before replying, I waited for a moment to re-read those couple of pages once more. Cracked up again... Oh the utter incomprehension. And I can relate to the bit about eating the enzyme so that one can eat the cheese without getting sick. Cheese is horrifyingly great.

linuxhansl4mo ago· 3 in thread

Heh. I once had to make an argument that 256 bit randomly assigned identifiers are good enough without explicit collision checking. People wanted me to add complex and expensive collision checks.

My argument was that 2^256 actually approaches the number of atoms in the observable universe (within 1 to 3 orders of magnitude), and that collisions are so unlikely that we'll have millions of datacenter meltdowns first (all assuming we have a good source of random numbers, of course). In the end I convinced everybody that even 128 bits are good enough, without any collision checking required.

I thought my argument was clever, but this is so much better. :)

da_chicken4mo ago

Nah, it's much easier than that.

The total amount of computer data across all of humanity is less than 1 yottabyte. We're expected to reach 1 yottabyte within the next decade, and will probably do so before 2030. That's all data, everywhere, including nation-states.

The birthday paradox says that you'll reach a 50% chance of at least one collision (as a conservative first order approximation) at the square root of the domain size. sqrt(2^256) is 2^128.

Now, a 256 bit identifier takes up 32 bytes of storage. 2^128 * 32 bytes = 10^16 yottabytes. That's 10 quadrillion yottabytes just to store the keys. And it's even odds whether you'll have a collision or not.

And if the 50% number scares them, well, you'll have a 1% chance of a collision at around... 2^128 * 0.1. Yeah, so you don't reach a 1% over the whole life of the system until you get to a quadrillion yottabytes.

Because you're never getting anywhere near the square root of the size, the chances of any collision occurring are astronomically small.
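The arithmetic above can be reproduced in a few lines. This sketch uses the same birthday approximation, n ≈ sqrt(2 · N · ln(1/(1-p))), and treats a yottabyte as 10^24 bytes; the variable names are illustrative.

```python
import math

YOTTABYTE = 10 ** 24
ID_BYTES = 32  # a 256-bit identifier

def ids_at_probability(bits: int, p: float) -> float:
    """Approximate count of random IDs at which the collision probability reaches p."""
    return math.sqrt(2 * (2 ** bits) * math.log(1 / (1 - p)))

# Storage for 2**128 keys, the rough 50% point for 256-bit IDs.
storage_at_sqrt_yb = (2 ** 128) * ID_BYTES / YOTTABYTE

# Where the 1% point sits relative to 2**128.
fraction_of_sqrt = ids_at_probability(256, 0.01) / 2 ** 128

print(f"{storage_at_sqrt_yb:.2e} YB, 1% point at {fraction_of_sqrt:.2f} * 2^128")
```

The 1% point comes out slightly above the 0.1 factor quoted, so the order-of-magnitude conclusion holds.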

nextaccountic4mo ago

If the mechanism for generating those 256 random bits is distributed and untrusted parties generate ids, then you need collision detection because they may be malicious

If it's not distributed you can just have a counter

If it's distributed but coordinated by a single party (say, it's your servers), you can do sharding on incremented counters. Like, every server is assigned a region of ids
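A minimal sketch of the sharded-counter option, assuming a coordinator has already handed each server a distinct `shard_index`. The class and its parameters are hypothetical, and a real deployment would also need persistence across restarts.

```python
import itertools

class ShardedCounter:
    """Hands out IDs from a private range [shard_index * size, (shard_index + 1) * size)."""

    def __init__(self, shard_index: int, shard_size: int):
        self._end = (shard_index + 1) * shard_size
        self._counter = itertools.count(shard_index * shard_size)

    def next_id(self) -> int:
        value = next(self._counter)
        if value >= self._end:
            # The range is used up; the server must ask the coordinator for another.
            raise RuntimeError("shard exhausted")
        return value

server_a = ShardedCounter(shard_index=0, shard_size=1000)
server_b = ShardedCounter(shard_index=1, shard_size=1000)
print(server_a.next_id(), server_b.next_id())
```

No two servers can collide as long as the coordinator never hands out the same `shard_index` twice.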

linuxhansl4mo ago

In this case it was distributed within our data centers (tens of thousands of machines or so at that time, spread around the planet), but the code to generate ids was 100% under our control. Rather than inventing some distributed generation (or collision detection), a stateless approach with random numbers just seemed the right choice.

manofmanysmiles4mo ago· 3 in thread

I'd propose using our current view of physical reality to own a subset of the UUID + version field if new physics is discovered.

10-20 bits: version/epoch

10-20 bits: cosmic region

40 bits: galaxy ID

40 bits: stellar/planetary address

64 bits: local timestamp

This avoids the potentially pathological long chain of provenance, and also encodes coordinates into it.

Every billion years or so it probably makes sense to re-partition.
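One way to picture the layout is as fixed-width bit fields packed into a single integer. This sketch picks the upper end of each proposed range (20 + 20 + 40 + 40 + 64 = 184 bits); the field names and the `pack`/`unpack` helpers are assumptions for illustration.

```python
FIELDS = [
    ("version", 20),
    ("region", 20),
    ("galaxy", 40),
    ("address", 40),
    ("timestamp", 64),
]

def pack(values: dict) -> int:
    """Concatenate the fields, most significant first, into one integer."""
    result = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        result = (result << width) | value
    return result

def unpack(packed: int) -> dict:
    """Split an integer built by pack back into its fields."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = packed & ((1 << width) - 1)
        packed >>= width
    return values

sample = {"version": 1, "region": 42, "galaxy": 123456, "address": 987654, "timestamp": 1_700_000_000}
print(hex(pack(sample)))
```

Putting the version field first means a future re-partition can change everything after it without ambiguity.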

rbanffy4mo ago

As for coordinates, don’t forget galaxies are clouds of stars flowing around and interacting with each other.

dylan6044mo ago

That's the problem with address type of systems is that they expect the object at that location to always be at that location. How do you encode the orbital speed, radius of orbit for not just the object, but also the object it is orbiting will need the same info as it is also in motion, then that object's parent galaxy's motion. Ugh, now I need a nap to calm down a bit.

skvmb4mo ago

  offset  length  field
  00      04      Version + Flags
  04      08      Timestamp (uint64)
  12      16      Node/Agent Hash
  28      16      Namespace Hash
  44      32      Random Entropy
  76      20      Extra / Extension
  96      32      Integrity Hash

Total: 128 bytes
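A sketch of how that 128-byte layout could be packed with Python's `struct` module. The comment does not say which hash functions or version values are intended, so BLAKE2b for the 16-byte hashes, SHA-256 over the first 96 bytes for the integrity field, and the version bytes are all assumptions, as is the `build_id` name.

```python
import hashlib
import os
import struct
import time

# First 96 bytes: little-endian, no padding between fields.
LAYOUT = struct.Struct("<4sQ16s16s32s20s")

def build_id(node: bytes, namespace: bytes) -> bytes:
    body = LAYOUT.pack(
        b"\x01\x00\x00\x00",                                   # version + flags (made-up values)
        time.time_ns(),                                         # timestamp as uint64
        hashlib.blake2b(node, digest_size=16).digest(),         # node/agent hash
        hashlib.blake2b(namespace, digest_size=16).digest(),    # namespace hash
        os.urandom(32),                                         # random entropy
        bytes(20),                                              # extension, zero-filled
    )
    # Integrity hash covers everything that precedes it.
    return body + hashlib.sha256(body).digest()

identifier = build_id(b"node-a", b"example-namespace")
print(len(identifier), identifier.hex()[:16])
```

At 1024 bits this is larger than the ~800 bits the article arrives at for purely random IDs, in exchange for self-describing fields.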

bluecoconut4mo ago· 2 in thread

Fun read.

One upside of the deterministic schemes is they include provenance/lineage. You can literally "trace up" the history back to the original ID giver.

Kinda has me curious about how much information is required to represent any arbitrary provenance tree/graph on a network of N-nodes/objects (entirely via the self-described ID)?

(Thinking out loud in the comment: in the worst case of a linear chain, if the full provenance should be recoverable from the id, the size scales as O(N x id_size), so it's quite bad. But assuming the "best case" (any node is expected to be log(N) steps from the root, i.e. a depth of log(N)), global_id_size = log(N) x local_id_size feels like roughly the optimal limit? So effectively the size of the global_id grows as log(N)^2? Would that mean that, starting from the 399 bit number, a lower limit for global_id_size with lineage would be something like (400 bit)^2 ~= 20 kB, because it carries the ordered local-id provenance information rather than relying on locally shared knowledge?)

AlotOfReading4mo ago

Two ways to frame it:

Provenance is a DAG, so you get a partial order for free by topological sort. That can be extended to a compatible total order. Then provenance for a node is just its ordering. This kind of mapping from objects to the first N consecutive naturals is also a minimal perfect hash function, which have n log n overhead. We can't navigate the tree to track ancestry, but equality implies identical ancestry.

Alternatively, we could track the whole history in somewhat more bits with a succinct encoding, 2N if it's a binary tree.

In practice, deterministic IDs usually accept a 2^-N collision risk to get log n.

montyanne4mo ago

ATProto, which underlies the Bluesky social network, is similar. It uses a content-addressed DAG.

Each “post” has a CID, which is a cryptographic hash of the data. To “prove” ownership of the post, there’s a witness hash that is sent that can be proved all the way up the tree to the repo root hash, which is signed with the root key.

Neat way of having data say “here’s the data, and if you care to verify it, here’s an MST”.

notepad0x904mo ago· 2 in thread

Use a deck of cards for representation: 52 digits, where 'K♠' for king-of-spades would be one character in Unicode. It isn't just cosmologically unique, it's easier to read, harder to edit manually, and easier for our pattern recognition to keep track of.

And best feature: anyone can generate a random id of such representation by getting a deck of cards and shuffling it properly. Playing cards are ubiquitous. I can see a camera "reading" the decks after they've been splayed on a table after a shuffle. This might even make for better random number seeds.

You're not sure if there is any demand for this sort of stuff? Look at dicekeys:

https://dicekeys.com/
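For scale, a perfectly shuffled 52-card deck is one of 52! orderings, which works out to roughly 225 bits of entropy, well short of the ~800 bits the article asks for. The sketch below computes that and produces a shuffled deck from the OS random source; the card notation is just one possible choice.

```python
import math
import secrets

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
DECK = [rank + suit for suit in "♠♥♦♣" for rank in RANKS]

def deck_entropy_bits() -> float:
    """Bits of entropy in a uniformly random ordering of the deck."""
    return math.log2(math.factorial(len(DECK)))

def shuffled_deck_id() -> list:
    """One deck ordering drawn with the operating system's random source."""
    deck = DECK[:]
    secrets.SystemRandom().shuffle(deck)
    return deck

print(f"{deck_entropy_bits():.2f} bits")
print(" ".join(shuffled_deck_id()))
```

A hand shuffle is unlikely to be uniform, so the real entropy of a physical deck would be lower than this upper bound.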

nkrisc4mo ago

> shuffling it properly.

I think you glossed over the big weakness in the idea.

notepad0x904mo ago

Is it harder than rolling a die?

stonegray4mo ago· 2 in thread

Specifying a CSPRNG as an entropy source to avoid collision is incorrect.

CSPRNGs make prediction of the next number difficult (cracking-AES difficulty) but do not add entropy and must be seeded uniquely otherwise they will output the same numbers. Unless the author is proposing having the same machine generate a single universe-scale list in one run.

Also “banning” ids that are all 1s or 0s is silly; they are just as valid and unique as any other number if you’re generating them properly. Although I might suggest purchasing a lottery ticket if you get a UUID with all settable bits as 1.
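The seeding point can be illustrated with a toy hash-counter generator. This is a sketch of the determinism only, not a vetted CSPRNG, and `toy_stream` is a made-up name.

```python
import hashlib

def toy_stream(seed: bytes, n_blocks: int) -> bytes:
    """Toy generator: SHA-256 of seed plus a counter. Output depends only on the seed."""
    out = b""
    for counter in range(n_blocks):
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
    return out

machine_a = toy_stream(b"same-seed", 2)
machine_b = toy_stream(b"same-seed", 2)
machine_c = toy_stream(b"other-seed", 2)

print(machine_a == machine_b)  # identical seeds give identical "random" IDs
print(machine_a == machine_c)
```

However unpredictable the output looks, two machines that start from the same seed emit the same IDs, so the uniqueness guarantee rests on the entropy of the seed.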

left-struck4mo ago

Banning 0s might be to avoid conflicts with testing? Kind of like how you’d want to block logins with emails that have a domain example.com. Idk I’m grasping at straws

nkrisc4mo ago

It’s good to have some known invalid identifiers. There are times when you want to use one that can’t possibly be valid. Having them be easily memorable and obviously invalid is good too.

Imagine if example.com was freely available for anyone to register, think of all the email they could get.

ktpsns4mo ago· 2 in thread

Quite offtopic, but: I found UUIDs being overused in many cases. People then abused them to store data, making them effectively "speaking IDs" or "multi column indices".

jmole4mo ago

Unless it's a key that needs to be sortable (e.g. insertion order) or a metric/descriptor of some kind, I'm not sure why UUID would be overused or inappropriate for use.

efitz4mo ago

Random UUIDs are not compressible. They are also frequently stored as 38-character strings.
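A quick way to see the size difference, using Python's standard `uuid` and `zlib` modules. The canonical hyphenated text form is 36 characters; the 38 mentioned above presumably counts surrounding braces, as in some GUID renderings, which is an assumption on my part.

```python
import uuid
import zlib

u = uuid.uuid4()

text_form = str(u)                     # canonical hyphenated hex
braced_form = "{" + text_form + "}"    # brace-wrapped rendering
raw_form = u.bytes                     # the underlying 128 bits
compressed = zlib.compress(raw_form, 9)

print(len(text_form), len(braced_form), len(raw_form), len(compressed))
```

Compressing the raw bytes makes them longer, since random data has no redundancy for the compressor to exploit.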

frikit4mo ago· 2 in thread

The best way to solve this is not to, and just giving up on the idea of identification.

If you have an infinite multiverse of infinite universes, and perhaps layers on that, with different physics, etc., you can’t have identity outside of all existence.

In Judaism, one/the name of God is translated as “I am”. I believe this is because God’s existence is all, transcending whatever concepts you have of existence or of IDs. That ID is the only ID.

So, the cosmic solution to IDs is the name of God.

mock-possum4mo ago

Which name of god though - there are hundreds, and we're right back at the same place of struggling to come up with a unique identifier.

roywiggins4mo ago

gotta be careful:

https://hex.ooo/library/nine_billion_names_of_god.html

j-pb4mo ago· 1 in thread

Great insights and visualisations!

I build a whole database around the idea of using the smallest plausible random identifiers, because that seems to be the only "golden disk" we have for universal communication, except for maybe some convergence property of latent spaces with large enough embodied foundation models.

It's weird that they are really under appreciated in the scientific data management and library science community, and many issues that require large organisations at the moment could just have been better identifiers.

To me the ship of Theseus question is about extrinsic (random / named) identifiers vs. intrinsic (hash / embedding) identifiers.

https://triblespace.github.io/triblespace-rs/deep-dive/ident...

https://triblespace.github.io/triblespace-rs/deep-dive/tribl...

ctoth4mo ago

Entity identity can be intrinsic. Why not consistency contracts?

rini174mo ago· 1 in thread

From real life we know that people prefer to have multiple anonymous IDs, or self-selected handles, either makes fully deterministic generation schemes moot.

Also, network routing requires objects that have multiple addresses.

Physics side of whole thing is funny too, afaik quantum particles require fungibility, i.e. by doxxing atoms you unavoidably change the behavior of the system.

pavel_lishin4mo ago

> From real life we know that people prefer to have multiple anonymous IDs

There's nothing stopping an entity from requesting multiple IDs from one of the "devices"!

factotvm4mo ago· 1 in thread

> In order to fix this, we might start sending out satellites in every direction

Minor correction: Satellites don't go in every direction; they orbit. Probes or spaceships are more appropriate terms.

fluoridation4mo ago

Maybe they meant at every inclination. ;)

QuiCasseRien4mo ago· 1 in thread

I really love everything related to Cosmology but I always struggle with two contrary concepts that lead to paradox (for me) :

- Infinity : from school, we learn our universe is infinite.

- We often do calculation with upper limit like this one : 10^240. This is a big number butttttt it's not infinite you know. 10^240+1, 10^240+2...

So :

1. if it's infinite, why doing upper limit calculation ?

2. if it's limited, what is there outside that limit ?

Extremely paradoxical

brainwad4mo ago

People say the universe is "infinite" because spacetime's curvature is, as far as we can tell, flat, and so it should continue in all directions without ever wrapping back on itself (unlike, say, the Earth, which has spherical curvature).

But practically it's finite because we are only in causal contact with things up to 13.7b ly from us, and given space appears to be expanding at an accelerating rate, we probably will never get into causal contact with (almost all of) the part of the infinite universe outside of our light cone, even though things ought to exist over the "horizon". So only a tiny infinitesimal sliver of the infinite universe is knowable by us.

philipwhiuk4mo ago· 1 in thread

Note that they almost immediately contract from 'the universe' to 'the visible universe', which isn't the same thing at all.

mr_mitm4mo ago

It's observable universe, and that's the only thing that matters. Events outside the observable universe are causally disconnected. We will never interact with anything outside the observable universe. For all practical purposes, it's the same thing.

m4nu3l4mo ago

A more realistic estimate of the total number of addressable things should take into account that for anything to be addressable, its address should be stored somewhere at least once.

If it takes at least Npb particles to store one bit of information, then the number of addressable things would decrease with the number of bits of the address.

So let's call Nthg the number of addressable things, and assume the average number of bits per address grows with Nb = f(Nthg).

Then the maximum number of addressable things is the number that satisfies Nthg = Np/(Npb*f(Nthg)), where Np is the total number of particles.
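A sketch of solving that fixed point numerically, under two assumptions that are not in the comment: f(N) = log2(N), the fewest bits that can distinguish N things, and Np = 10^80 with one particle per stored bit. The function name and bisection approach are just one way to do it.

```python
import math

def max_addressable(n_particles: float, particles_per_bit: float) -> float:
    """Solve N * particles_per_bit * log2(N) = n_particles by bisection on log2(N)."""
    target = math.log2(n_particles)
    lo, hi = 1.0, target  # bounds on log2(N)
    for _ in range(200):
        mid = (lo + hi) / 2
        # log2 of the particles needed to store 2**mid addresses of mid bits each
        cost = mid + math.log2(particles_per_bit * mid)
        if cost > target:
            hi = mid
        else:
            lo = mid
    return 2.0 ** lo

n_things = max_addressable(1e80, 1.0)
print(f"{n_things:.2e} addressable things, {math.log2(n_things):.1f} bits each")
```

With these inputs the storage requirement costs only a couple of orders of magnitude off the raw particle count.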

program_whiz4mo ago

Is it possible to construct an ID using some kind of centralized observable phenomena? Due to how time and distance distinguish things, would they always be unique? Like only one person will ever simultaneously observe stars in certain positions and intensities, color, etc. Similar to how I've heard some companies use lava lamps or other noisy processes to generate entropy.

I guess I'm wondering if there is a way to construct a universal coordinate frame for the whole universe? If so, then it's possible to trivially assign local time + x + y + z + salt to make unique ids.

moktonar4mo ago

The random uuid selection is far superior because of lifespan: you can only have so many functioning devices at the same time, and in contrast to tree-based uuids, once a device is decommissioned the uuid can be reclaimed. Practically though it would probably be a mixed algorithm where positioning would give the id root and the rest is selected randomly

vessenes4mo ago

Chiming in from the decentralized world - there’s an adversarial / cooperative dynamic in the assignment of these IDs - and the selection of parents, not discussed in the original. I think you could possibly get to sub linear by allowing a small number of cooperative nodes to assign new IDs.

On the contrary, having the right to assign IDs is powerful; on balance, to my mind the right thing to do is some sort of a ZK verifiable random function, e.g. sunspot-based transformations combined with some proof of ‘fair’ random choice. In that case, I think the 800 bit number seems like plenty. You could also do some sort of epoch-based variable length, where for the next billion years or so, we use 1/256 of the ID space, (forced first bit to 0), and so on.

kelseyfrog4mo ago

The Dewey section and Elias omega encoding was fun, but it reminded me of Project Xanadu's tumblers[1] - a variable length dotted notation where each segment is unbounded.

Tumblers are modeled using transfinite numbers which makes me wonder: what are the similarities and differences between transfinite numbers and Elias omega encoding? I'm not well versed in either, so I expect it's either a question from ignorance or I may have a lot to learn. :)

1. https://www.artandpopularculture.com/Tumbler_%28Project_Xana...
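For readers who have not met it, here is a small sketch of Elias omega coding as it is usually defined: each positive integer is written in binary, prefixed recursively by the length of that binary form minus one, and terminated by a single 0. This is the standard construction, not necessarily the exact variant the article uses.

```python
def elias_omega_encode(n: int) -> str:
    """Encode a positive integer as an Elias omega bit string."""
    if n < 1:
        raise ValueError("Elias omega codes positive integers only")
    code = "0"
    while n > 1:
        binary = bin(n)[2:]
        code = binary + code
        n = len(binary) - 1
    return code

def elias_omega_decode(bits: str) -> int:
    """Decode one Elias omega codeword from the start of a bit string."""
    n, pos = 1, 0
    while bits[pos] == "1":
        width = n + 1
        n = int(bits[pos:pos + width], 2)
        pos += width
    return n

for value in (1, 4, 16):
    print(value, elias_omega_encode(value))
```

Because every codeword is self-delimiting, segments of any size can be concatenated without separators, which is the property it shares with an unbounded dotted notation.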

cuttothechase4mo ago

One could take anything like a cell and split it into genes, molecules, atoms, sub-atomic wave functions (with infinite value range) and take time which can be split into another infinite entity say even within a finite interval. How does this analysis account for that?

I could split this object into 10^500 or 10^50^500^5000 etc., with imagination being the limit.

These values Id'd at whatever imaginable resolution are far from practically useful but at a cosmic scale, there is no telling what is a useful value?

So this framework seems to be more limiting because we define a resolution ?

MagicMoonlight4mo ago

The obvious solution is a system like IP addresses. Every system has an address like universe.galaxy.region.system or whatever, then the system is subdivided in whatever way is logical for that system.

That way you can route ships or data or whatever to a specific system in a logical way. Each system decides how to allocate addresses. Since most systems won’t have anything or anyone to care, something like NASA or registrars would just allocate a block to the system and give large things like planets an address.

efitz4mo ago

I’m going to vibe code an app that lets you register a computronium unique id (is that name taken?) I’ll corner the market.

I’m also going to devise a standard that arbitrarily breaks it into groups of hexadecimal digits of arbitrary length in the spirit of UUIDs, and reserve a prefix space for Planck-unit timestamps (computronium-ID-7) so that you can lexicographically sort your COMPID7s.

Man I got to get out in front of this.

small_model4mo ago

We will probably end up with something like each planet has its own local addressing, and the big router in the sky does NAT, each solar system has a router and so on.

WhitneyLand4mo ago

800 bits is an incomprehensible number of possibilities…yet tiny in comparison to the number of .gifs that could be drawn.

tenthirtyam4mo ago

Hmm. There might be 10^80 atoms in the universe, however there are 2^(10^80) possible combinations, more than 2^800.

dietsche4mo ago

but can you have an id for every id?

let_tim_cook_4mo ago

"372 bits for 1-gram nanobots"... smh, this is why people call us nerds

eudamoniac4mo ago

I was going to read this, but it starts with an AI slop header image for no purpose, so I intuited that the article was similarly ill constructed.

dvh4mo ago

Another blow to the "all electrons are the same electron" theory. Why have only 1 electron with so many possible ids /s

qewartysuc4mo ago

xhxxhxhxhxhxhxhx

j / k navigate · click thread line to collapse

151 comments

102 comments · 32 top-level


exfalso4mo ago

There's a fun hypothesis I've read about somewhere, goes something like this:

This means a sort of "grand meeting of alien advanced cultures" before the heat death. Which in turn also means that previously uncollided UUIDs may start to collide.

Those damned Vogons thrashing all our stats with their gazillion documents. Why do they have a UUID for each xml tag??

jobigoud4mo ago

1 more reply

zimzam4mo ago

A galaxy has enough resources to be self-reliant, there’s no need for a species to escape one that is getting too far away from another one.

3 more replies

chamomeal4mo ago

I think I sense a strange Battle Royale type game…

kbelder4mo ago

Assuming these are advanced enough aliens, they'll also be bringing with them all the mass they can, to accentuate the effect? I'm imagining things like Niven's ringworld star propulsion.

u1hcw9nx4mo ago

You must consider both time and locality.

From now until protons decay and matter does not exist anymore is only 10^56 nanoseconds.
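As a rough check on what that time bound means for ID size, here is a minimal sketch that counts how many bits a nanosecond-resolution timestamp would need to cover the whole span. It takes the 10^56 ns figure above at face value; the helper name `bits_for_count` is made up for illustration.

```python
# Assumption: the 10**56 nanoseconds figure is taken directly from the comment above.
TOTAL_NANOSECONDS = 10**56


def bits_for_count(n: int) -> int:
    """Smallest number of bits that can label n distinct values (0 .. n-1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # (n - 1) is the largest label; bit_length() is exact for Python ints.
    return max(1, (n - 1).bit_length())


if __name__ == "__main__":
    print(bits_for_count(TOTAL_NANOSECONDS))
```

Under that assumption the timestamp alone takes 187 bits, before adding anything for locality or randomness.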

Sharlin4mo ago

If protons decay. There isn't really any reason to believe they're not stable.

2 more replies

dheera4mo ago

Protons (and mass and energy) could also potentially be created. If this happens, the heat death could be avoided.

adrianN4mo ago

All of physics is „just“ based on empirical observation. It’s still a pretty good tool for prediction.

dinosaurdynasty4mo ago

Conservation laws result from continuous symmetries in the laws of physics, as proven by Noether's theorem.

1 more reply

Etheryte4mo ago

That's such an odd way to use units. Why would you do 10^56 * 10^-9 seconds?

2 more replies

rubyn00bie4mo ago

2 more replies

rbanffy4mo ago

If we think of the many worlds interpretation, how many universes will we be making every time we assign a CCUID to something?

3 more replies

scotty794mo ago

Proton decay is hypothetical.

1 more reply

jl64mo ago

missingdays4mo ago

Sorry, your laptop was produced before FTL was invented, so its MAC address is only recognized in the Milky Way sector.

SkyBelow4mo ago

RobotToaster4mo ago

lisper4mo ago

I have to confess I have not actually done the math.

9dev4mo ago

Oh no! We should immediately commence work on a new UUID version that addresses this use case.

svnt4mo ago

Pedantry ftw.

lisper4mo ago

:-)

zeckalpha4mo ago

ctoth4mo ago

Hanson's Grabby Aliens actually fits really well here if you're looking for some math to base off of.

quijoteuniv4mo ago

The answer is 42. Have it from good source!

kmoser4mo ago· 10 in thread

wavemode4mo ago

> unique IDs to each subatomic particle

You only need one ID for each type of particle. Since the laws of physics dictate that the particles themselves are indistinguishable from each other.

kmoser4mo ago

Drakim4mo ago

Don't they have different x,y,z positions?

2 more replies

kbelder4mo ago

kmoser4mo ago

If we're talking reality, sure. But you can also consider it a Gedankenexperiment.

Dylan168074mo ago

> This doesn't take into account that you will inevitably want to assign unique IDs to various groups of atoms (e.g. this microchip, that car, etc.).

Sure it does. Those are not going to add up to a single extra bit.

kmoser4mo ago

And this isn't even counting sets that include multiples of the same item; once you get into that territory, there really is no upper bound.

2 more replies

NoMoreNicksLeft4mo ago

>And don't even get me started on assigning unique IDs to each subatomic particle.

If a neutrino oscillates between flavors, does it get 3 IDs? Or does it get a new ID with each oscillation?

Thankfully, we only need one electron ID at all.

liamwire4mo ago

For the uninitiated: https://en.wikipedia.org/wiki/One-electron_universe

nivertech4mo ago

So UUIDv∞ will be at least 536 bit long?

And with group IDs, timestamp, etc. - 1024 bit long?

ekipan4mo ago· 8 in thread

I forget the context but the other day I also learned about Snowflake IDs [1] that are apparently used by Twitter, Discord, Instagram, and Mastodon.

[1]: https://en.wikipedia.org/wiki/Snowflake_ID

[2]: https://youtu.be/gocwRvLhDf8
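For anyone who hasn't seen one, a Snowflake ID as usually described is a 64-bit integer with a millisecond timestamp in the high bits, then a machine id, then a per-millisecond sequence number. A minimal sketch of that packing, assuming the commonly cited 41/10/12 bit split; `EPOCH_MS` and the function names are invented for this example and not taken from any particular service.

```python
# Assumed layout: 41-bit millisecond timestamp, 10-bit worker id, 12-bit sequence.
# EPOCH_MS is an arbitrary custom epoch picked for this sketch.
EPOCH_MS = 1_600_000_000_000
TIMESTAMP_BITS = 41
WORKER_BITS = 10
SEQUENCE_BITS = 12


def pack_snowflake(timestamp_ms: int, worker_id: int, sequence: int) -> int:
    """Pack the three fields into one integer, timestamp in the high bits."""
    elapsed = timestamp_ms - EPOCH_MS
    if not 0 <= elapsed < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp outside the representable range")
    if not 0 <= worker_id < (1 << WORKER_BITS):
        raise ValueError("worker_id does not fit in 10 bits")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence does not fit in 12 bits")
    return (elapsed << (WORKER_BITS + SEQUENCE_BITS)) | (worker_id << SEQUENCE_BITS) | sequence


def unpack_snowflake(value: int) -> tuple:
    """Inverse of pack_snowflake: returns (timestamp_ms, worker_id, sequence)."""
    sequence = value & ((1 << SEQUENCE_BITS) - 1)
    worker_id = (value >> SEQUENCE_BITS) & ((1 << WORKER_BITS) - 1)
    elapsed = value >> (WORKER_BITS + SEQUENCE_BITS)
    return elapsed + EPOCH_MS, worker_id, sequence
```

Because the timestamp sits in the high bits, IDs sort roughly by creation time, which is the property the scheme is usually chosen for.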

buzzerbetrayed4mo ago

Getting the entire universe to agree on a single clock for creating timestamps sounds absurdly difficult. Probably impossible?

ekipan4mo ago

"Agreement" of time is probably nonsense, yeah. I realized after posting so I edited in the parenthetical, but as [3] notes, locality probably makes this less of a real issue.

Apparently with the birthday paradox 32 bit random IDs only allow some tens of thousands per second before collision chance passes 50%. Maybe that's acceptable?

[3]: https://news.ycombinator.com/item?id=47065241

Zambyte4mo ago

You don't need the universe to agree. You need your ID system to agree within a reasonable margin of error.

speakeron4mo ago

The temperature of the cosmic microwave background can be used as a universal clock.

1 more reply

UltraSane4mo ago

Neutron star spins collectively can be used as a pretty accurate clock.

1 more reply

swiftcoder4mo ago

> [1]: https://en.wikipedia.org/wiki/Snowflake_ID

Isn't this just the same scheme as version 1 UUID, except with half the bits? I guess they didn't want to dedicate 128 bits to their IDs.

bricss4mo ago

There's also DRUUID: https://gist.github.com/bricss/53e5babedf44bf3ab71334c15102e...

drchickensalad4mo ago

That also looks like the widely used BSON ids, to anyone else interested

adityaathalye4mo ago· 4 in thread

Just past page 281 of Becky Chambers's delightful "the galaxy, and the ground within".

  Received Message
  Encryption: 0
  From: GC Transit Authority --- Gora System (path: 487-45411-479-4)
  To: Ooli Oht Ouloo (path: 5787-598-66)
  Subject: URGENT UPDATE

Man I love the series.

Looks like this multispecies universe has centrally-agreed-upon path addressing system.

pavel_lishin4mo ago

You should check out Vernor Vinge's A Fire Upon The Deep for more fun examples of how intra-galactic communication would be labeled, with routes & such.

adityaathalye4mo ago

In fact, it is right here in my stack of to-reads! Picking it up now due to your recommendation. Cheers!

Octoth0rpe4mo ago

adityaathalye4mo ago

linuxhansl4mo ago· 3 in thread

Heh. I once had to make an argument that 256 bit randomly assigned identifiers are good enough without explicit collision checking. People wanted me to add complex and expensive collision checks.

I thought my argument was clever, but this is so much better. :)

da_chicken4mo ago

Nah, it's much easier than that.

The birthday paradox says that you'll reach a 50% chance of at least one collision (as a conservative first order approximation) at the square root of the domain size. sqrt(2^256) is 2^128.

Because you're never getting anywhere near the square root of the size, the chances of any collision occurring are astronomically small.
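The square-root rule can be checked numerically. This is a minimal sketch using the standard birthday approximation P ≈ 1 - exp(-n(n-1)/(2d)) for n random IDs drawn from a space of d = 2^bits values; the function names are made up for illustration.

```python
import math


def collision_probability(num_ids: int, bits: int) -> float:
    """Approximate chance of at least one collision among num_ids random IDs."""
    space = 2.0 ** bits
    exponent = num_ids * (num_ids - 1) / (2.0 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-exponent)


def ids_for_probability(p: float, bits: int) -> float:
    """Approximate number of random IDs at which collision chance reaches p."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


if __name__ == "__main__":
    print(ids_for_probability(0.5, 32))        # tens of thousands for 32-bit IDs
    print(collision_probability(10**18, 256))  # a quintillion 256-bit IDs
```

The 50% point sits at roughly 1.18 times the square root of the space, so "square root of the domain size" is a fair first-order figure.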

nextaccountic4mo ago

If the mechanism for generating those 256 random bits is distributed and untrusted parties generate ids, then you need collision detection because they may be malicious

If it's not distributed you can just have a counter

If it's distributed but coordinated by a single party (say, it's your servers), you can do sharding on incremented counters. Like, every server is assigned a region of ids
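The sharded-counter case can be sketched in a few lines. This assumes a coordinator has already handed each server a distinct `shard_index`; the class name and the 16/48 bit split are arbitrary choices for the example.

```python
import itertools


class ShardedIdAllocator:
    """Hands out IDs from one server's private slice of a shared ID space."""

    def __init__(self, shard_index: int, shard_bits: int = 16, counter_bits: int = 48):
        if not 0 <= shard_index < (1 << shard_bits):
            raise ValueError("shard_index does not fit in shard_bits")
        # The shard index occupies the high bits, so ranges never overlap.
        self._prefix = shard_index << counter_bits
        self._limit = 1 << counter_bits
        self._counter = itertools.count()

    def next_id(self) -> int:
        value = next(self._counter)
        if value >= self._limit:
            raise OverflowError("this shard's range is exhausted")
        return self._prefix | value
```

No collision check is needed as long as shard indexes are unique, which is exactly the part the coordinating party has to guarantee.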

linuxhansl4mo ago

manofmanysmiles4mo ago· 3 in thread

I'd propose using our current view of physical reality to own a subset of the UUID + version field if new physics is discovered.

10-20 bits: version/epoch

10-20 bits: cosmic region

40 bits: galaxy ID

40 bits: stellar/planetary address

64 bits: local timestamp

This avoids the potentially pathological long chain of provenance, and also encodes coordinates into it.

Every billion years or so it probably makes sense to re-partition.
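A layout like this is straightforward to pack into a single integer. A minimal sketch, assuming 16 bits for each of the two "10-20 bit" fields (the proposal leaves those widths open); the field names are shorthand invented for the example.

```python
# Assumed widths: the two "10-20 bits" fields are fixed at 16 bits here.
FIELDS = [
    ("version", 16),
    ("region", 16),
    ("galaxy", 40),
    ("address", 40),
    ("timestamp", 64),
]
TOTAL_BITS = sum(width for _, width in FIELDS)


def pack_id(values: dict) -> int:
    """Concatenate the fields, first field in the most significant bits."""
    packed = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        packed = (packed << width) | value
    return packed


def unpack_id(packed: int) -> dict:
    """Inverse of pack_id."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = packed & ((1 << width) - 1)
        packed >>= width
    return values
```

With these widths the whole ID is 176 bits, and putting the version first means a re-partition only has to bump one field.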

rbanffy4mo ago

As for coordinates, don’t forget galaxies are clouds of stars flowing around and interacting with each other.

dylan6044mo ago

1 more reply

skvmb4mo ago

offset length (bytes)

  00     04:    Version + Flags
  04     08:    Timestamp (uint64)
  12     16:    Node/Agent Hash
  28     16:    Namespace Hash
  44     32:    Random Entropy
  76     20:    Extra / Extension
  96     32:    Integrity Hash

Total: 128 bytes
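That layout maps directly onto a fixed-size binary struct. A minimal sketch, with several assumptions the table does not specify: big-endian byte order, BLAKE2b truncated to 16 bytes for the two hash fields, and SHA-256 over the first 96 bytes as the integrity hash. All function names are invented for the example.

```python
import hashlib
import secrets
import struct
import time

# 4 + 8 + 16 + 16 + 32 + 20 = 96 bytes, big-endian, no padding (assumed).
BODY = struct.Struct(">IQ16s16s32s20s")


def short_hash(name: str) -> bytes:
    """16-byte hash for the node and namespace fields (BLAKE2b is an assumption)."""
    return hashlib.blake2b(name.encode("utf-8"), digest_size=16).digest()


def build_id(node: str, namespace: str, version_flags: int = 1,
             timestamp: int = None, extension: bytes = b"") -> bytes:
    """Assemble the 128-byte ID: 96-byte body followed by a 32-byte SHA-256."""
    if timestamp is None:
        timestamp = time.time_ns()
    body = BODY.pack(version_flags, timestamp, short_hash(node),
                     short_hash(namespace), secrets.token_bytes(32), extension)
    return body + hashlib.sha256(body).digest()


def verify_id(raw: bytes) -> bool:
    """Check the length and that the trailing hash matches the body."""
    return len(raw) == 128 and hashlib.sha256(raw[:96]).digest() == raw[96:]
```

The integrity hash only catches accidental corruption; anyone can recompute it, so it says nothing about who issued the ID.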

bluecoconut4mo ago· 2 in thread

Fun read.

One upside of the deterministic schemes is they include provenance/lineage. Can literally "trace up" the history back to the original ID giver.

Kinda has me curious about how much information is required to represent any arbitrary provenance tree/graph on a network of N-nodes/objects (entirely via the self-described ID)?

AlotOfReading4mo ago

Two ways to frame it:

Alternatively, we could track the whole history in somewhat more bits with a succinct encoding, 2N if it's a binary tree.

In practice, deterministic IDs usually accept a 2^-N collision risk to get log n.

montyanne4mo ago

The ATProto underlying BlueSky social network is similar. It uses a content-addressed DAG.

Neat way of having data say “here’s the data, and if you care to verify it, here’s an MST”.

notepad0x904mo ago· 2 in thread

You're not sure if there is any demand for this sort of stuff? Look at dicekeys:

https://dicekeys.com/

nkrisc4mo ago

> shuffling it properly.

I think you glossed over the big weakness in the idea.

notepad0x904mo ago

is it harder than rolling a dice?

stonegray4mo ago· 2 in thread

Specifying a CSPRNG as an entropy source to avoid collision is incorrect.

left-struck4mo ago

Banning 0s might be to avoid conflicts with testing? Kind of like how you’d want to block logins with emails that have a domain example.com. Idk I’m grasping at straws

nkrisc4mo ago

It’s good to have some known invalid identifiers. There are times where you want to use one that can’t possibly be valid. Having them be easily memorable and obviously invalid is good too.

Imagine if example.com was freely available for anyone to register, think of all the email they could get.

ktpsns4mo ago· 2 in thread

Quite offtopic, but: I found UUIDs being overused in many cases. People then abused them to store data, making them effectively "speaking IDs" or "multi column indices".

jmole4mo ago

Unless it's a key that needs to be sortable (e.g. insertion order) or a metric/descriptor of some kind, I'm not sure why UUID would be overused or inappropriate for use.

efitz4mo ago

Random UUIDs are not compressible. They are also frequently stored as 38-character strings.

frikit4mo ago· 2 in thread

The best way to solve this is not to, and just giving up on the idea of identification.

If you have an infinite multiverse of infinite universes, and perhaps layers on that, with different physics, etc., you can’t have identity outside of all existence.

So, the cosmic solution to IDs is the name of God.

mock-possum4mo ago

which name of god though - there are hundreds, and we're right back at the same place of struggling to come up with a unique identifier.

roywiggins4mo ago

gotta be careful:

https://hex.ooo/library/nine_billion_names_of_god.html

1 more reply

j-pb4mo ago· 1 in thread

Great insights and visualisations!

To me the ship of Theseus question is about extrinsic (random / named) identifiers vs. intrinsic (hash / embedding) identifiers.

https://triblespace.github.io/triblespace-rs/deep-dive/ident...

https://triblespace.github.io/triblespace-rs/deep-dive/tribl...

ctoth4mo ago

Entity identity can be intrinsic. Why not consistency contracts?

rini174mo ago· 1 in thread

From real life we know that people prefer to have multiple anonymous IDs, or self-selected handles, either makes fully deterministic generation schemes moot.

Also, network routing requires objects that have multiple addresses.

Physics side of whole thing is funny too, afaik quantum particles require fungibility, i.e. by doxxing atoms you unavoidably change the behavior of the system.

pavel_lishin4mo ago

> From real life we know that people prefer to have multiple anonymous IDs

There's nothing stopping an entity from requesting multiple IDs from one of the "devices"!

factotvm4mo ago· 1 in thread

> In order to fix this, we might start sending out satellites in every direction

Minor correction: Satellites don't go in every direction; they orbit. Probes or spaceships are more appropriate terms.

fluoridation4mo ago

Maybe they meant at every inclination. ;)

QuiCasseRien4mo ago· 1 in thread

I really love everything related to cosmology, but I always struggle with two contrary concepts that lead to a paradox (for me):

- Infinity: from school, we learn our universe is infinite.

- We often do calculations with an upper limit like this one: 10^240. This is a big number butttttt it's not infinite, you know. 10^240+1, 10^240+2...

So:

1. If it's infinite, why do upper-limit calculations?

2. If it's limited, what is there outside that limit?

Extremely paradoxical.

brainwad4mo ago

philipwhiuk4mo ago· 1 in thread

Note that they almost immediately contract from 'the universe' to 'the visible universe', which isn't the same thing at all.

mr_mitm4mo ago

m4nu3l4mo ago

A more realistic estimate of the total number of addressable things should take into account that for anything to be addressable, its address should be stored somewhere at least once.

If it takes at least Npb particles to store one bit of information, then the number of addressable things would decrease with the number of bits of the address.

So let's call Nthg the number of addressable things, and assume the average number of bits per address grows with Nb = f(Nthg).

Then the maximum number of addressable things is the number that satisfies Nthg = Np/(Npb*f(Nthg)), where Np is the total number of particles.
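To put a number on this, here is a minimal sketch that solves the self-consistency condition numerically. It assumes f is the minimal address length, f(N) = log2(N), which the comment does not specify, and it uses bisection on a geometric scale; the function name is made up for illustration.

```python
import math


def max_addressable(total_particles: float, particles_per_bit: float) -> float:
    """Solve N * particles_per_bit * log2(N) = total_particles for N.

    Assumes f(N) = log2(N), the shortest address that can tell N things apart.
    """
    ratio = total_particles / particles_per_bit
    if ratio < 2.0:
        raise ValueError("not enough particles to store even one address")
    low, high = 1.0, ratio  # N * log2(N) is increasing on this interval
    for _ in range(200):
        mid = math.sqrt(low * high)  # geometric midpoint copes with huge ranges
        if mid * math.log2(mid) < ratio:
            low = mid
        else:
            high = mid
    return low


if __name__ == "__main__":
    # 10**80 particles, one particle per stored bit (both are assumptions).
    print(max_addressable(1e80, 1.0))
```

With 10^80 particles and one particle per bit, the result lands a little over two orders of magnitude below the particle count, since each address costs a few hundred bits to store.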

program_whiz4mo ago

moktonar4mo ago

vessenes4mo ago

kelseyfrog4mo ago

The Dewey section and Elias omega encoding was fun, but it reminded me of Project Xanadu's tumblers[1] - a variable length dotted notation where each segment is unbounded.

1. https://www.artandpopularculture.com/Tumbler_%28Project_Xana...
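For anyone curious how unbounded segments can be written without a separator, here is a minimal sketch of Elias omega coding applied to a dotted path. The encoder and decoder follow the standard definition of the code; treating a path as a plain concatenation of coded segments is an assumption made for the example, and segments must be at least 1.

```python
def elias_omega_encode(n: int) -> str:
    """Encode a positive integer as an Elias omega bit string."""
    if n < 1:
        raise ValueError("Elias omega only encodes integers >= 1")
    code = "0"  # terminating bit
    while n > 1:
        binary = bin(n)[2:]
        code = binary + code  # prepend this group
        n = len(binary) - 1   # next group encodes this group's length minus one
    return code


def elias_omega_decode(bits: str, pos: int = 0):
    """Decode one integer starting at pos; returns (value, next position)."""
    n = 1
    while bits[pos] == "1":
        length = n + 1
        n = int(bits[pos:pos + length], 2)
        pos += length
    return n, pos + 1  # skip the terminating 0


def encode_path(segments) -> str:
    """Concatenate coded segments; the code is self-delimiting."""
    return "".join(elias_omega_encode(s) for s in segments)


def decode_path(bits: str) -> list:
    segments, pos = [], 0
    while pos < len(bits):
        value, pos = elias_omega_decode(bits, pos)
        segments.append(value)
    return segments
```

Small segments cost only a few bits while arbitrarily large ones still fit, which is the same trade-off the tumbler notation makes with its dots.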

cuttothechase4mo ago

I could split this object into 10^500 or 10^50^500^5000 etc., with imagination being the limit.

These values Id'd at whatever imaginable resolution are far from practically useful but at a cosmic scale, there is no telling what is a useful value?

So this framework seems to be more limiting because we define a resolution ?

MagicMoonlight4mo ago
