Sequence numbers are per-thread, and worker numbers are chosen at startup via ZooKeeper (though that's overridable via a config file).
https://blog.twitter.com/engineering/en_us/a/2010/announcing...
It would be nice to see ULID recognized by a standards body other than itself, though unlike the UUID versions it doesn't strictly need to be. New UUID versions must remain backward and forward compatible with the existing standards documents "universally", whereas ULID is designed to be entirely self-contained.
This is a bad approach. If you know one ULID, you can deduce the next one with high probability. Don't use it.
Why with high probability? Because generating several IDs within the same millisecond is an extremely common case when you're doing batching.
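A toy sketch (class and field names are mine, not from any ULID library) of why increment-based monotonicity is guessable: within one millisecond, the next ID is exactly the previous one plus one.

```java
// Hypothetical sketch of a ULID-style generator that handles
// same-millisecond collisions by incrementing the random part
// (the approach criticized above). Only the 64-bit random half
// is modeled, for simplicity.
import java.security.SecureRandom;

public class IncrementingUlid {
    private static final SecureRandom RNG = new SecureRandom();
    private long lastMs = -1;
    private long lastRandom;

    // Returns the random half of the next ID for a given timestamp.
    public long next(long nowMs) {
        if (nowMs == lastMs) {
            lastRandom += 1; // predictable: next ID = previous ID + 1
        } else {
            lastMs = nowMs;
            lastRandom = RNG.nextLong();
        }
        return lastRandom;
    }

    public static void main(String[] args) {
        IncrementingUlid gen = new IncrementingUlid();
        long first = gen.next(1_700_000_000_000L);
        // An attacker who sees `first` can guess the next ID issued
        // in the same millisecond: it is exactly first + 1.
        long second = gen.next(1_700_000_000_000L);
        System.out.println(second == first + 1); // prints "true"
    }
}
```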
I recently gave this some thought and actually implemented several algorithms and compared them against each other. I don't care about standards, sorry (I think "standard UUID" is an oxymoron; a UUID is 128 bits and that's about it).
So the best approach I've found is:
The first 48 bits are unix milliseconds.
6 bits go to the version/variant (if you don't care about standards, you can use those bits for extra randomness).
Then you have 12 more bits in the first 64 bits. There are two approaches to using them:
either use them as a sub-millisecond fraction derived from nanoseconds (nanoseconds_part * 4096 / 1_000_000),
or use them as a counter when several IDs are generated within the same millisecond, which allows for 4096 values per millisecond. The counter approach should be used when you can't access a nanosecond timer, as in browser JavaScript.
Then you have 2-3 bits for the variant and the remaining 61-62 bits for pure randomness, or just use all 64 bits for randomness. That is enough for security.
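A minimal sketch of the layout described above (names are mine; the 4 bits where the UUID version would normally sit are filled with randomness, as the comment suggests for those who don't care about standards):

```java
// Hypothetical sketch: high 64 bits = 48-bit unix millis,
// 12-bit sub-millisecond fraction, 4 random bits in the slot
// where the UUID version bits would go. Low 64 bits: randomness.
import java.security.SecureRandom;

public final class AscendingId {
    private static final SecureRandom RNG = new SecureRandom();

    // Maps 0..999_999 nanoseconds within the millisecond onto 0..4095.
    static long fraction(long nanosWithinMilli) {
        return nanosWithinMilli * 4096 / 1_000_000;
    }

    // High 64 bits of the ID.
    static long high(long unixMillis, long nanosWithinMilli) {
        return (unixMillis << 16)
             | (fraction(nanosWithinMilli) << 4)
             | (RNG.nextLong() & 0xF);
    }

    // Low 64 bits: pure randomness, enough for security per the comment.
    static long low() {
        return RNG.nextLong();
    }
}
```

Because the timestamp occupies the most significant bits, comparing the high words as unsigned 64-bit integers sorts IDs by creation time down to roughly a quarter of a microsecond.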
If you need to generate more than one ID per quarter microsecond (approach one) or more than 4096 IDs per millisecond (approach two), you can just keep regenerating the random part until it's greater than the previously generated one. This slightly reduces randomness, but not by much.
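A sketch of that fallback (again, names are mine): redraw the random part until it exceeds the last issued value. Order stays ascending, but unlike a plain increment, the step size is unguessable.

```java
// Hypothetical sketch of the redraw-until-greater strategy.
// In a real generator, `last` would be reset whenever the
// timestamp portion of the ID advances.
import java.security.SecureRandom;

public final class MonotonicRandom {
    private static final SecureRandom RNG = new SecureRandom();
    private long last = Long.MIN_VALUE;

    // Returns a random value strictly greater than the previous one.
    long next() {
        long candidate;
        do {
            candidate = RNG.nextLong();
        } while (candidate <= last); // slight entropy loss, as noted above
        last = candidate;
        return candidate;
    }
}
```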
I'm pretty sure this is the best approach. It allows for high generation speed (around 10,000,000 IDs/second with the nanosecond approach in unoptimized Java) and good security.
If you insist on using the ULID approach, I suggest applying the technique above: don't just increment the random bits; generate random bits until you get a higher value.
Of course, one should only use this ascending UUID thing (that's what I call it) when you need it. Random UUIDs should be the default; use ascending UUIDs only when they serve as primary keys in an RDBMS.
And of course, keep in mind that you're leaking the generation timestamp, which might be a bad thing.
(Though yes, anyone opting into that behavior should beware that it reduces the entropy strength of generated IDs.)
Obviously, applications exist where you do need a total order of IDs/events, and ULID may not be the best choice for those; but don't underestimate the usage scenarios for partially ordered, stable sorts.
ULIDs have a single, consistent sort that matches both byte patterns and string representation. That's a huge semantic difference.
Sure, ULIDs make no claims to accurate sorting or total ordering or monotonicity beyond a single machine, but ULIDs aren't designed to be a Snowflake/Thrift replacement; they are designed to be a UUID replacement. You are correct that they make no more guarantees than UUIDs, but they don't have to: that was out of scope of their design. I can understand how that makes it less useful for some of your applications, but that doesn't make it not useful for all sorts of applications. (Including many applications that once used UUIDs successfully but want something with a cleaner string representation and fewer cross-platform sorting headaches.)
Giving the appearance of being sortable to those who are less familiar with how they are generated is potentially dangerous and misleading. And whilst it is true that these IDs will consistently sort in the same order, that is equally true of standard GUIDs etc. - the difference being that the latter does not lead people to believe the order has inherent meaning, which the former does.
It's a little similar to how the designers of Go noticed that people were relying on the ordering of keys in maps matching the order items were added. So range iteration over keys was specifically changed to start from a 'random' point in the sequence (not truly random, but enough to stop people relying on it). They understood that the appearance of consistency without the fact of consistency leads to errors.
They have their uses I'm sure, they just need to be carefully considered and clearly understood uses.