I suppose it’s the difference between exact time and social/political time (there is probably a better term for that).
I worked at an insurance startup that stored coverage period start/ends as timestamps, which ended up being semantically wrong. The coverage ends at midnight of the next year, wherever _you_ (technically the provider, I think) are. So there is no one single instant in time when coverage ends; a timestamp is the wrong data type for representing this.
A 10am PDT America/Los_Angeles meeting is different from a 6pm BST Europe/London meeting which is different from a 5pm GMT meeting and those will jump around by an hour depending on time changes. If the bulk of your team is in Seattle then you probably want to pick the US timezone for the meeting and not GMT or London. If you have a manager in London, though, who sets up the meeting time in their local timezone then Seattle employees will notice tomorrow that the meeting is an hour later in Seattle than it normally is.
If everyone shitcanned clocks jumping around twice a year this craziness would end.
Some options:
- local time
- legal time
- calendar time
TZ rules are one way, you can derive zoned time reliably from utc, but not the other way around unless you already know the offset. That’s why seemingly storing utc is best, because you can apply whatever tz rule you want. But, because tz rules are decided by politicians (dst, ramadan, date line shifts, it gets pretty weird, …) and often with little advance notice, tz rule databases are often wrong in practice, which means storing only utc and getting zoned time back out of it reliably is hard. That’s why storing iso timestamps with offsets is best. It preserves both utc and intended local (zoned) time at the time of storing, even if the tz database then changes.
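To make the last point concrete, here is a minimal Python sketch using only the standard library. The timestamp value is just an illustration. Parsing an ISO 8601 string that carries an offset gives back both the UTC instant and the wall clock reading that was intended when it was stored, without consulting any tz database.

```python
from datetime import datetime, timezone

# An ISO 8601 timestamp stored together with the offset in force when it was written.
stored = "2022-03-13T08:33:26-06:00"

parsed = datetime.fromisoformat(stored)

# The instant in UTC is recoverable from the string alone.
as_utc = parsed.astimezone(timezone.utc)

# The intended local (wall clock) reading is preserved as well.
as_wall_clock = parsed.replace(tzinfo=None)

print(as_utc.isoformat())         # 2022-03-13T14:33:26+00:00
print(as_wall_clock.isoformat())  # 2022-03-13T08:33:26
```

The offset only records what was true at the time of storing. It does not say which zone's rules apply afterwards.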
But this is not the author’s problem. They’re dealing with wall clock time. They want to store 9 am and have it always be 9 am, no matter what happens with time zones. That is easy: store the timestamp without tz info. But, the hard part is knowing the instant in time (in coordinated time). Wall clock times may occur twice, so there is no single reliable way to get utc out of it, and even when calculating utc the result can be wrong or become wrong due to the political insanity that is zoned time so it is dangerous to store it and rely on it. You see this problem with anything that schedules people or resources. Something scheduled at 9 am is always 9 am, but if the system needs to send a reminder at 9 am, when do you schedule it? There are no solutions I’m aware of without edge cases.
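A small Python sketch of the "wall clock times may occur twice" problem, assuming Python 3.9+ with `zoneinfo` and an installed tz database. The date and zone are chosen only for illustration: 01:30 on 2022-10-30 in Europe/London happened once in BST and once more in GMT, so the same stored wall clock time maps to two different UTC instants.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

london = ZoneInfo("Europe/London")

# 01:30 on 2022-10-30 happens twice in London: once in BST, once in GMT.
wall = datetime(2022, 10, 30, 1, 30)

# fold=0 selects the first occurrence, fold=1 the second.
first = wall.replace(tzinfo=london, fold=0).astimezone(timezone.utc)
second = wall.replace(tzinfo=london, fold=1).astimezone(timezone.utc)

print(first.isoformat())   # 2022-10-30T00:30:00+00:00
print(second.isoformat())  # 2022-10-30T01:30:00+00:00
```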
I solved this for a reservation system by storing wall clock time without offsets and the time zone (id) of the resource separately, and then calculating zoned time for that zone and comparing it with the stored wall clock time every time the instant in time mattered. It worked but was tricky to code.
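Roughly, that approach could look like the Python sketch below. This is not the original reservation system's code. The function names `to_instant` and `reminder_due` are made up for illustration, and it assumes Python 3.9+ with `zoneinfo`. The wall clock time and the zone id are stored separately, and the instant is only computed at the moment it matters, using whatever tz rules are installed then.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_instant(wall_clock: str, zone_id: str) -> datetime:
    """Resolve a stored wall clock time to a UTC instant with the tz rules installed right now."""
    naive = datetime.fromisoformat(wall_clock)  # e.g. "2022-07-01T09:00:00", no offset stored
    return naive.replace(tzinfo=ZoneInfo(zone_id)).astimezone(timezone.utc)

def reminder_due(wall_clock: str, zone_id: str, now_utc: datetime) -> bool:
    """Compare 'now' against the reservation each time the instant matters."""
    return now_utc >= to_instant(wall_clock, zone_id)

# The same stored 09:00 resolves to different UTC instants in summer and winter.
summer = to_instant("2022-07-01T09:00:00", "America/Los_Angeles")
winter = to_instant("2022-01-10T09:00:00", "America/Los_Angeles")
print(summer.isoformat())  # 2022-07-01T16:00:00+00:00
print(winter.isoformat())  # 2022-01-10T17:00:00+00:00
```

Nothing derived from the tz database is persisted here, so a tz database update changes the computed instant without touching the stored data.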
It's often our job to cram human concepts into a small number of bits, so I get why we do what we do. But I think it's sometimes worth being explicit that the anthropological concepts we are dealing with are very complex, and we may leave a lot out when we simplify them enough to make sense to the sort of machine that can be cheaply built in its era.
Databases usually have types to represent just dates or dates/times. Always use the lowest resolution type to represent what you actually are trying to capture.
Rule number one and the only rule about wall clock times is that they are display style only and must never be stored. Local times must use 24 hour format.
That's why people go and store times in UTC (with a separated time zone), and then everything breaks because some tool always expects non-tz data to be on the local tz, or the original system converts tzs behind the scene, or whatever.
It’s not just DST, it’s any TZ-related changes: locations can change TZ for other reasons than dst.
> I suppose it’s the difference between exact time and social/political time.
There’s really no such thing as exact time. The closest is UT1, and it’s not really practical for normal applications.
I would have defined TAI (International Atomic Time) as "exact time". Of course with it being an average of clocks it only really exists in hindsight and isn't perfect either, but if you only need accuracy on the order of hundreds of nanoseconds and are on earth's surface it's very practical and easy to get anywhere with a GPS receiver.
And UTC is only slightly more messy, with its offset from TAI to keep it in sync with unpredictable UT1.
Exactly, and for that time zone info will not be enough, you would need geographic coordinates.
2022-03-13T07:21:39@Europe/London
I suppose that’s also problematic if the location changes to a different time zone due to boundary changes. Maybe we need a coordinate (and planetary body) system: 2022-03-13T07:21:39@Earth/51.5055853,-0.1014699
We need that full frame of reference to be sure we will be correct in future.
I think coordinates are not completely right, since the time context is mostly political/cultural, not geographic. I don't have examples but I can easily imagine a border region specifying some event in a time that is referencing the region across the border.
I don’t know where exactly I picked up the word but I’ve used it here https://news.ycombinator.com/item?id=22968405
The better term is "civil time": https://en.wikipedia.org/wiki/Civil_time
:)
How would a future time zone change or DST change affect it at all? Is the concern that your pre-defined time would no longer be aligned with a nice "local time" like 0:00?
Because things would still happen at exactly the same "moment" (absolute time or time delta from now) in UTC regardless of any TZ change, wouldn't they?
1. Always store time in UTC
2. If the time is tied to a fixed place, like start of a Football match, store the timezone of the place in another column, for example, Asia/Kolkata, America/New_York, etc
3. Always store user's timezone as part of their preferences. Helps in use cases like send the reminder to the user at 8:00AM in their timezone.
4. All APIs always return time in UTC and the user's or the place's timezone in the output.
5. It's the job of the frontend to convert the UTC time to proper timezone and display it to the user.
Never had a use case I couldn't solve with these rules.
That is why logs (moment of time in the past) should be stored in UTC (possibly with local or user's timezone name for UI), timers (short offsets into the future from a moment of time in the past) should be stored in UTC like logs, and future scheduled events between humans should be stored in local time with the name of the timezone. The name of a timezone is never the offset, it is always the jurisdiction that can change the rules and be merged into another, etc.
When answering the question "what is the UTC time of a future scheduled appointment?", which is equivalent to "does it intersect a future scheduled appointment set in a different timezone on the same or adjacent dates?", which is also equivalent to the question "how many seconds in the future is this from right now?", you must assume the timezone rules set by politicians will not change between now and the appointment time. If the appointment time is soon (e.g. under 3 months), you indeed can assume this, because legislating an unexpected timezone change that will take effect "soon" is stupid and will usually not be done. But for a scheduled appointment a year from now? You honestly don't know. You can guess, but you must assume this can change by a TZ database update and act accordingly. If you have converted from localtime to UTC when the appointment was created, it is difficult to adjust correctly when TZ db is updated. If you translate to UTC only "just in time" when rendering or detecting collisions, everything will work correctly.
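A minimal Python sketch of the "just in time" idea applied to collision detection, assuming Python 3.9+ with `zoneinfo`. The `Appointment` class and `collide` function are hypothetical names. Appointments keep their local time and zone name, and the UTC span is only computed when two of them are compared.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

@dataclass(frozen=True)
class Appointment:
    start_local: datetime  # naive wall clock time, as the human entered it
    duration: timedelta
    zone_id: str           # zone name (jurisdiction), never a bare offset

    def utc_span(self) -> tuple[datetime, datetime]:
        """Convert to UTC only now, using the currently installed tz rules."""
        tz = ZoneInfo(self.zone_id)
        start = self.start_local.replace(tzinfo=tz).astimezone(timezone.utc)
        return start, start + self.duration

def collide(a: Appointment, b: Appointment) -> bool:
    """Two half-open UTC spans overlap if each starts before the other ends."""
    a_start, a_end = a.utc_span()
    b_start, b_end = b.utc_span()
    return a_start < b_end and b_start < a_end

hour = timedelta(hours=1)
seattle = Appointment(datetime(2022, 6, 1, 10, 0), hour, "America/Los_Angeles")  # 17:00 UTC
london_a = Appointment(datetime(2022, 6, 1, 18, 30), hour, "Europe/London")      # 17:30 UTC
london_b = Appointment(datetime(2022, 6, 1, 19, 0), hour, "Europe/London")       # 18:00 UTC

print(collide(seattle, london_a))  # True
print(collide(seattle, london_b))  # False
```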
It's not always unique. For example, Xinjiang uses both UTC+8 (Beijing time) and UTC+6 (Xinjiang Time), depending on context. https://en.wikipedia.org/wiki/Xinjiang_Time
Even when unique, it can be hard to automatically find it. The US and Canada typically have timezones that don't exactly follow state/province borders (or even county borders). https://en.wikipedia.org/wiki/Time_in_the_United_States#Boun...
And for most things, even if you can discover it, let the user override your guess.
- a timezone changes
- users change their timezone
Often what the user wants is that the "naive" local time stays the same (e.g. it's still 12am even if the timezone changes).
Though not always.
For example, for a conference whose starting time is stored as 14:00 Hrs UTC (0900 Hrs America/New_York) before DST, will suddenly start at 10:00 Hrs America/New_York after DST. Coincidentally, we just redid our implementation to follow Option 3 before New York's DST kicks in and it works as expected so far.
Please correct me if my understanding of your method is incorrect.
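The shift described above can be reproduced with a short Python sketch (Python 3.9+ with `zoneinfo`). The two helper names are made up for illustration, and the dates are simply one Monday before and one Monday after the 2022 US DST change.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")

def local_time_of_utc_slot(day: date) -> time:
    """Conference stored as a fixed 14:00 UTC: what does the wall clock in New York show?"""
    slot = datetime.combine(day, time(14, 0), tzinfo=timezone.utc)
    return slot.astimezone(NEW_YORK).time()

def utc_time_of_local_slot(day: date) -> time:
    """Conference stored as 09:00 local plus zone name: which UTC time is that?"""
    slot = datetime.combine(day, time(9, 0), tzinfo=NEW_YORK)
    return slot.astimezone(timezone.utc).time()

before_dst = date(2022, 3, 7)
after_dst = date(2022, 3, 14)

print(local_time_of_utc_slot(before_dst), local_time_of_utc_slot(after_dst))  # 09:00:00 10:00:00
print(utc_time_of_local_slot(before_dst), utc_time_of_local_slot(after_dst))  # 14:00:00 13:00:00
```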
How do you handle an alarm set for 02:30 on the date that the clock jumps _forward_ from 02:00 to 03:00?
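One possible way to at least detect this case, sketched in Python (3.9+ with `zoneinfo`). The names `classify` and `alarm_instant` are hypothetical, and the policy of firing a skipped alarm one hour later on the wall clock is an assumption, not an established answer. A wall clock time that does not survive a round trip through UTC falls into a gap.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def classify(wall: datetime, zone_id: str) -> str:
    """Classify a naive wall clock time as 'unique', 'ambiguous' or 'nonexistent' in a zone."""
    tz = ZoneInfo(zone_id)
    early = wall.replace(tzinfo=tz, fold=0)
    late = wall.replace(tzinfo=tz, fold=1)
    # A skipped time does not come back unchanged after a round trip through UTC.
    round_trip = early.astimezone(timezone.utc).astimezone(tz).replace(tzinfo=None)
    if round_trip != wall:
        return "nonexistent"
    # A repeated time has two different offsets depending on fold.
    if early.utcoffset() != late.utcoffset():
        return "ambiguous"
    return "unique"

def alarm_instant(wall: datetime, zone_id: str) -> datetime:
    """Assumed policy: use the offset from before the transition (fold=0).

    For a skipped 02:30 this fires at what the wall clock shows as 03:30;
    for a repeated time it fires at the first occurrence.
    """
    return wall.replace(tzinfo=ZoneInfo(zone_id), fold=0).astimezone(timezone.utc)

print(classify(datetime(2022, 3, 13, 2, 30), "America/New_York"))  # nonexistent
print(classify(datetime(2022, 11, 6, 1, 30), "America/New_York"))  # ambiguous
print(classify(datetime(2022, 6, 1, 2, 30), "America/New_York"))   # unique
```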
To me it seems impossible to do that kind of reporting, unless the backend is able to use the time zone to accurately convert the UTC time to hour-of-day, including DST shenanigans and whatnot.
Most mature date time libs can handle DST automatically for you when you use "named" timezones. For example, Java can handle it properly.
The rules I follow: 1. If you're recording an event in the past, store it in UTC.
2. If you need to record the local time for something either in the past or the future, store it as the local time as it was (for the past), or the local time as it will be (for the future), plus a timezone (not an offset).
3. If you're displaying a timestamp, a) if you can, display UTC, b) if needed, display in local time using current conversion rules
4. If you are getting user input, a) if you can, get it in UTC, b) if not, get it using a timezone (not an offset)
5. If you are storing a date, then that is not a timestamp. A date is a period, so either just store the date without a timezone, or, if it's a date period in a local area, store the start and end timestamps with their timezones.
Although it is unlikely for a timezone change to be issued after its begin date, it is likely for a timezone change to be entered into a database after its begin date. If you preserve the original local time you can always convert to UTC accurately.
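For rule 5, a rough Python sketch of turning a local date into a pair of instants (Python 3.9+ with `zoneinfo`). The helper name `local_day_bounds` is made up, and it assumes local midnight exists in the zone on both days, which is not guaranteed everywhere.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

def local_day_bounds(day: date, zone_id: str) -> tuple[datetime, datetime]:
    """Return the UTC instants at which a local calendar day starts and ends."""
    tz = ZoneInfo(zone_id)
    start = datetime.combine(day, time.min, tzinfo=tz)
    end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=tz)
    # Convert to UTC before doing arithmetic, so the difference is real elapsed time.
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)

start, end = local_day_bounds(date(2022, 3, 13), "America/New_York")
print(start.isoformat())  # 2022-03-13T05:00:00+00:00
print(end.isoformat())    # 2022-03-14T04:00:00+00:00
print(end - start)        # 23:00:00
```

The day the clocks jump forward is only 23 hours long, which is one reason a date is not just a timestamp at midnight.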
Why not epoch? An integer always means epoch. A human readable timestamp, unless it's exactly ISO8601, is pretty much always ambiguous.
> 2. [...] plus a timezone (not an offset).
Why not offset? ISO8601 is a good standard.
And it takes a very special use case to care about leap seconds. For logging "leap smear" is clearly a better way to solve it, and for time measurements you should always use a monotonic clock, not wall clock.
Oh, and with timezone I assume you mean something like "America/New_York", not ambiguous like "EST".
> A date is a period, so either just store the date without a timezone
Of course some dates don't exist in some timezones (and I mean more modern times than gregorian/julian). So you can't count days between two dates without knowing the timezone.
In my opinion there are only two ways to store timestamps that are actually correct: epoch, or iso8601. Everything else will eventually lead to data corruption.
(and even with these two you have to be careful)
An offset would have exactly the same issues as UTC. You need the time zone location to correct for future changes. Knowing it’s UTC+2 doesn’t help with that.
Ultimately it comes down to intent, either relative to space time, or the socially accepted time in a particular location in the future. A time zone represented as a location/city name is the best we currently have, but even that can be wrong if the time zone boundaries change.
Which epoch? There are many, it’ll change depending on your system.
Also what unit? Seconds, milliseconds, nanoseconds. That will also change depending on your system.
If someone is saying store UTC then they 100% mean store it in ISO8601 or an equivalent standard.
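A tiny Python illustration of the unit problem, standard library only. The integer is an arbitrary example value: the same bare number read as seconds or as milliseconds since the Unix epoch lands decades apart.

```python
from datetime import datetime, timezone

raw = 1647156099  # a bare integer with no unit or epoch attached

# Interpreted as seconds since 1970-01-01T00:00:00Z.
as_seconds = datetime.fromtimestamp(raw, tz=timezone.utc)

# Interpreted as milliseconds since the same epoch.
as_millis = datetime.fromtimestamp(raw / 1000, tz=timezone.utc)

print(as_seconds.isoformat())  # 2022-03-13T07:21:39+00:00
print(as_millis.year)          # 1970
```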
> Why not offset? ISO8601 is a good standard.
Because the offset for timezones change (as is explained in the article). If you just store an offset, then you have no way of knowing which timezone the user meant, and thus you can’t compensate for changes in the timezone definition (again, the primary thrust of the article).
> In my opinion there are only two ways to store timestamps that are actually correct: epoch, or iso8601. Everything else will eventually lead to data corruption.
The entire article is describing why this approach isn’t enough.
So a better option is to store UTC + timezone + revision (i.e. timezone as it was in 2014 to figure out the offset later)
Because in practice, the only ubiquitous APIs for getting time since an epoch don't actually model an elapsed time standard. The odd one out is POSIX time.
That works a lot less well when your analytics don't easily show that "make my coffee" is usually run in the mornings, local time.
Why? As instants they are both equivalent, but one carries additional information, namely the time that was on the wall where and when the event happened.
Fixed offsets will never change, so this really is a frozen snapshot
This realisation came from trying to work out how the hell to write an international scheduling system where the humans couldn't even work out how to schedule a meeting across three time zones.
In the current system, how do I know what time it is in Australia? Two ways: I either look it up, or I know how many hours it is relative to me. Living in Germany now, I know my parents are -8 and some friends are -9. I still always have to do the math. This is still a priori knowledge that the author just skims over.
Compare this to if there were no time zones. I would still need to know when their solar noon was. The amount of information I would need to retain is the exact same. Looking it up would also be just the same, except the lookup would tell us the sun position instead of the local time.
Don't get me wrong, I mostly agree with the article's point. The example just never felt adequate.
Case in point: Europe abolished DST absolutely fine to decrease complexity.
But this is not a technical concern. This is a human concern.
Reality is watching one dude in Australia meet his friend an hour early by accident because they couldn't even work it out with local domain knowledge. This happens all the time. The amount of Zoom meetings I get into because no one can work it out or they had their Outlook set up in the wrong time zone because they were travelling is insane.
The only viable solution to these issues is a common, unified zone for scheduling which probably should leverage UTC.
This would now introduce ambiguities in "see you tomorrow for dinner". I thought you said tonight? Yes, tonight is tomorrow. But we can start today and party until dinnertime tomorrow.
No, this doesn't work.
This gets even more complicated when you have to consider the problems of communicating between planets.
"So what should our product do if the definition of the official timezone in Amsterdam changes relative to UTC?"
My friend was even struggling with conversations like "so what if the user hits save exactly when data coverage drops". Not an edge case for techies, but likely an edge case for POs.
Could you expand on that? I've found it best to keep PMs informed even about edge cases, but I'm curious to know what your friend meant?
I usually don't ask, but say something like "We store time as UTC for future events, but apparently this is not the best practice because when timezone rules change, the local time is not updated. Setting up that refactor will be medium complexity. I recommend we take the time to fix it now, so that we will not have awkward moments for event organizers as people show up an hour early."
So, if you go ask him, you could as well ask some random stranger on the street and it would be as useful.
Sometimes the end user knows the answer, but very often their opinion is equally useless.
One might be tempted to ask the PM to decide on the following: What should happen if the user clicks "save" but the network went down? Shall we retry a few times? How many times? With what backoff? Or shall the application be offline-first? How should potential conflicts between the server data and the client data be resolved?
As you can imagine, the PM has already enough on their plate, so, in my experience, it's best to not overwhelm the PM with such questions and take a decision based on "best guess" on what would maximize customer success.
Another anecdote: A PM had to prioritize "integrate with Lufthansa" and "migrate to Java 8". Clearly the latter issue is at the wrong level of detail for a PM.
Everyone has a GPS in their pockets, and timezones and their changes are determined by location anyway.
Also relative velocity and mass. Think about it...
Isn't UTC alone enough?
For example when planning meeting times, the reason local time comes up is ultimately some determinant about when it's going to be day or night - that's the actual metric that meeting planning generally tries to take into account.
Strictly speaking I would say this also applies to storing data about machine to machine events which might normally be considered to be purely sequence based. If we're storing times to determine processing order for example, then there'd be value in storing the lat/lon of the machines generating them, because it's an extra datum which can resolve expected ordering (i.e. based on back-calculating latency windows).
I think 3 days ago I was checking on it again, because I was wondering how websites put timestamp in their published RSS/atom feed
The https://hackerdaily.io RSS feed confused me by asking me to select a timezone for the feed, since they want "Yesterday" to mean a different thing for each timezone. I'm not sure why, but I don't want to confuse myself thinking about how timezones and "Yesterday" work together
HN is using UTC, I realized that by hovering the mouse pointer on the "2 hours ago"
Again, if people don't want to use UTC for future events it's their choice. Personally I would advocate using UTC and manually calculating the local datetime for future events, even if the local timezone changes
You know, I wonder how watches work in this case: how do you alter the time on your watch according to the DST changes? Does the mechanical engineer of the watch take this bug into account?
If the convention hall could be booked hourly and had different events every hour, suddenly a slot would've opened up, at the 2 AM on the DST-change Sunday, the hour which would've not existed if the daylight savings law stayed the same...
I did not think of this before...interesting....leave these problems to politicians, software engineers and end users to accommodate, I just want to be a naive programmer
The reason for a timezone specific RSS feed is because, like the site itself, the RSS feed only updates once a day at midnight. And _when_ the day changes depends on the timezone. Therefore we created different RSS feeds, one for each time zone.
A future time moment may be a true time, known exactly, which must be stored as a UTC time.
Otherwise, it may be an event defined in the official time of some place. This must be a distinct data type, because as mentioned in the article, the real time that corresponds to it cannot be known in advance, because the legislation defining the official time may change before the event takes place.
Obviously this second data type must store the complete information that determines the event time, i.e. both the official local time and the place, so that, a short time before the event, the correct conversion to real time, i.e. UTC, can be done, and then the different local times for each participant may be computed, to be displayed in their schedules.
Of course, for displaying, an estimated UTC time for the event can be computed since the beginning, but it should be updated when the event approaches.
The only problem here is when you fail to recognize that the "time" for a future event that is scheduled in local time is not a time, and you attempt to store it using the wrong data type.
Also one must keep in mind that for this second data type, for future events scheduled in local time, no arithmetic operations can be defined, like the difference between two times. The only operation that can be defined for this data type is conversion to and from UTC.
The `timestamp null` time automatically adjusts for the connecting user's timezone. Internally, IIRC it stores the value as UTC, but then converts it on the fly for wherever you're sitting. Why is this awesome? Well, because not everyone is a programmer or expert in time zone, daylight savings, understands UTC, etc. When our non-tech CEO wants to look at a report, he expects it to be relative to the time shown on his watch, which is exactly what this type does.
Is it perfect? No. But does it solve a simple problem eloquently and prevent non-technical users from making bone-headed mistakes? yep.
If I'm sitting in Seattle VPN'd into a network that geolocates to South Carolina, and I send an HTTP request that terminates on a front-end box in Texas and sends a SQL query to a database in Ireland, what timezone will be used in the result?
And why is this better than just making the same calculation in the UI layer?
Example: An order comes in 22:00 UTC, he's sitting in Seattle talking to a parts supplier. Said supplier want to know when the last order came in, so CEO pulls a sales report. It appears on the report as 3pm. This is intuitive and works every single time.
Ironically, that scenario you describe would be quite complex to implement with a UI tweak. What if your data entry is a webapp, the CEO is using Metabase, and your data analyst is using Tableau? You now have to implement timezones 'correctly' in three different pieces of software, sometimes with software you didn't write.
This isn't a complete panacea, there would still be clock drift and updates could still cause time to be non-monotonic. But a timestamp would have a consistent meaning, and the TT->UTC conversion could be taken care of as just another step in localization.
Why on Earth was the decision taken in the first place to have NTP operate off UTC?
Some databases let you input values like “2022-03-13T08:33:26-06:00” and then display them in UTC or your local time zone or whatever time zone you configured. BigQuery at one time was toying with the idea of taking a few additional bytes of information to represent the preferred display timezone of the timestamp, but I have no idea if that ever made it to beta.
1) User schedules a meeting 6 months out. Their time zone is -5 from UTC
2) 3 months later, their local politicians change their timezone to be -4.5 from UTC
3) The database has the wrong time since it has -5 stored and people miss the meeting by 30 minutes
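The three steps above can be replayed with fixed offsets in Python (standard library only). The rule change is hypothetical, so the two offsets are modelled directly instead of coming from a tz database, and the meeting date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rules: UTC-5 when the meeting was scheduled, UTC-4:30 after the change.
old_rule = timezone(timedelta(hours=-5))
new_rule = timezone(timedelta(hours=-4, minutes=-30))

# What the user meant: 09:00 on their wall clock.
intended_wall = datetime(2022, 9, 1, 9, 0)

# What was stored: the instant computed with the old offset.
stored_instant = intended_wall.replace(tzinfo=old_rule).astimezone(timezone.utc)

# When the meeting really happens once the new rule is in force.
actual_instant = intended_wall.replace(tzinfo=new_rule).astimezone(timezone.utc)

print(stored_instant.isoformat())       # 2022-09-01T14:00:00+00:00
print(actual_instant.isoformat())       # 2022-09-01T13:30:00+00:00
print(stored_instant - actual_instant)  # 0:30:00
```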
But when you change timezone, the data is shown in the zone you’re in now (I assume it’s stored in UTC and then converted to the current timezone). Which is technically correct, but then looking back, it looks like I was doing a whole lot of walking from 9pm to 11am for a few weeks while I was on a trip (it was actually just a regular daily schedule), which then messes up the daily average step count, because the day boundaries aren’t what they actually were there.
I’d really rather have an option to show it in the timezone that I was in at the time, and then on the daily resolution graphs mark where the time zone changed (with the hours in the difference either skipped or repeated depending on which way).
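A sketch of what that option might look like in Python (3.9+ with `zoneinfo`), assuming each sample is stored as a UTC instant plus the zone the user was in when it was recorded. The data layout and the name `daily_totals` are made up for illustration.

```python
from collections import defaultdict
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo

def daily_totals(samples):
    """Sum steps per calendar day, using the zone each sample was recorded in."""
    totals = defaultdict(int)
    for instant_utc, zone_id, steps in samples:
        local_day = instant_utc.astimezone(ZoneInfo(zone_id)).date()
        totals[local_day] += steps
    return dict(totals)

samples = [
    # (UTC instant, zone at recording time, steps)
    (datetime(2022, 3, 10, 23, 30, tzinfo=timezone.utc), "Asia/Tokyo", 1200),         # 08:30 local, Mar 11
    (datetime(2022, 3, 11, 9, 0, tzinfo=timezone.utc), "Asia/Tokyo", 3000),           # 18:00 local, Mar 11
    (datetime(2022, 3, 20, 16, 0, tzinfo=timezone.utc), "America/Los_Angeles", 800),  # 09:00 local, Mar 20
]

totals = daily_totals(samples)
print(totals[date(2022, 3, 11)])  # 4200
print(totals[date(2022, 3, 20)])  # 800
```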
So, when planning ahead, one takes into account the knowledge of that time to make an assumption of when the event will take place. If the timezone rules change, one should reexamine which assumption works best. If the event is e.g. the time the moon will be eclipsed, UTC works best (without fail). If it is some local event not dependent on anything outside the area in which the time zone rules apply, local time probably works best.
But really, really knowing for certain? I would rather store (future) time in UTC and then consider whether changing time zone rules should or should not apply, as opposed to designing a system that could do this correctly in any situation…
In my last project, I convert the time using the timezone information I have stored by doing t at time zone x in postgres, so that it shows the time at that location, but then when it gets to the client, the actual Date object underneath is in the local timezone, even though it is displaying the time in a different timezone. It doesn't really matter because it is just for display purposes essentially.
When I send the time back to the server, I strip the timezone information from the date object and add in my own timezone information to a newly created string in a postgres-specific format and send it back.
For a vast majority of use cases, UTC storage is sufficient. For everything else, you need both the UTC timestamp and some other piece of knowledge depending on the scenario at hand. In many cases, it's as simple as a User.UtcOffsetMinutes fact. In other more paranoid scenarios, you may be inclined to store the original timezone/offset in which the UTC fact was originally valid.
Regardless of the scenario, there is no situation where I would want to store a non-UTC timestamp. Everything is UTC with some optional extras.
This was a solid ten years ago. I don't think the problem was ever solved in the general case. We wrote them a script that support would manually run and shift all the analytics forward or back by an hour.
Too bad this didn't come true ...
Timezone is an important context of the event. You can't tell if something happened on Wednesday if you do not know the timezone.
Some apps store that UTC in the row and get re-calc’d as needed.
With other apps we create views or materialized views and update rows for UTC to give a single source for sorting, etc.
Storing the rules version is an improvement I will look into!
But in my view, it's okay to pretend they are the same thing and model them all as instants.
Even in the current climate of potentially large changes coming to the TZ database, I doubt there are many developers who will come out ahead trying to model this difference correctly vs manually handling any edge case misses. More likely, you'll have bugs in your implementation because time logic is awful.