In every case where I've seen this problem, it's a matter of people either not storing the timezone along with the local time, or not storing in UTC time. A local time with a timezone is a unique time, it does not occur twice. A UTC time additionally does not occur twice. Store a time zone along with the date and time or store in UTC and convert on use.
Note: If there are instances where a second is repeated, that's a rare special occurrence, and developing a formalized interface for it seems like overkill.
If you define timezone as an IANA timezone, this is incorrect: a whole slew of local times repeat during a DST fallback event: you'll have a (1:30 AM (dst=True), America/New_York), and then a (1:30 AM (dst=False), America/New_York); that "dst=True|False" bit is the only difference, and that needs to get stored. If you consider "America/New_York" to be the TZ, then storing that bit on the TZ isn't appropriate, as it depends on a particular timestamp.
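A minimal sketch of that ambiguity, assuming Python 3.9+ with the standard-library `zoneinfo` module and tz data installed. It uses the `fold` attribute (the form the disambiguation bit took in PEP 495 as accepted) rather than a `dst=True|False` flag, and the variable names are only illustrative:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")

# 01:30 on 2015-11-01 happened twice in New York (DST fallback).
# The zone is identical; only the per-datetime fold bit differs.
first = datetime(2015, 11, 1, 1, 30, tzinfo=NY, fold=0)   # still on EDT
second = datetime(2015, 11, 1, 1, 30, tzinfo=NY, fold=1)  # already on EST

first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)

print(first.utcoffset(), second.utcoffset())
print(second_utc - first_utc)  # the two readings are one real hour apart
```

Note that the bit lives on the datetime value, not on the `ZoneInfo` object, which matches the point that it depends on a particular timestamp.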
If you've ever worked with PyTZ, there's a sort of rule of "just call normalize() always"; otherwise, you'll get funny answers to some introspections on the datetime instance: things like the offset being not what a local would say the offset should be. My understanding is that pytz stores the dst flag on the timezone instance itself; things get funny because the timezone instance is not given a chance to update after arithmetic on the datetime instance.
(Really, I feel like the whole thing would work better if there was a separate class for "instant in time" and a function for, "convert this instant in time to Gregorian year/month/day/etc. in this TZ", which then returned a broken-out-type. (And a reverse, of course, for building "Instant" instances.))
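Roughly what I have in mind, as a sketch only: `Instant`, `to_civil` and `from_civil` are hypothetical names, not anything in the standard library, and the broken-out type here is simply an aware `datetime` (Python 3.9+ with `zoneinfo` assumed).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


@dataclass(frozen=True, order=True)
class Instant:
    """Hypothetical 'instant in time': seconds since the Unix epoch."""
    epoch_seconds: float


def to_civil(instant: Instant, tz_name: str) -> datetime:
    """Break an instant out into year/month/day/etc. in the given zone."""
    return datetime.fromtimestamp(instant.epoch_seconds, tz=ZoneInfo(tz_name))


def from_civil(civil: datetime) -> Instant:
    """Reverse direction: build an Instant from an aware datetime."""
    if civil.tzinfo is None:
        raise ValueError("need an aware datetime to identify an instant")
    return Instant(civil.timestamp())


# Two instants one hour apart both read 01:30 on a New York wall clock.
a = from_civil(datetime(2015, 11, 1, 5, 30, tzinfo=timezone.utc))
b = from_civil(datetime(2015, 11, 1, 6, 30, tzinfo=timezone.utc))
print(to_civil(a, "America/New_York"), to_civil(b, "America/New_York"))
```

Arithmetic and ordering would happen on `Instant` only; the civil form would exist purely for display and input.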
UTC datetime + IANA TZ (if relevant) is the way to go. Alas, not all data is so nice.
Personally, I just always convert to UTC and store that. It changes the problem from one of data fidelity to display or computation annoyance, and annoyances are easy to reduce or eliminate with tooling.
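A small sketch of that convert-then-store habit, assuming Python 3.9+ and ISO 8601 strings as the storage format (the helper names `to_storage` and `for_display` are made up for illustration):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_storage(local_dt: datetime) -> str:
    """Normalize an aware datetime to UTC and serialize it."""
    if local_dt.tzinfo is None:
        raise ValueError("refusing to guess the zone of a naive datetime")
    return local_dt.astimezone(timezone.utc).isoformat()


def for_display(stored: str, tz_name: str) -> datetime:
    """Parse the stored UTC value and convert it for a viewer's zone."""
    return datetime.fromisoformat(stored).astimezone(ZoneInfo(tz_name))


# The second 01:30 of the New York fallback night, stored unambiguously.
entered = datetime(2015, 11, 1, 1, 30, tzinfo=ZoneInfo("America/New_York"), fold=1)
stored = to_storage(entered)
print(stored)
print(for_display(stored, "Europe/London"))
```

The ambiguity has to be resolved once, at entry; after that the stored value carries no DST state at all.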
In the US, once a year you have to have some locking system set up to avoid the second run (leave a trace that the job has already been started for the day). I believe the person proposing this thinks that it would make this determination easier.
The problem is that it only solves half the trouble caused by dual timezones. There's another day of the year when 2:30am (in the US) doesn't happen at all. If something is scheduled to occur once per day at 2:30am, then that day it is not going to happen.
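For the missing 2:30am, one way to notice it is a round trip through UTC. This is a sketch assuming Python 3.9+ and `zoneinfo`; `local_time_exists` is a hypothetical helper, not a library function:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_time_exists(naive: datetime, tz_name: str) -> bool:
    """True if this wall-clock reading actually occurs in the zone.

    A reading inside a spring-forward gap does not survive a round trip
    through UTC: it comes back shifted past the gap.
    """
    tz = ZoneInfo(tz_name)
    local = naive.replace(tzinfo=tz)
    round_trip = local.astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) == naive


skipped = local_time_exists(datetime(2015, 3, 8, 2, 30), "America/New_York")
normal = local_time_exists(datetime(2015, 3, 9, 2, 30), "America/New_York")
print(skipped, normal)  # False True
```

A scheduler would still have to decide what to do with a skipped slot (run late, run early, or skip); the check only tells you that the decision is needed.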
There are other workarounds available, such as avoiding the magic hours around 2am (in the US). But it is a common problem that everyone seems to keep re-solving.
Storing the timezone with the time doesn't really solve the above issue.
In the instance where someone wants something to happen at 1:30 AM, if they aren't specific in the specification, then they should expect that it may happen twice or not at all at certain times of the year. This is an imprecise specification problem, not a problem in representing time in structures that can and do contain timezones. That is, it's a failure of cron, or the user specifying the time, take your pick. What it's not is a failure of dates, times and timezones, which specifically address this problem. Timezones or UTC (which is just timezone offset 0) are what we have to deal with this specific problem.
For example, specifying the originating timezone would disambiguate the time, as would specifying it in UTC, (or automatically converting to and using the UTC equivalent on entry).
> Storing the timezone with the time doesn't really solve the above issue.
It doesn't address the "do this thing at this local time daily" problem when someone chooses a time that has special behavior, but it does address the "do this thing at this offset from UTC daily" problem, which may be the best you can expect when specifying a time for a recurring action and not taking into account timezones. If you want to use local time, you have to deal with either the actual time the job runs possibly shifting slightly throughout the year in some locations, or the job possibly running twice or not at all.
My real problem with the proposal is that adding a .first() method doesn't solve anything. It really just addresses half the problem (it doesn't work if the time doesn't exist), and in a way that's already easily solved, since you can't get a valid result from .first without knowing the timezone already.
As far as I read it, this PEP should eventually lead to a better pytz API. Currently, we have to use the normalize() and localize() functions in order to handle the ambiguous times that happen twice a year when the clocks change; this is ugly and hard to remember to do. I think that the 'first' flag should eventually allow us (once suitable timezones are created in pytz) to do arithmetic with local times and automatically transform between the summer and winter timezones.
I also completely second your point. Naive datetime objects should only ever be used to describe UTC. Anything else should have an explicit timezone attached to it. Doing otherwise is asking for trouble.
In POSIX time, seconds can repeat[1]:
> [POSIX] is neither a linear representation of time nor a true representation of UTC […] The Unix time number increases by exactly 86400 each day […] Observe that when a positive leap second occurs (i.e., when a leap second is inserted) the Unix time numbers repeat themselves.
The upside to this is that days are always 86400 seconds "long": computing the start of the next day is simple. (Computing the length is not so much, and computing to-the-second elapsed time is also harder.)
This is somewhat relevant to Python, as the datetime module "ignores" (which I interpret to mean, "repeats the prior second"; I've never watched to see what really happens) leap seconds.
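A quick check of the two visible effects, as a sketch with only the standard library. It relies on 2015-06-30 having ended with an inserted leap second (23:59:60 UTC); the variable names are illustrative:

```python
from datetime import datetime, timezone

# A UTC day that contained a leap second still spans 86400 POSIX seconds.
day_start = datetime(2015, 6, 30, tzinfo=timezone.utc)
next_day = datetime(2015, 7, 1, tzinfo=timezone.utc)
posix_length = next_day.timestamp() - day_start.timestamp()
print(posix_length)  # 86400.0

# datetime has no way to name the leap second itself: second=60 is rejected.
try:
    datetime(2015, 6, 30, 23, 59, 60, tzinfo=timezone.utc)
    leap_second_representable = True
except ValueError:
    leap_second_representable = False
print(leap_second_representable)  # False
```

This only shows what `datetime` itself accepts; what the system clock reports during the leap second depends on the OS and is not covered here.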
> In these situations, the information displayed on a local clock (or stored in a Python datetime instance) is insufficient to identify a particular moment in time.
Does the datetime instance not store the timezone?
That said, I don't see the reason for this PEP. You should only store a value in the DB that doesn't change like that, and then when displayed you can change it to fit the local time of the user.
But once again, this is one of those "assumptions about time that everyone gets wrong": just because we shift timezones in Britain doesn't necessarily mean that people in other countries do the same. They might just change what the timezone offset is and keep the timezone the same.
The only time that one has the time fold is when you turn the clock backward (let's just call that shifting from daylight savings to standard timezone). And this would only affect code which used wall clock time (time as it's read, e.g. 1:30am PDT), and would also only affect code which wanted to run something only once at a time within that fold (e.g. 1:30am ... not on both 1:30am's).
So, using the timezone method, just check to see if your current 1:30am is in your daylight savings timezone. Hurray! You're in the clear. Go ahead and do that thing you wanted to do only on the first 1:30am.
But the next day you're going to run into a problem. The only 1:30am you're going to get is in the standard timezone. So now you have to check for this timezone change only on the day of the change, which is yet another piece of data you have to keep track of. On the day of the change, do this timezone comparison, and on every other day don't worry about it.
When the clock hits your interesting time of 1:30am, just check to see if today is the day of the change, check what the current timezone is, check what the daylight savings time zone is, check to see if those two values are the same, and now do your thing. Otherwise, just do your thing.
All of the above also ignores that people change times at different times (11pm, 1am, 2am, 3am), some don't change a full hour, and some don't change at all.
The proposal gets rid of all of that convoluted logic in everyone's programs, and instead it provides a single boolean value: is this the second time I've seen this time because of daylight savings shenanigans.
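As a sketch of what that single boolean looks like in practice, assuming the flag ends up as the `fold` attribute from the accepted PEP and that Python 3.9+ `zoneinfo` is available (`is_repeat` is a hypothetical helper name):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def is_repeat(utc_dt: datetime, tz_name: str) -> bool:
    """True if the local reading of this instant is the second occurrence
    of a wall-clock time repeated by a backward clock change."""
    return bool(utc_dt.astimezone(ZoneInfo(tz_name)).fold)


first_130 = datetime(2015, 11, 1, 5, 30, tzinfo=timezone.utc)    # 01:30 EDT
second_130 = datetime(2015, 11, 1, 6, 30, tzinfo=timezone.utc)   # 01:30 EST
next_day_130 = datetime(2015, 11, 2, 6, 30, tzinfo=timezone.utc) # 01:30 EST

print(is_repeat(first_130, "America/New_York"))
print(is_repeat(second_130, "America/New_York"))
print(is_repeat(next_day_130, "America/New_York"))
```

No "is today the day of the change" bookkeeping is needed: the next day's 01:30 is on standard time but is not flagged, because it is not a repeat.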
Does it? It doesn't cover the scheduling problem the other half of the year when the clocks move the other direction.
> So now you have to check for this timezone change only on the day of the change, which is yet another piece of data you have to keep track of.
If running a job twice is a problem, then why not check that the job has not already been run?
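Something like the following would do as a sketch: a marker file per local date, created atomically before the job starts. `run_once_per_day` and the marker naming are assumptions for illustration, not part of cron or any library:

```python
from datetime import date
from pathlib import Path
from typing import Callable


def run_once_per_day(job: Callable[[], None], marker_dir: str, today: date) -> bool:
    """Run job unless a marker for this date already exists.

    Returns True if the job ran, False if it was skipped.
    """
    marker = Path(marker_dir) / f"{today.isoformat()}.started"
    try:
        # exist_ok=False makes creation fail if the marker is already there,
        # so a second 01:30 on the same date is skipped.
        marker.touch(exist_ok=False)
    except FileExistsError:
        return False
    job()
    return True
```

The marker records that the job was started, not that it finished; a job that must be retried after a crash would need a different scheme.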
> is this the second time I've seen this time
Is this unambiguous? If it's 2015-10-25-01-30-00 GMT, have I seen that time before? In the UK, yes, in Mali no.
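To make the UK/Mali contrast concrete, here is a sketch assuming Python 3.9+ `zoneinfo` and the `fold` attribute from the accepted PEP. `is_ambiguous` is a hypothetical helper; it treats a wall-clock reading as repeated when the offset for the first reading is larger than the offset for the second:

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def is_ambiguous(naive: datetime, tz_name: str) -> bool:
    """True if this wall-clock reading occurs twice in the given zone."""
    tz = ZoneInfo(tz_name)
    first = naive.replace(tzinfo=tz, fold=0).utcoffset()
    second = naive.replace(tzinfo=tz, fold=1).utcoffset()
    # During an overlap the clocks went back, so the offset dropped.
    # (In a spring-forward gap the offsets also differ, but the other way.)
    return first > second


wall = datetime(2015, 10, 25, 1, 30)
seen_twice_in_london = is_ambiguous(wall, "Europe/London")
seen_twice_in_bamako = is_ambiguous(wall, "Africa/Bamako")
print(seen_twice_in_london, seen_twice_in_bamako)  # True False
```

So "have I seen this time before" has no answer for a bare wall-clock reading; it needs the zone as well.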
Why are you doing time math in local time?
Simply do all the time math in universal time and be done with it.
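A short illustration of the difference, assuming Python 3.9+ with `zoneinfo` (names are illustrative). Adding a `timedelta` to an aware `datetime` moves the wall clock, so across a DST change the elapsed time is not what was asked for:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")
start = datetime(2015, 10, 31, 12, 0, tzinfo=NY)  # noon, day before fallback

# Local arithmetic: the wall clock advances 24 hours, reading noon again...
wall_result = start + timedelta(hours=24)
# ...but measured in UTC, 25 real hours have gone by.
elapsed = wall_result.astimezone(timezone.utc) - start.astimezone(timezone.utc)

# UTC arithmetic: exactly 24 real hours later, shown back in local time.
utc_result = (start.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(NY)

print(wall_result, elapsed)
print(utc_result)
```

Which of the two results is "right" depends on whether the requirement was "same time tomorrow" or "24 hours from now"; doing the math in UTC makes the second one explicit.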
Dealing with time adjustments is the OS's job, not userland. If your job has to be scheduled exactly and cannot rely on the OS, and you refuse to deal with UTC, it's your own damn fault and you can always just use a long-running process with timers.
Time in the UK is currently UTC+1 (BST). At 2am on 25 Oct we will return to GMT / UTC. It will therefore become 1am, and for the next hour all times will have happened before.
The idea is to put a bit flag that says "alreadyseenthistime"
It seems to me this is a solution to the wrong problem.
Store all strings as bytes, assuming UTF-8; store all times as longs, assuming UTC.
If we convert all Python datetimes to non-naive (i.e. embedded with a TZ), then we are forced always to choose an encoding, just like in strings. The right encoding is to always assume incoming dates are UTC, to throw an error if they are non-naive, and to assume that local clocks are set correctly (which we do anyway).
I need to read it more carefully - but it seems the wrong solution
https://developer.apple.com/library/mac/documentation/Cocoa/...