The Y2K bug did not become a crisis only because people literally spent tens of billions of dollars in effort to fix it. And in the end, everything kept working, so a lot of people thought it wasn't a crisis at all. Complete nonsense.
Yes, it's true that all software occasionally has bugs. But when all the software fails at the same time, a lot of the backup systems simultaneously fail, and you lose the infrastructure to fix things.
He also noted that a lot of American firms worked with Indian firms to resolve it. Indian engineers were well suited to the problem, according to him, because they had been exposed to a lot of the legacy systems involved during their studies.
He said an interesting consequence of this was that many American companies concluded they could use lower wage Indian engineers on other software projects so it helped initiate a wave of offshoring in the early 2000s.
I couldn't find the story I heard in the NPR archives, but they do have a number of stories reported right around that time if you want to see how it was being discussed and treated by NPR at the time:
https://www.npr.org/search?query=y2k&page=1
I did find this NPR interview with an Indian official from 2009 that notes Y2K's impact in US-Indian economic relations:
But I think the real turning point in many ways in the U.S. relations was Y2K, the 2000 - year 2000, Indian software engineers and computer engineers suddenly found themselves in the U.S. helping the U.S. in adjusting itself to the Y2K problem. And that opened a new chapter in our relations with a huge increase in services, trade and software and computer industry.
https://www.npr.org/templates/story/story.php?storyId=120738...
This created an exploitable opportunity for the few who knew how to write automated tests.
During the Y2K panic Sun Microsystems (IIRC) announced that they would pay a bounty of ~$1,000 per Y2K bug that anyone found in their software (due to the lack of automated tests, they didn't even know how to find their Y2K bugs)
James Whittaker (a college professor at the time) worked with his students to create a program that would parse Sun's binaries and discover many types of Y2K bugs. They wrote the code, hit run, and waited.
And waited. And waited.
One week later the code printed out its findings: It found tens of thousands of bugs.
James Whittaker went to Sun Microsystems with his lawyer. They saw the results and then brought in their own lawyers. Eventually there was some settlement.
One of James' students bought a car with his share.
It was money for jam.
For example, at a former employer of mine, a big justification for getting rid of the IBM-compatible mainframe and a lot of legacy systems which ran on it (various in-house apps written in COBOL and FORTRAN) was Y2K.
In reality, they probably could have updated the mainframe systems to be Y2K-compliant. But they didn't want to do that. They wanted to dump it all and replace it with an off-the-shelf solution running on Unix and/or Windows. And, for reasons which have absolutely nothing to do with Y2K itself (the expense and limitations of the mainframe platform), it probably was the right call. But Y2K helped move it from the "wouldn't-it-be-nice-if-we" column into the "must-be-done" column.
> In reality, they probably could have updated the mainframe systems to be Y2K-compliant.
Legacy systems weren't always mainframe systems and, in any case, a “legacy system” is precisely one you are no longer confident you can safely update for changing business requirements.
A change in a pervasive assumption touching many parts of a system (like “all dates are always in the 20th century”) is precisely the kind of thing that is high risk with a legacy system.
And don't get me started on Oracle refusing to provide updated y2k compliant backports!
Sounds like some of the benefits from the Coronavirus crisis, like the increase in working from home.
Yes, updating JDEdwards AS/400 systems and many a PC was a big project, but I don't recall it being super difficult at the time... frack, it's so long ago I can't really recall many of the details other than reporting daily on the number of systems updated.
Also, fuck you google - this was when you were nascent and I converted the entire company to your “minimalist” front page vs yahoo, and a few years later you wasted 3 months of my time interviewing, told me to expect an offer letter tomorrow, then called to rescind the offer (as I didn't have a degree), and then kept contacting me for FIVE FUCKING YEARS for the same job that you wouldn't hire me for.
/off-my-chest
None of those people contacting you, or phone screening, were Google employees. They were, or worked for, contractors. It didn't matter to them or to Google whether anybody they contacted was hired.
The pressure was immense. People don't realise what a huge success story the Y2K situation really was.
Whole businesses would have ground to a halt to stem the mess of bad data, and if not prepared they would not have had a quick fix.
In many cases in the age of apps and web apps we can roll out fixes fairly quickly, but in 2000 that was very much not the case for most situations.
So yes indeed stuff was broken, but it got fixed before the big date.
More importantly, lift systems are not sensitive to long duration dates. They do not need to be.
This story reflects one of the myths that emerged from the media. Around 1997 they started discovering the topic of embedded systems and that many devices, including lifts, contained "computers." Not understanding the restricted and specialized nature of embedded systems, they then claimed all these systems were vulnerable to Y2K, and that the whole world was about to crash.
Most embedded systems, in lifts and other devices, had been designed in the 1990s. If they used dates, they did not repeat mistakes from the 1960s.
This is especially important if you think about what things were like in the late-90s. As a teenage geek in Northern Ireland I read about the first tech boom, but locally there was very little exploitation of tech and you were an anomaly if you had dial-up internet at home. There was very limited expertise available to fix your computer systems in the best of times.
Big companies had complex systems, but they could also afford contracts with vendors in the UK mainland, or Eire, to fix systems. It was much more limited for other companies. While people were generally not relying on personal computers too much, small businesses had just reached that tipping point where Y2K would have been painful. As a result, they took action.
I only got to visit the US a few times in the early-nineties and it seemed so futuristic (I got a 286 PC in '93 to use alongside my Amiga). I imagined the Y2K problem as being much more painful in the US, and I wasn't accounting for the distribution of expertise.
One great experience was summer training courses that the government had organised with an IT training company in Belfast. These were free and it was like a rag-tag army of teenagers and older geeks from both sides of what was quite a divided community. The systems we were dealing with were fairly simple. It was mostly patching Windows, Sage Accounting, upgrading productivity apps (there was more than Microsoft Office in use) etc., but the trainer was normally teaching more advanced stuff like networking and motherboard repair so we spent a lot of time on that.
I agree with everything you and the parent said, but there was also hype and hysteria on top of it all - especially among the communities of preppers, back to the landers, and eco-extremist types. I met entire communities of technophobes who declared I was just ignorant or 'a sheep' and refused to believe me when I tried to explain that the industry was working on the problem and had a solid shot at preventing major catastrophe.
Frightening predictions of power plants going offline for days, erratic traffic lights causing mayhem, food and water supplies being interrupted, even reactors melting down were not uncommon.
As always, there is money to be made by exploiting people's fear. This hysteria was considered an irrelevant side show by most educated people, the way I look at the chem trail people or the way I used to view the anti-vaxxers [1], but the unjustified hysteria was real.
[1] Before I learned their numbers were growing and could threaten herd immunity in some areas.
https://www.youtube.com/watch?v=x0bBYK-7ZiU
Oh, and Dick Clark melts.
And a few people did take it really seriously and spent the Millennium in some remote place where they would stand a chance of surviving the collapse of Western Civilisation.
this left me at a crossroads. i thought about writing malware that would randomly raise people's semester grades by 4 points (e.g. C- would become a B+). i thought about mass changing grades. i thought about altering a target group of kids i didn't like. all but the last of the scenarios ended with me getting fingered because I was the smart computer kid. if i didn't touch my grades i would have plausible deniability. i wrote the malware. then i watched office space and decided to think about my actions (and promptly forgot as a horny teenager does). soooo glad i didn't release it because years later i went back into the code and found a leftover debug that would have targeted the only 2 kids in the school who had this letter in their last name.
tl;dr: office space is real
The sceptic inside me is curious as to how much they actually accomplished in their effort to fix it. I mean, yes, they spent billions to get millions of lines of code read, but how many fixes have been made and what would be the cost (when compared to those billions of dollars spent) if they weren't fixed at all.
With an auditorium full of others who had barely any programming experience, he was given a crash course in Cobol for a month, and then he had to go through thousands of lines of code of bank software that ran on some mainframe and identify all cases where a date was hard-coded as 2 characters.
Extremely tedious work, but there were tons of fixes to be made.
The problem was real, and the billions that were spent on fixing things were necessary.
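A rough sketch of what a first-pass automated scan for such two-character year fields might look like, in Python. The naming convention (field names containing YY or YEAR), the sample record, and the function name are all assumptions for illustration. A pattern match like this only finds declarations; the logic that touches each field still had to be read by a person.

```python
import re

# Heuristic: a COBOL data item whose name contains YY or YEAR and whose
# picture clause is exactly two numeric digits (PIC 99 or PIC 9(2)).
# The naming convention is an assumption; real code bases varied widely.
TWO_DIGIT_YEAR = re.compile(
    r"^\s*\d{2}\s+(?P<name>[A-Z0-9-]*(?:YY|YEAR)[A-Z0-9-]*)"
    r"\s+PIC(?:TURE)?\s+(?:99|9\(0?2\))\s*\.",
    re.IGNORECASE | re.MULTILINE,
)

# Hypothetical record layout, invented for this example.
SAMPLE = """\
       01  INVOICE-REC.
           05  INV-YY        PIC 99.
           05  INV-MM        PIC 99.
           05  DUE-YEAR      PIC 9(4).
           05  PAID-YEAR     PIC 9(2).
"""


def find_two_digit_year_fields(source):
    """Return the names of data items that look like two-digit year fields."""
    return [m.group("name") for m in TWO_DIGIT_YEAR.finditer(source)]


print(find_two_digit_year_fields(SAMPLE))
```

INV-MM is skipped because its name does not look like a year, and DUE-YEAR is skipped because it already has four digits.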
Remember that big old systems were the low hanging fruit in 1970. The business logic was COBOL and assembler, without things like functions. The system may have hundreds of duplicate routines for things like subtracting dates.
If you are asking "would it be better to have done nothing?" then I would have to say No, of course not. What are you thinking?
I worked on fixing Y2K bugs at the time. It was very methodical and planned, and tested on-site at a space agency. If we hadn't fixed those Y2K issues at the time then weather forecasts would have suffered greatly. The cost of that happening? Hard to calculate I think.
Why are you sceptical about this?
The fact that people worked on date issues for a decade does not mean the public scare campaign in the late 90s was justified.
You could equally claim airplanes would crash if they weren't refuelled, or people would die if garbage wasn't collected. In those cases, the media is educated enough to know the claims are pointless.
In the case of y2k, they had little understanding of how enterprise IT worked, and lots of businesses who were happy to encourage that ignorance.
A global retailer you've definitely heard of that used to own stores in Germany spent a lot of time preparing for Y2K. This was a long and painful process, but it got done in time.
But problems still slipped through. These problems ended up not being that big and visible because a large percentage of the code base had just been recently vetted, and a useful minority of that had been recently updated. Every single piece had clear and current ownership.
Lastly, there was an enormous amount of vigilance on The Big Night. Most of the Information Systems Division was on eight hour shifts, with big overlap between shifts at key hours (midnight in the US), and everyone else was on call.
As midnight rolled through Germany, all of the point of sale systems stopped working.
This was immediately noticed (the stores weren't open, but managers were there, exercising their systems), the responsible team started looking into it within minutes, a fix was identified, implemented and tested within tens of minutes, and rolled out in less than an hour.
Pre-Y2K effort and planning was vital; during-Y2K focus, along with the indirect fruits of the prior work, was also critical.
Fortunately, the guys that spent the previous year sweating the details got it right. We had one system that printed year 100 in its log which was only used by humans.
Unfortunately, I missed the party and cash the night shift got.
So that is well known. What's not well known is this: there are pockets of brilliant people elsewhere, including Bentonville, Arkansas.
Especially in the years running up to 2000, I had the pleasure and honor of working with some of the most talented people in the world, even though we were all looked down on by many people on the coasts.
We caught it early in development, but it was still a pain in the ass rebuilding all of the projects that depended on it.
We did multiple dry runs, setting server times to 31/12/99 23:55 and waiting 5 mins to see if anything died. Mostly it was OK, most of our systems were OK, but a few were dodgy and needed some changes.
The Big Night rolled around and we were OK. The planning had worked, and nothing fell over. I got the email on Jan 1st - all good.
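A minimal sketch of the same dry-run idea in Python: instead of changing a server clock, pass the simulated time into the code under test and step it across midnight. The names `dry_run`, `buggy_year_label`, and `check_year_label` are hypothetical, and the "19" prefix bug is just one example of what such a run could surface.

```python
from datetime import datetime, timedelta


def buggy_year_label(now):
    """Example defect: hard-codes the century in front of a two-digit year."""
    return "19" + now.strftime("%y")


def check_year_label(now):
    """Raise if the label disagrees with the real four-digit year."""
    label = buggy_year_label(now)
    assert label == str(now.year), f"{now}: got {label}"


def dry_run(start, minutes, check):
    """Step a simulated clock minute by minute and collect failures."""
    failures = []
    for m in range(minutes + 1):
        now = start + timedelta(minutes=m)
        try:
            check(now)
        except Exception as exc:  # record the failure, keep the run going
            failures.append((now, exc))
    return failures


# Start at 31/12/99 23:55 and run ten minutes, crossing midnight.
failures = dry_run(datetime(1999, 12, 31, 23, 55), 10, check_year_label)
for when, exc in failures:
    print(when, exc)
```

Everything up to 23:59 passes; every simulated minute from 00:00 on 1 January 2000 onward is reported as a failure.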
Did banks go through some of the same growing pains?
As sibling comments have noted: these systems are an exceedingly complex pile of separate but interdependent pieces.
The rollover from 31-Dec-1999 to 1-Jan-2000 had happened thousands of times before the actual day came. Some piece/assumption/etc had been overlooked.
It's the network of interacting systems that prevents this from being so simple.
There was a decent influx of older devs using the media hype as a way to get nice consulting dollars, nothing wrong with that, but in the end the problem and associated fix was not really a major technical hurdle, except for a few cases. It is also important to understand a lot of systems were not in SQL databases at the time, many were in ISAM, Pic, dBase (ouch), dbm's (essentially NoSql before NoSql hype) or custom db formats (like flat files etc) that required entire databases to be rewritten, or migrated to new solutions.
My 2 cents, it was a real situation that if ignored could have been a major economic crisis, most companies were addressing it in various ways in plenty of time but the media latched on to a set of high profile companies/government systems that were untouched and hyped it. If you knew any Cobol or could work a Vax or IBM mainframe you could bank some decent money. I was mainly doing new dev work but I did get involved in fixing a number of older code bases, mainly on systems in non-popular languages or on different hardware/OS because I have a knack for that and had experience on most server/mainframe architectures you could name at that time.
At the time I was managing a dBase / FoxPro medical software package...we were a small staff who had to come up with Y2K mitigation on our own.
Our problem is we only had source code for "our" part of the chain...other data was being fed into the system from external systems where we had no vendor support.
Thus our only conceivable plan was to do the old:
If $year<10;
date="20$year"
else
date="19$year"
It worked in 99.9% of the cases which was enough for us to limp thru and just fix the bad cases by hand as they happened. Eventually we migrated off the whole stack over the next few years so it stopped being a problem. I'm sure many mitigation strategies did the same.
.... date="200$year"
Or does foxpro somehow know to zero pad a 1 digit number?
We'll be up for a few Y2K bugs at the start of every decade because 'year += year < n ? 2000 : 1900' was such an easy workaround, for n=20,30,40,...
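For anyone who hasn't seen this "windowing" workaround, here is a small Python sketch of it, assuming the pivot of 10 from the pseudocode above. The function names are made up for illustration and this is not the original FoxPro code. It shows both the zero-padding question and why the pivot year itself comes due.

```python
def expand_two_digit_year(yy, pivot=10):
    """Windowing: two-digit years below the pivot are 20xx, the rest 19xx."""
    if not 0 <= yy <= 99:
        raise ValueError(f"expected a two-digit year, got {yy}")
    return (2000 if yy < pivot else 1900) + yy


def naive_concat(yy, pivot=10):
    """String concatenation without zero padding, as in the pseudocode."""
    prefix = "20" if yy < pivot else "19"
    return prefix + str(yy)  # a one-digit year yields only three characters


print(expand_two_digit_year(5))   # arithmetic form handles one-digit years
print(naive_concat(5))            # concatenation form does not
print(expand_two_digit_year(10))  # the pivot year lands in the wrong century
```

Doing the expansion arithmetically sidesteps the padding problem. The window still expires: with a pivot of 10, a record from 2010 comes back as 1910, which is the "start of every decade" problem.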
https://www.pymnts.com/news/payment-methods/2020/new-years-b...
They did, last week, put in ... haproxy as an SSL terminator in front of the main server, and will test a switch over this week. This was 8 months of foot-dragging for about 3 hours of setup/config, and a couple more hours of testing. When all your clients are hitting a web server, and their browsers will all start rejecting your certs, things will get ugly fast - as in "your business will effectively stop functioning". It just sounded like "doom and gloom" but... how do you message this effectively? It requires the receiving parties to actually understand the impact of what you're saying, regardless of terms you use.
The simple reason is that any one company may have been liable for a very small portion, which if it failed, would not have caused much trouble. But that failure combined with many other failures down the entire chain of connected and unconnected software would have added up to something much greater than the sum of parts.
We saw similar stuff happen with Flash and Silverlight (the latter wasn't as widely reported a concern since Silverlight usage was so low, but I saw it within my company).
The media pressure was a significant reason why every company needed to have a plan to deal with it.
Because they make more money if people are scared and clicking refresh every few minutes.
I see this as a big problem because Y2038 is on the horizon and this "not a big deal" attitude is going to bite us hard. Y2K was pretty much a financial server issue[1], but Y2038 is in your walls. It's control systems for machinery that are going to be the pain point, and that is going to be much, much worse. The analysis is going to be painful and require digging through documentation that might not be familiar (building plans).
1) yes there were other important things, but the majority of work was because of financials.
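For reference, a short Python sketch of where the 2038 date comes from, assuming a signed 32-bit count of seconds since 1 January 1970 UTC. Python integers don't overflow, so `wrap_int32` is a hypothetical helper that simulates the wraparound.

```python
from datetime import datetime, timedelta, timezone

INT32_MAX = 2**31 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def wrap_int32(n):
    """Simulate two's complement wraparound of a signed 32-bit integer."""
    return (n + 2**31) % 2**32 - 2**31


def from_time_t(seconds):
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


last_good = from_time_t(INT32_MAX)
after_wrap = from_time_t(wrap_int32(INT32_MAX + 1))

print(last_good)   # last second a signed 32-bit counter can represent
print(after_wrap)  # one second later, after the counter wraps negative
```

The last representable second is 03:14:07 UTC on 19 January 2038. One tick later a wrapped counter reads as 13 December 1901.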
1. Shine light on problem and make sure people hear about it
2. People respond and fix it
3. Outcome not as bad as you said it could be /because it was fixed/
4. Some time later "That wasn't a big deal"
No, it wasn't that big of a deal because we worked hard to fix it!
When people don't understand something, they can't tell the difference between a lot of effort to make something work smoothly and "it's clearly not that big of a deal"
Fuck me, as this is true
However arguably most dates in corporate IT work are involved in some level of forecasting and prediction and future planning.
In reality, starting in 1970 anyone writing an amortization table program for a 30 year mortgage had to work around Y2K. Anyone dealing in any way with the expiration date for a twenty year term life insurance policy had to start caring about Y2K in 1980. Even a mere net-30 business to business payment account either broke or didn't in Nov 1999.
Even on Dec 31 1999 it was hilariously charming how people all around the world thought all computers were located in THEIR timezone, and thus any real time clock type failures would occur at precisely midnight local time where they live as opposed to where the computer is actually located. Due to the miracle of UTC time, anything bad would have happened to our stuff early in the day while I was eating a late dinner, not when the operations center was overstaffed during local timezone 3rd shift.
I was working at a telco at the time and we were very worried and overstaffed over Y2K, our stuff was fine, but we were pretty worried about rioters and such if anyone ELSE failed, like maybe the power co. Hilariously the power co people were probably overstaffed over Y2K, despite knowing their stuff was fine, they were likely worried about those telco goofballs failing thus losing SCADA links to their substations, LOL.
In the end it seems pretty much nothing failed anywhere, as I recall. Or the failure rate for that day was no higher than on any other average calendar date.
The OP can only be asking about comparisons to a hypothetical world where nobody dealt with it for critical systems.
That said, none of the bugs would have been critical to the operations of the services. Everything was in the billing systems and I think if unfixed it would have been more of a reputation hit than anything.
Also, "begs the question" doesn't mean what you think it means. https://en.wikipedia.org/wiki/Begging_the_question
> In modern vernacular usage, however, begging the question is often used to mean "raising the question" or "suggesting the question".
Unsurprisingly, humans are not good at accounting for black swan events, and even less so for averted ones.
Even if every other country has Y2K levels of success containing the Coronavirus, we can still point skeptics at the example of places like Italy to prove it was a real threat.
Things that are inevitable only when you encompass time spans longer than a human life (it has been approximately one and a half average human lifespans since the previous pandemic) may be predictable at that large aggregate scale, but on useful scales they are not. Or, to put it another way, if you've been shorting the market since 1918 for the next pandemic crash, you went bankrupt a long time ago.
Y2K is only a black swan for those not in the industry, since that one is obviously intrinsically timing based. The UNIX timestamp equivalent is also equally predictable to you and me, but to the rest of the world it will seem even more arbitrary if it's still a problem by then. (At least Y2K was visibly obviously special on the normal human calendar.) But I wouldn't claim the term for that; call it a bit of sloppiness in my writing.
+ because it was going to fall.
- Are you certain?
+ yes.
- but it didn't fall. You caught it. The fact that you prevented it from happening doesn't change the fact that it was going to happen.
Minority Report, 2002
https://www.youtube.com/watch?v=IVGQHw9jrsk
People worked for years in the late 1990s replacing systems that were not Y2K compliant with new ones that were.
It is becoming ever more common to question the veracity of disaster averted through effort. And it is very dangerous.
No, it isn't. Questioning whether Y2K was overhyped started before Jan. 1, 2000 and accelerated on Jan 1, 2000 when there weren't major breakdowns. If you are too good at mitigating a problem before it manifests, there's a good chance lots of people will doubt there was a problem to mitigate. On the other hand, if it's a sui generis problem like Y2K, it's by definition too late for their doubts to impact mitigation efforts for the one potential occurrence, so it doesn't matter all that much. For a recurring problem, where those doubts can impact preparedness for the next potential occurrence, that's a bigger challenge.
However, I feel that this tendency is getting worse. A denial of the role of expertise. The question "was Y2K real?" is a political issue now, and it's not because of Y2K specifically, it's as a comparison to more recent events.
People were talking about Y2K years ahead of time. Lots of changes to code were made. A few little bugs slipped through, but not many, and everyone knew how to fix them. No crisis. Without the many code changes, big problem.
In the end you end up in this perverse situation where you have to wait long enough for the public to understand what's at stake but not long enough that you won't be able to keep things under control. Quite the tightrope trick.
As someone who was forced to spend Y2K in a "prepped" cabin on the side of a mountain with two years of supplies buried underneath, I think you might overestimate the quality of the public's response to Y2K.
The public did not maturely understand that software needed to be updated and everything was OK.
There was some real panic out there. It was arguably the biggest "End Times" event of the modern era, definitely IMO surpassing "2012" and other "apocalypse panics".
The Y2k preppers and panic, I think, was the foundation for the modern prepper movement and the public's desire to flip from conspiracy to conspiracy to predict collapse.
I lived and worked as a software developer through the Y2K "crisis" (although I wasn't working on solving the crisis myself). Everyone was very worried about it. Nothing really went wrong in the end.
Was that because there was no problem? Or because everyone was worried about it and actually solved the problem? I don't think it's easy to tell the difference really.
That said, if we need to jump start the economy again, maybe we could come up with a fake problem this time. How about we say that 2025 will be a huge problem for many critical IT systems? Or does a timestamp produce a nice round number in the near future that would be a good sell?
edit: There is this: https://en.wikipedia.org/wiki/Year_2038_problem
From the customer's perspective, they would lose the use of their billing credit card, typically for a day, until the charges were reversed. This was less of an issue in 2000 than today as far fewer regular payments happened via credit card, but would still be a major disruptor.
Every system, every piece of hardware - both in the data centers and in the hospitals - had to be certified Y2K compliant in enough time to correct the issue. As I recall, we were trying to target being Y2K ready on January 1, 1999 but that date slipped.
A "Mission Control" was created at the Data Center and it was going to be activated on December 15, 1999, running 24 hours a day until all issues were resolved. Every IT staff member was going to rotate through Mission Control and every staffer was going to have to serve some third shifts too.
I left Columbia/HCA in June, 1999 after they wanted to move me into COBOL. I had no desire to do so and I took a programming position with the Tennessee Department of Transportation.
I remember my first day on the job when I asked my boss what our Y2K policy was. He shrugged and said "If it breaks, we'll fix it when we get back from New Year's".
What a difference!!!
I'm a little surprised. TDT is in a critical business too (transportation).
Their billing and management system was written in COBOL, and contained numerous Y2K bugs. If we did nothing, then the entire billing system would have collapsed. That would mean Welsh people either receiving no bills, or bills for >100 years of gas/water supply, depending on the bug that got triggered. Very quickly (within days) the system would have collapsed, and water/gas would have stopped flowing to Welsh homes.
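To make the "bills for >100 years" failure concrete, here is a hedged Python sketch of a billing-period calculation on YYMMDD dates. The date format, the function names, and the "assume 19xx" parsing are assumptions for illustration, not details of the actual COBOL system.

```python
from datetime import date


def parse_yymmdd_as_1900s(s):
    """Pre-remediation assumption: every two-digit year belongs to 19xx."""
    return date(1900 + int(s[:2]), int(s[2:4]), int(s[4:6]))


def parse_yyyymmdd(s):
    """Post-remediation: the field carries all four year digits."""
    return date(int(s[:4]), int(s[4:6]), int(s[6:8]))


def billing_days(previous_reading, current_reading):
    """Number of days to bill between two meter readings."""
    return (current_reading - previous_reading).days


old = billing_days(parse_yymmdd_as_1900s("991201"),
                   parse_yymmdd_as_1900s("000101"))
new = billing_days(parse_yyyymmdd("19991201"), parse_yyyymmdd("20000101"))

print(old)  # large negative number: "00" was read as 1900
print(new)  # the intended one-month period
```

Depending on how downstream code handles the sign, a negative period of that size could plausibly turn into either no bill at all or a bill covering roughly a century.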
Each field that had a date in it had to be examined, and every single piece of logic that referenced that field had to be updated to deal with 4 digits instead of 2.
I wasn't dealing with the actual COBOL, I managed an Access-based change management system that catalogued each field and each reference that needed to be changed, and tracked whether it had been changed or not, and whether the change had been tested and deployed. This was vital, and used hourly by the 200+ devs who were actually changing the code.
We finished making all the changes by about December 1998, at which point it was just mopping up and I wasn't needed any more. I bought a house with the money I made from that contract (well, paid the deposit at least).
The cost was staggering. The lowest-paid COBOL devs were on GBP100+ per hour. The highest-paid person I met was on GBP500 per hour, enticed out of retirement. They were paid that much for 6-month contracts, at least. Hyder paid multiple millions of pounds in contract fees to fix Y2K, knowing that the entire business would fail if they didn't.
Still less than the cost to rewrite all that COBOL. The original project was justified by sacking hundreds of accounts clerks, replaced by the COBOL system and hardware. By 1998 the hardware was out of date, and the software was buggy, but the cost-benefit of a rewrite made no sense at all. As far as I'm aware Hyder is still running on that COBOL code.
Except we didn't:
https://en.wikipedia.org/wiki/Year_2000_problem#On_1_January...
The answer to the first question is yes. There was a potential problem. However the companies and government departments that were affected had started planning in the early 90s, and they prepared during the decade. Many took the opportunity to embark on huge system upgrades. It was just one of many issues CIOs dealt with.
The answer to the second question is no. The huge disaster scares were not justified. Banks, airlines, insurance companies and government departments had already fixed their systems, just like they fix other problems.
What happened was that consulting companies, outsourcers and law firms suddenly realized there was a huge new market that they could scare into being. They started running campaigns aimed at getting work from mid size businesses.
The campaign took off because it was an easy issue for the media and politicians to understand. It also played into the popular meme that programmers were stupid. The kicker was the threat that directors who failed to prepare could be sued if anything went wrong. Directors fell into line and commissioned a lot of needless work.
In summary, there was the professional work carried out by big banks, airlines etc, generally between 1990 and 1997, and the panic-driven, sometimes pointless work by smaller firms in 1998 and 1999.
I can point to several huge companies who did nothing until 1998 or even 1999. The media scare helped with that (priority, money) a lot.
Because of all the preparation and upgrades being done, I think the only incident we had was when the Y2K migration manager sent out the "all clear" email after rollover: the Unix mail client he used formatted the date on the email as "01/01/19100", though I suspect he knew of the issue and didn't upgrade on purpose just to make a point.
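The comment doesn't say how that mail client built its date string, but "19100" is the classic signature of one particular bug: C's `struct tm` stores the year as years since 1900, and code that printed a literal "19" in front of that value worked until the value reached 100. A Python sketch of the pattern, with hypothetical function names:

```python
from datetime import date


def years_since_1900(d):
    """Mimic the tm_year field of C's struct tm (years since 1900)."""
    return d.year - 1900


def buggy_format(d):
    """Prints a literal '19' followed by years-since-1900."""
    return f"{d.month:02d}/{d.day:02d}/19{years_since_1900(d)}"


def fixed_format(d):
    """Adds 1900 arithmetically instead of gluing on a prefix."""
    return f"{d.month:02d}/{d.day:02d}/{1900 + years_since_1900(d)}"


print(buggy_format(date(1999, 12, 31)))
print(buggy_format(date(2000, 1, 1)))
print(fixed_format(date(2000, 1, 1)))
```

The buggy version is correct for every date through 31 December 1999, which is why it tended to survive until the rollover itself.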
Airplanes were probably never going to fall out of the sky at the stroke of midnight, but I personally fixed tons of bugs that had potential impacts in the dozens of millions of dollars.
I do realize it could've been a lot worse if not for the many efforts of people in the tech industry.
And in case you wonder: I would bet the same is going to happen with regards to 32-bit and 2038.
The Indian IT outsourcing industry was effectively created by the Y2K bug. Those companies did a large amount of the bug fixing.
They're still working on it now, and not fully prepared.
The BSDs fare much better on that, as most of them did this a long time ago.
Compare that to the CFC situation in the 80's. Scientists agree that the mitigating actions we took saved the ozone layer. Or compare it to the current global warming crisis. Scientists tell us that if we do nothing, we will suffer catastrophic climate change.
Media never tells you the truth, but the scientists usually do. So you listen to them.
We basically inventoried every unclassified computer system on the base. If it was commercial, off-the-shelf software that could be replaced we recommended they replace it. If it could not be replaced with a newer version (because it ran software that could not or would not be replaced) we replicated and tested it by changing the computer hw clock. In all cases we recommended shutting down the computer so it wasn't on during the changeover.
Most home-grown systems were replaced with commercial software.
One interesting case was a really old system, I think it had something to do with air traffic control. It was written by a guy who was still employed there and he was still working on it. I got to interview him a bunch of times and found the whole situation fascinating and a little depressing. Yes, he was storing a 2-digit year. He didn't know what would happen when it flipped. He didn't feel like there was a way to run it somewhere else and see what would happen (it's very difficult to remember but I think it was running on a mainframe in the comm squadron building).
The people in charge decided to replace it with commercial software. Maybe the guy was forced to retire?
Overall the base didn't have any issues but only because they formed the "y2k program management group" far enough ahead of time that we were able to inventory and replace most everything before anything could happen.
Then came 2000-Feb-29 and it happened: I had a risk management system hosted out of the UK that just didn't work. I had to file the system failure through to internal global management and domestic regulators.
I was thrilled. First because that system owner had refused to conduct global integrated testing so I could blame the SO. Had the request, negotiation, and finally the outright refusal in writing. The failed system was relatively trivial domestically. Risk wasn't calculated for one day on a global platform, and that risk didn't hit my local books. Ha ha sucks to be you. Most importantly, I was thrilled because I could point to the failure and say "see, that is what would have happened x100 if we hadn't nailed the project." It was a great example for all the assholes who bitched about the amount of money we spent.
Basically all of our software was written in COBOL, and most COBOL data is processed using what we'd consider today to be string-like formats. And to save space (a valuable commodity when DASD (aka hard drives) cost hundreds of thousands of dollars, and stored a few megabytes of data) two-digit dates were everywhere.
I started in 1991. The analysis had been done years before, and we knew where most of the 2-digit problems were, so it was just a matter of slowly and steadily evolving the system to use 4-digit dates where possible, or to shift the epoch forward where that made sense.
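To make the two options concrete, here is a small sketch of why 2-digit years break date arithmetic, and of the "windowing" trick I take "shifting the epoch forward" to mean. That reading is my assumption. The pivot value of 50 and the function names are made up for illustration; real systems picked their own pivot.

```python
PIVOT = 50  # hypothetical pivot: 00-49 means 20xx, 50-99 means 19xx


def expand_year(yy: int, pivot: int = PIVOT) -> int:
    # Map a stored 2-digit year onto a 100-year window.
    if not 0 <= yy <= 99:
        raise ValueError(f"not a 2-digit year: {yy}")
    return 2000 + yy if yy < pivot else 1900 + yy


def years_between_naive(start_yy: int, end_yy: int) -> int:
    # What the old code effectively did: subtract the 2-digit values.
    return end_yy - start_yy


def years_between_windowed(start_yy: int, end_yy: int) -> int:
    # Same calculation after expanding both years through the window.
    return expand_year(end_yy) - expand_year(start_yy)


# A five-year loan opened in '97 and maturing in '02:
print(years_between_naive(97, 2))     # -95
print(years_between_windowed(97, 2))  # 5
```

Windowing leaves the stored data as 2 digits, so it only postpones the problem to the edge of the window; converting to 4 digits is the permanent fix but means migrating the data.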
Every few months we'd deploy a new version of some sub-system which had changed, migrate all the data over a weekend, and cross off another box in the huge poster showing all the tasks to be done.
External interfaces were the worst. Interbank transfers, ATM network connections, ATM hardware itself, etc, etc. We mostly tried to switch internal stuff first but leave the APIs as 2-digit until the external party was ready to cut over. Similarly between our internal systems: get both ready internally, migrate all the data, and then finally flick the switch on both systems to switch the interfaces to 4-digit.
Practically, it meant that our development group (maybe 30 people?) was effectively half that size for 5 or 6 years in the early 90's, as the other half of the group did nothing but Y2K preparation.
All of these upgrades had to be timed around external partners, quarterly reporting (which took up a whole weekend, and sometimes meant we couldn't open the branches until late on the Monday after end-of-quarter), operating system updates, etc, etc. The operations team had a pretty solid schedule booked out years in advance.
We actually had two mainframes, in two data centers: one IBM 3090 and the other the equivalent Amdahl model. We'd use the hot spare on a weekend to test things.
It was a very different world back then: no Internet, for a start. Professional communication was done by magazines and usergroup meetings. Everything moved a lot slower.
I left that job before Y2K but according to the people I knew there, it went pretty well.
There's a very current equivalent - if we're good about social distancing, people may talk about COVID-19 the same way.
Just because it didn't happen doesn't mean it couldn't have.
Even in the late 80's I had to argue with some colleagues that we really shouldn't be using two-digit dates anymore.
I worked with 80-column punched cards in the 70's, every column was precious, you had to use two-digit years. When we converted to disc, storage was still small and expensive, and we had to stay with two-digit years.
First, enormous amounts of money were spent on repairs to the extent that they could be done. I know of some 50-year-old processes that didn't have the original source any longer. Significant consultant time was used in what at times resembled archeology.
Second, there was a little downturn in new projects after the turn, as budgets had been totally busted.
There was one consultant who preached doom and gloom about the collapse of civilization when that midnight came. He went so far as to move his family from NYC to New Mexico. He published on his web page all sorts of survivalist techniques and necessities. When the time came, his kids, who apparently didn't share the end-of-the-world view, woke him up and said "Dad!! New Zealand is dark!!!" but of course it wasn't.
The lesson there was that there was tunnel vision about how automated things actually were. While there were enormous systems with mainframes, Sun servers and workstations doing all this work, the tunnel vision produced a perception that excluded the regular human interactions with the inputs, outputs and operation of these systems. Not so fully automated after all.
There were a few disasters--I remember one small or medium grocery chain that had POS systems that couldn't handle credit cards with expiration dates beyond 12-31-1999 and would crash the whole outfit. The store was unable to process any transaction then until the whole thing was rebooted. They shortly went out of business.
This was a victim of its own success: since the work was largely completed in time, nobody had the huge counter-example of a disaster to justify the cost. I'm reminded of the ozone hole / CFC scare in the 1980s where a problem was identified, large-scale action happened, and there's been a persistent contingent of grumblers saying it wasn't necessary ever since because the problem didn't get worse.
There were a lot of two-digit dates out there which would have led to a lot of bugs. Companies put a lot of effort into addressing them, so the worst you heard about was a 101-year-old man getting baby formula in the mail.
The media over-hyped it, though. There was a market for books and guest interviews on TV news, and plenty of people were willing to step up and preach doom & gloom for a couple bucks: planes were going to fall out of the sky, ATMs would stop working, all traffic lights were going to fail, that sort of thing. It's like there was a pressure to ratchet things up a notch every day so you looked like you were more aware of the tragic impact of this bug than everyone else.
That's the part of the crisis that wasn't real, and it never was.
Leading up to the change over there was a lot of work to make sure all the systems would be OK, and that underlying software would also be OK, but keep in mind, auto-update on the Internet wasn't super common.
I ended up getting one call from a customer that night where they had a valid Y2K bug in their software, and since it wasn't in Red Hat's system, they moved along to their next support person to call :)
It was a thing, but much less of a thing because of the work put into getting ahead of it.
Also, one of the major results of the Y2K bug was that IT departments finally got the budgets to upgrade their hardware. If they had not gotten newer hardware I am sure there would have been more problems.
Finally, in my area the main reason companies fail from IT problems is that something goes wrong with their database and it turns out their backups are not good or have not been done recently. Many companies tried to be cheap and never updated their backup software, so even if they did back up their data, the backup software could really mess things up if it used 2-digit dates to track which files to update.
Things go very bad if you lose Payroll, Accounts Payable or Accounts Receivable.
Some historians seem to think that it was a real crisis in which the US pioneered solutions that were used across the world: https://www.washingtonpost.com/outlook/2019/12/30/lessons-yk...
The panic was also very real despite not being proportional to the actual problem, but just like any other media-induced widespread panic, it served as a means to make lots of profit for those in a position to do so. Media companies squeezed every last drop of that panic for ratings... well into the year 2000 when they started spreading the story that Y2K was the tip of the iceberg, and the "real" Y2K won't actually start until January 1, 2001.
As an immigrant to the US, I got to see the weird side of American culture in how people tend to romanticize (for lack of a better word) post-apocalyptic America. Kind of like the doomsday hoarders of today are doing. It's like they think a starring role on the Walking Dead is waiting for them, except in real life.
For the place I worked at (large international company) it was a G*d-sent opportunity. All the slack that had been built up in the past by "cost reducing" management suddenly had a billable cost position that nobody questioned.
Of course there were some actual Y2K issues solved in code and calculations, but by and large the significant part of the budget was spent on new shiny stuff, to get changes approved and to compensate workers for bonuses missed in the previous years.
We had a blast doing it, and the biggest letdown was following the year rollover from the date line and seeing nothing like the expected and predicted rolling blackouts.
YES. It was real.
I was finishing an engineering degree (CSE) in 1992 and several of my peers took consulting jobs to work on Y2K issues. For nearly a decade a huge amount of work was done to review and repair code.
Y2K is the butt of many jokes, but the truth is: it didn't happen because the work was done to fix it. Sort of ironic.
However, one issue I did run into nearly two years later was when UNIX time_t rolled over to 1 billion seconds. The company I worked with at the time was running WU-IMAP for their email server, plus additional patches for qmail-style maildir support. We came into work on September 10th 2001 and all the email on our IMAP server was sorted in the wrong order.
Turns out there was a bug in the date sorting function in this particular maildir patch (see http://www.davideous.com/imap-maildir/ - "10-digit unix date rollover problem"). I think we were the first to report it to the maintainer due to the timezone we were in. First time for me in identifying and submitting a patch to fix a critical issue in a piece of open source software! My co-worker and I were chuffed.
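I haven't looked at what the patch's sort function actually did, so treat this as a guess at the general shape of the problem rather than the real bug: if timestamps are compared as text, everything falls apart the moment the number of digits changes. The maildir-style filenames below are made up.

```python
# Hypothetical maildir-style names: "<unix seconds>.<pid>.<host>"
messages = [
    "1000000000.1235.mailhost",  # 2001-09-09 01:46:40 UTC, first 10-digit second
    "999999999.1234.mailhost",   # one second earlier, last 9-digit second
]


def arrival_time(name: str) -> int:
    # Parse the leading field as a number instead of comparing text.
    return int(name.split(".", 1)[0])


by_text = sorted(messages)                      # compares character by character
by_number = sorted(messages, key=arrival_time)  # compares the actual values

print(by_text)    # ['1000000000.1235.mailhost', '999999999.1234.mailhost']
print(by_number)  # ['999999999.1234.mailhost', '1000000000.1235.mailhost']
```

As text, "1000000000" sorts before "999999999" because '1' comes before '9', so every new message shows up ahead of the old ones.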
Of course, we swiftly forgot about it the next day when planes crashed into the NY World Trade Center.
But that couldn't be more wrong.
Keep in mind that it was also used as a significant contributing factor to replace a lot of major legacy IT systems (especially accounting systems) at big organisations (a lot of SAP rollouts in the late 90s had Y2K as part of the cost justifications).
The company I worked for ran a Y2K Remediation "Factory" for mainframe software - going through and changing to 4 digits, checking for leap year issues, confirming various calculations still worked.
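On the leap year checks: the classic mistake was remembering that century years are not leap years but forgetting that years divisible by 400 are. The sketch below is only an illustration of that rule, not code from any system we remediated, and the function names are made up.

```python
import calendar


def is_leap_buggy(year: int) -> bool:
    # Incomplete rule: treats every century year as a non-leap year.
    return year % 4 == 0 and year % 100 != 0


def is_leap(year: int) -> bool:
    # Full Gregorian rule: century years are leap only if divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


print(is_leap_buggy(2000))  # False, so 2000-02-29 would be rejected
print(is_leap(2000))        # True
print(is_leap(1900))        # False

# Cross-check against the standard library over a few centuries.
assert all(is_leap(y) == calendar.isleap(y) for y in range(1600, 2401))
```

Code with the incomplete rule had been correct for its entire working life, since 1900 really wasn't a leap year, and it only went wrong on 2000-02-29.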
I worked on a full system replacement that was partially justified on the basis of (roughly) we can spend 0.3x and do y2k patches, or spend X and get a new system using more recent technologies and UIs.
There were still problems, but they were generally in less critical systems, as the major systems had most likely been tested and were remediated or replaced.
Keep in mind that there was often much more processing that occurred on desktop computers (traditional fat client) - so lots of effort was also expended on checking desktop date rollover behaviour. One place I worked at had to manually run test software on every computer they had (tens of thousands) because it needed reboots and remote management was more primitive (and less adopted) at the time.
I worked in QA in one of their bank teller application development branch offices, so all I did for weeks was enter in date times between 99 and 00 into the software and test that the fixes were successful.
The unique thing about Y2K was that the problem was well understood and came with an actual deadline, so you could project manage around it.
Any normal bug couldn't be project managed this way, and you can't just throw interns at regular problems, whereas with Y2K, if you had the money, you could just assign people to look at every line of code for date handling.
In the UK, there were some medical devices (my memory says dialysis machines) that malfunctioned over the issue.
There is an important lesson about the behavior of the media in this. They whipped people into a survivalist, doomsday prepper frenzy over an issue that could be solved simply by updating BIOS, software, and/or hardware.
With that said, the effort was very expensive because so much software and hardware needed to be audited at every company.
Nobody. None.
Everybody got a good landing in the pilot's sense of a good landing being one you walk away from. Think of your boss, all the people you work with and ever have. And they all succeeded.
So crisis, no. No way everybody pulls it off if it were a real crisis. But damn, it made sales easy for consultants selling to all of the above, who spent big. "Planes will drop out of the sky!"
A very powerful way to sell is through fear. We got sold the Iraq war on weapons of mass destruction that could kill us in our beds here! And this was used and abused by consulting firms to make sales to managers and boards of directors who have no clue what a computer is and what it does and think hackers can whistle the launch codes etc. That fear-based sales job happened en masse and was a vastly bigger phenomenon than y2k. But having said that, there were so many people who bought the fear sales job that employed them that they still believe it. Many will post here about it and you can weigh it all up for yourself.
So yeah there were y2k issues, some got dealt with in advance, some didn't but nothing like the hype of 1999. Nothing like it.
So, it's a mix. It was real, it was well-handled, but there was also some hype, and even some hype that served a real (covert) good purpose.
He spent New Years in a DC of a big financial firm in NYC. Apparently the firm was so worried about a failure they shelled out big bucks to have UPS maintenance staff onsite during the cut-over "just in case".
Large disruptions in financial, real-time and other systems would have occurred if not for the effort applied.
Unfortunately some problems require a certain level of media awareness and/or hysteria before we devote the necessary resources to fix the problem before it becomes a crisis.
Just like the crisis we are currently facing in our health systems, it seems unlikely that we would have had enough IT resources to deal with the issues in real-time.
This is one of the cases of a "self-denying prophecy", much like acid rain. There was an issue, we collectively dealt with it (better yet, we actually anticipated it!), and now people are saying that in the end there was no issue.
https://www.bbc.com/future/article/20190823-can-lessons-from...
Of course though, because we had spent the previous few months setting clocks forward to see what broke, and fixing it.
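Setting the real clock forward was how it was done then. If I were testing the same thing in newer code, I'd probably pass the clock in as a parameter so the rollover can be replayed without touching the machine. That is a different technique from what we actually did, and the sketch below (with made-up function names and a made-up due date) is only meant to show the idea.

```python
from datetime import date, timedelta
from typing import Callable, Dict

Clock = Callable[[], date]


def days_until_due(due: date, today: Clock = date.today) -> int:
    # The code under test asks the injected clock for "today"
    # instead of reading the system date directly.
    return (due - today()).days


def run_rollover_check(due: date, start: date, days: int) -> Dict[str, int]:
    # Replay the same calculation with "today" set to each date in turn.
    results = {}
    for offset in range(days):
        fake_today = start + timedelta(days=offset)
        results[fake_today.isoformat()] = days_until_due(
            due, today=lambda d=fake_today: d
        )
    return results


report = run_rollover_check(date(2000, 1, 15), date(1999, 12, 30), 4)
for day, remaining in report.items():
    print(day, remaining)
```

The point is the same as moving the hardware clock: walk the date across the boundary and look for anything that jumps, goes negative or blows up.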
One of them was running calculations 24/7 for a research group at the university, and fortunately they were able to stop the jobs in time for an OS upgrade.
const DATE_COMPUTERS_DID_NOT_EXIST = /* arbitrary */;
/* snip */
if (Date::now() < DATE_COMPUTERS_DID_NOT_EXIST) {
    Computer::selfDestruct();
}
(See also: the Simpsons Y2K episode, which I think is a good representation of what many non-tech people believed would happen.)
I think it's a great lesson in the failings of the public imagination and should serve as a warning to not give in to moral panics.
Pre-Y2K I worked to fix loan systems that would have failed had their Y2K bugs not been fixed. Not getting a loan isn't accidentally launching a nuclear missile, but it affects your credit score and stops you buying a car.
Enough failures of this kind would have had a severe effect on the economy, up to and including causing an economic crisis.
I looked at every system, decided on the fix and coded it up, across Delphi, Access, SQL, VB, QBASIC and C++.
It was quite enjoyable and I was enjoying a glass of wine and a steak on the dreaded evening, which was a Friday. Not a single phone call until Monday morning when my boss said take a few days off but pay attention to my pager. I put it in the refrigerator :-)
Big companies (banks, insurance, health care) had elaborate contingency plans.
There were some failures, but nothing to disrupt life for 99.9% of the population, unless you call a website that says it's
January 5, 19100
a failure.
There have been other problems, such as day 10,000 on VAX and Unix systems. Some programs had problems, once again mostly because they didn't allow for the longer strings.
Climate sceptics often use these as an excuse. "Yes but there was sooo much hubbub about those and it proved to be nothing".
Well, yes it is nothing now because it was decisively and effectively handled. The ozone layer is still recovering but on a steady path there, and acid rain is reduced to the point of not really being an issue anymore (at least in the western world!).
Stuff really would have gone wrong with Y2K. Maybe not armageddon but yes. There was a problem.
Just a public service announcement that the decisions you make today for good reasons can, in retrospect, be seen as a huge mistake. The future view of your decisions will always have better knowledge than you have when you make that decision.
I remember visiting a smaller hotel in the UK early 2000 where the check-in terminal had a Y2000 Approved sticker with a serial number. That made sense, but in the room everything with a plug, including the hair dryer had such a sticker.
The media did sensationalize it with stuff like "Planes falling out of the sky!" but there would have been massive disruptions due to systematic date/time issues. Tons of money was spent testing systems to ensure they're Y2K compatible and if not these systems were either patched or hastily replaced with something that was.
I recall being in Seattle on New Years Eve and there were cops everywhere in full battle dress with armored personnel carriers and nobody was out partying like it was 1999, which was a shame because the weather was unusually mild.
this is basically ALL OF LIFE in IT OPERATIONS. lol.
It's certainly true an absurd amount of resources went into 'fixing' the problem.
Apply this to any crisis du jour: drugs/terrorism/climate/viruses etc...
Never let a Good Crisis go to waste.
I could see why people would be worried that banking software written in COBOL in 1983 would break and had to spend significant sums of money making sure it didn't. Since it was an extremely predictable problem with a specific, known, non-negotiable deadline everyone who had money to lose if there were a problem had plenty of time and incentive to prevent it.
(Shameless plug, here is my humorous take on it)
All I know is, I plan to retire by age ~57 (before the 2038 bug hits :) )
But actually most software uses epoch time or something similar. So the scope of the problem was much smaller than the news implied.
The trick to avoiding predictable crises is to actually do something before they happen.
Yeah, it was a big deal. Pretty much all dev work was done by me and one other guy. How much dev work could a school district have back then? Oh, lots. Every school, and in some cases individual educators, would send in requests for various reports, and each one required configuring a mainframe job, running it, and doing some kind of thing with the output (conversion to a "modern" format on a 3.5" disk or printing it or something). Every payroll cycle required a lot of manual labor, every report card cycle, there were daily student attendance jobs, and this particular district had a rather advanced, for the time, centrally-managed network across the entire district with Solaris DNS.
So on top of all this regular workload, we had to go over pretty much every single line of COBOL in that mainframe, visually, and search for Y2K-related bugs. There were many. The Solaris box needed to be patched too, and the first patches that came out weren't great and I didn't know what I was doing yet either.
So we started on this in earnest in Summer of 1997, while everyone was out of school. We ran a lot of test jobs, which involved halting all regular work, monkeying around with the mainframe's date, and then running a payroll job to see what blew up. By late 1999, my mentor there was pulling multiple all-nighters. He had a family of his own too and it really impacted his health.
There were mountains of greenbar printouts in our little office, all code, with bright red pen marks. Such was life when working on a mainframe at the time. The school district also brought out of retirement the guy who had written much of the key operating system components for our mainframe. I believe he came on as a consultant at rates that would be pretty nice even by today's standards.
In the end, school restarted after winter vacation and most things ran okay. A few jobs blew up where we had missed something here or there, but everyone by then had got sort of accustomed to the chaos and it just needed a phone call to say, "it broke, we're working on it, call you tomorrow".
Rough estimate, there was probably over a thousand hours' worth of labor to fix it all. Had that not been done, virtually nothing in that school district would have worked correctly by the beginning of 2000. (Some things started failing a month or two in advance, depending on the date arithmetic they needed to do.)
And these weren't just "year 100" printer errors; a lot of things simply barfed in some fancy way or another, or produced nonsense output, or -- in the most fun cases -- produced a lot of really incorrect data that was then handed off to other jobs, which then produced a lot of even more incorrect data, and then saved it all to the database.