I looked up Pinedo's background and she's not a developer; she's a social media manager. This is kind of what I figured because her perspective on development seemed really out of whack to me. There are many kinds of bugs that can't be measured with a simple stability calculation, and IME there are definitely error states that are worse than death (crashes). Plus 97% of teams are definitely not following agile principles. Every dev team I've ever been on said it was agile, and most of them just meant that there was a kanban board and something that vaguely approximated a sprint.
And I'm more for prioritizing not introducing new bugs than for fixing all the old ones. That is challenging on, what should we call it, "legacy" software, so the priority can and must be reversed temporarily when the "legacy" burden is too heavy.
So it's all very context dependent, and having nobody (or too few people) working on making things better when it's needed is not going to deliver any kind of velocity in the long term (and the short-term velocity is probably already way too low in those cases). Too bad for the mythical time to market...
So you have to be able to say no to bugfixes, but you certainly also have to be able to say no to the eternal rush of new half-baked features when needed. A perpetual short-term obsession with the "opportunity cost of not working on a feature" can yield quite paradoxical results if you try to build those features on some kind of zombie legacy code (that is only ever edited with disgust and great difficulty, but never seriously refactored).
Not only is this balance hard to achieve, but your role as a senior tech lead and project manager is certainly to consider the cleanup needs carefully and to advocate for them when needed, including by pushing back against feature-creep pressure. Because if you don't, most of the time nobody else will. As a tech lead, this means, among other things, that treating parts of the maintained software as a black box is out of the question (of course you can delegate, but even then it's imperative to stay in the loop for that purpose, only in less detail). Paradoxically, even if the quality is crap and the organization notices and tracks loads of bugs, most people will be happy once the bugs are triaged, assigned, and eventually "fixed" with more horrible garbage (that is, they are satisfied by the impression that something is being done) rather than doing the right thing, which is to organize a deeper cleanup of the software.
I've got the impression that it is rare to find projects where this balance is achieved correctly, but maybe that's only bad luck. Well, in lots of cases the famous ones (I'm thinking on the level of Linux, Firefox, Python, etc., not just your random niche software) are actually not that bad, and their competitors have a way shorter lifespan when they are not as balanced...
One of the toughest things I struggled with while transitioning from a
larval junior developer to a senior tech lead to a project manager was the
fact that (at least in the context of a for-profit business) not all bugs
need to be fixed, even the ones you personally think are really really
bad. The goal is to make money, not necessarily by producing the most
perfect software.
On the other hand, you have to remember that impact = risk x loss. And developers and managers are notoriously bad at evaluating the risk posed by a bug. It does no good for software to make $10 million a year for 5 years and then take a $100 million loss in the sixth because of a catastrophic bug that no one prioritized fixing because "It's been 5 years and no one's run into this bug yet."
There is also the matter of really picky customers; however, if a customer can articulate what you are doing wrong, I don't think that it's a bad thing if they are picky.
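To put numbers on the impact = risk x loss point made just above, here is a minimal Python sketch. The revenue and loss figures are the ones from that comment; the 2% yearly probability and the function names are made-up assumptions, purely for illustration.

```python
# Figures from the comment above; ANNUAL_RISK is an illustrative assumption.
ANNUAL_REVENUE = 10_000_000      # $10 million per year
CATASTROPHIC_LOSS = 100_000_000  # $100 million if the bug finally bites
ANNUAL_RISK = 0.02               # hypothetical: 2% chance per year of hitting the bug


def expected_annual_impact(risk: float, loss: float) -> float:
    """impact = risk x loss, expressed per year."""
    return risk * loss


def net_after_incident(good_years: int) -> int:
    """Cumulative result if the loss lands right after `good_years` profitable years."""
    return good_years * ANNUAL_REVENUE - CATASTROPHIC_LOSS


impact = expected_annual_impact(ANNUAL_RISK, CATASTROPHIC_LOSS)
print(f"expected impact per year: ${impact:,.0f}")
print(f"net after 5 good years and one incident: ${net_after_incident(5):,}")
```

Even with a small assumed risk, the expected yearly impact is a sizable slice of the yearly revenue, which is the part that "no one's run into it yet" hides.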
Quality should be in step with expectations; selling services that can't be delivered within a specific time slot is worse than trying to fix all of the bugs.
You should also probably be fixing the bugs where "If we're caught out by those third parties before the bugs are fixed, there's a good chance it could sink the whole company."
The overall point here isn't to use this as an excuse to not fix bugs, but rather that you should consider applying the same logic that devops/SRE teams use for "uptime and availability" (5 nines, etc.) to software stability, to help you move faster as a company.
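As a rough sketch of what that "nines" logic looks like in numbers, and how the same budget idea could carry over to stability: the function names and the one-million-session figure below are hypothetical, not taken from any particular SRE tool.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def availability(nines: int) -> float:
    """Availability target for a given number of nines, e.g. 3 -> 0.999."""
    return 1 - 10 ** -nines


def downtime_budget_minutes(nines: int) -> float:
    """Minutes per year the system may be down and still meet the target."""
    return MINUTES_PER_YEAR * 10 ** -nines


def error_budget(total_events: int, nines: int) -> float:
    """Same idea applied to stability: how many events (sessions, requests)
    may fail out of `total_events` while still meeting the target."""
    return total_events * 10 ** -nines


for n in range(2, 6):
    print(f"{n} nines ({availability(n):.3%}): "
          f"{downtime_budget_minutes(n):,.1f} min/year of downtime")

# Hypothetical volume: one million user sessions, three-nines stability target.
print(f"allowed failing sessions: {error_budget(1_000_000, 3):,.0f}")
```

As long as the team is inside the budget, it can keep shipping; once the budget is spent, stability work takes priority.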
They may or may not be right, but you can't just dismiss things you don't like by classifying them as a "fallacy."
If they're not noticeable, and the customers keep buying, despite 100% of them having them, how are they "important"?
>The business relies on complying with the rules of third-party organizations and the software is blatantly violating those rules right now.
That's a business decision. It could even be illegal (though still not that big of a deal, depending on circumstances), but it's not a bug.
Where I come from, if software doesn't meet requirements and instead produces incorrect results, it's considered a bug. To use another example from a different company I've worked for, I once found out that the financial reports generated by a particular piece of software were all completely wrong due to errors in the way calculations were done. I pointed this out to the higher-ups and they agreed that there was a bug and that all the existing financial reports were in significant error. They refused to let me fix it, not because we didn't have time (there was plenty, and I was otherwise free to work on pretty much whatever was in the backlog), but because fixing the bug would let the customers know that there was a bug in the first place. Some of this data would end up getting passed on to shareholders and the government. Is this not a bug? In the current case, the reason we're not complying with the rules was because of a bad architectural decision that wasn't properly cleared with anyone before it was implemented.
I guess I'm just not on board with the "just following orders" school of software development when the negligence involved rises to the level of illegality and scams. I'm happy to write a lot of software that I personally think is wonky or strange, but for me it stops when we start ripping people off and breaking the law.
Bugs aren't magic; they happen for a reason. It could be a broken dependency (unsupported versions, a fatal bug in a dependency, deprecation, configuration, etc.), a resource limitation (out of memory, security breach, etc.), or poor design that leads to poor implementation (logical errors, bad data abstractions).
Abstractly waving your hands and saying "we can't fix all bugs" doesn't feel right. Identify the underlying cause of the bugs and address that.
One solution is to reduce dependencies, increase resource allocation, and rely on a less rigid design. As a business grows, dependencies will increase, resource allocation will increase and the design will become more complex.
So, obviously, as engineers we have to say that it's context dependent - bugs have priorities. And sometimes bugs exist that you can't reproduce in a lab, have only occurred once in history, and you can't even be sure it wasn't some hardware glitch (because, well, hardware is buggy too)... So, the sane and reasonable thing to do is to let those go and spend our time somewhere where we're likely to make considerable and reasonable progress.
https://blog.acolyer.org/2017/05/29/an-empirical-study-on-th... says that they found 16 bugs in three formally-verified systems (including two bugs that didn't get caught because of bugs in the verifier). So, I'm pretty sure that bugs can happen. (Unless you mean that in certain critical systems, bugs can't be allowed to happen, in which case I agree.)
More, I'm pretty sure that most bugs don't happen for the reasons you list. I suspect that the majority of bugs are just poor implementation.
It's possible, however, for the opposite extreme to happen: The program wasn't buggy, but circumstances changed, and now it is. I'm thinking specifically of crypto code, which can be perfect... until a new attack is devised. Then the software is buggy, because it can't stop an attack that didn't exist when it was written.
When we do get software that has "no" bugs in critical systems, it's because of extreme care at every step: specification, design, implementation, review, and testing. Obsessive testing, and testing, and testing, and testing.
A random search tells me that "The mean DD for the studied sample of projects is 7.47 post release defects per thousand lines of code (KLoC), the median is 4.3 with a standard deviation of 7.99." ( https://ieeexplore.ieee.org/document/6462687/ )
So clearly if you are careful and use state of the art practices, this is very doable.
Not only is this doable, but various individuals and teams in history have reached far lower defect densities. Hey, for all practical purposes, TeX is bug free, for example.
If you are not able to write 100 lines of useful code without a bug in it (not infallibly, but at least often enough), maybe you should simply study and practice to get that ability.
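For a sense of scale, the defect densities quoted above can be converted into odds for a 100-line chunk. This is only a back-of-the-envelope sketch: it assumes defects are spread evenly and follow a Poisson distribution, which the cited study does not claim, and the function names are made up.

```python
import math

MEAN_DD = 7.47   # post-release defects per KLoC (mean quoted above)
MEDIAN_DD = 4.3  # post-release defects per KLoC (median quoted above)


def expected_defects(lines: int, defects_per_kloc: float) -> float:
    """Expected number of post-release defects in `lines` lines of code."""
    return lines / 1000 * defects_per_kloc


def p_defect_free(lines: int, defects_per_kloc: float) -> float:
    """Chance of zero defects, ASSUMING defects are Poisson-distributed."""
    return math.exp(-expected_defects(lines, defects_per_kloc))


for label, dd in (("mean", MEAN_DD), ("median", MEDIAN_DD)):
    print(f"{label}: {expected_defects(100, dd):.3f} expected defects per 100 lines, "
          f"P(defect free) = {p_defect_free(100, dd):.1%}")
```

Under those assumptions, a defect-free 100 lines is an ordinary outcome even at average density, not a heroic one.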
Yeah, they all SAY they practice agile, but a lot of them practice waterfall with agile naming conventions.
Saying that "97% of organizations practice agile development methods" makes it sound like it's overwhelmingly dominant, but it's not even possible to tell from these responses if a plurality of the teams who responded to a "State of Agile survey" use it, or what the most common development methodology is.
This is exactly the sort of contextless figure that you'll find torn apart in "How to Lie with Statistics".
> "JavaScript is more popular than ever, and over 69% of developers use JavaScript
Does anyone really believe that StackOverflow surveys gather a representative sample of all developers? This same survey found that 1 in 6 developers target the Raspberry Pi (more than iOS), and 7.5% use assembly language (more than Go, Objective-C, or VB.NET). It's an interesting survey but take it with a grain of salt.
The startup I'm working at right now told me "Agile" is overkill for us. But ironically it's more iterative than any company I worked at before that did sprints.
If it's a bug that 10 users are hitting because they were migrated from an earlier version incorrectly, sure it might be okay not to fix, but if 10 users are hitting it because they're legally blind and using an extraordinarily large font to use your product, it's crappy to say they don't deserve a fix. You have to understand what part of your userbase is hitting a bug and then decide from there.
Google started telling me some months ago that my browser is unsupported and random stuff has stopped working every few weeks since. I'm running circa 2015 Safari.
OT: does anybody know of a site with similar interesting content and discussion to HN, but with a fraction of sociopaths closer to that of the general world population?
https://www.nytimes.com/2018/05/04/books/review/automating-i...
This is a stretch. Seems like many people think of code as a living thing that just does what it wants, and us programmers have to beat it into submission.
The truth is, there can be bug free applications.
The problem I think is the complete opposite of the point of the article. Programmers need time to write good software. Without stopping to fix the things we run into, technical debt does what it's known for, and exponentially increases, and kills time that could be spent writing features.
So maybe software is like a living being in a way, that it needs to be cared for gently.
It's about time, it's about having managers that were programmers and not just managers and so on. Bug free is possible for sure.
It makes sense to consider the cost of having a given bug vs the cost of fixing it. Of course, such estimates will almost always be hand-wavey.
I would also say that when in doubt, fix it. A bug is, by definition, the software not doing what it's expected to do; I think it's better to make fewer promises and keep them. A user who encounters a bug loses trust in the software, and there's a tipping point where they abandon it. You might not know where that is.
You also might not realize what a bad day it could give someone, even if they're only one person. Eg, if you're an email platform and you have a bug that drops one email in a million, that might seem OK. But if missing that email gets someone evicted...
> It makes sense to consider the cost of having a given bug vs the cost of fixing it. Of course, such estimates will almost always be hand-wavey.
I agree with that approach in theory, but in practice it turns out that it's a lot easier to estimate the cost of fixing the bug than it is to estimate the cost of having the bug. As a result, in any ambiguous case, our bias will be toward keeping the bug, since the cost of having the bug is the impact of the bug multiplied by the probability of someone hitting it, and it's always easy to lowball those probabilities. "Oh, no one will notice that," or "Yeah, but that's a really obscure case." And then you find out that all it takes is one obscure case for your trading application to lose hundreds of millions of dollars a day. Or for hackers to breach your systems and make off with millions of credit card numbers. Or for malware to turn your IoT devices into a botnet.
While I may agree with this in the abstract, in practice most folks don't really know whether they're at that point. It also doesn't consider cumulative effects over time.
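The lowballing problem described above (cost of having the bug = impact x probability) is easy to see in a small sketch. All dollar amounts and probabilities here are hypothetical, as are the function names.

```python
def expected_cost_of_keeping(impact: float, probability: float) -> float:
    """Cost of having the bug = impact x probability of someone hitting it."""
    return impact * probability


def break_even_probability(fix_cost: float, impact: float) -> float:
    """Probability above which fixing is cheaper than keeping the bug."""
    return fix_cost / impact


def should_fix(fix_cost: float, impact: float, probability: float) -> bool:
    return expected_cost_of_keeping(impact, probability) > fix_cost


FIX_COST = 20_000     # hypothetical: easy to estimate
IMPACT = 100_000_000  # hypothetical: one very bad day

print(f"break-even probability: {break_even_probability(FIX_COST, IMPACT):.4%}")
for guess in (0.0001, 0.001):  # two "gut feel" estimates, a factor of ten apart
    print(f"estimated probability {guess:.2%} -> fix? {should_fix(FIX_COST, IMPACT, guess)}")
```

The decision flips between two guesses that nobody could tell apart by intuition, so when the impact is large the honest answer is usually to fix.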
Bugs don't just affect application stability or user experience. A system that does not behave as designed/documented/expected is a system that will be more difficult to reason about and more difficult to safely change. This incidental complexity directly increases the cost of building new features in ways difficult to measure. Further, new features implemented by hacking around unfixed flaws will themselves be more difficult to reason about and more difficult to change, exacerbating the problem.
The larger the system grows over time, the more people working on it over time, the faster this incidental complexity problem grows over time. At a certain point, it's too expensive to not fix the bugs because of the increasingly high cost of building new features. At that point, folks start clamouring for a rewrite, and the cycle begins anew.
The problem is: is your rewrite really going to be a full rewrite, or some kind of hybrid monster at the architectural level (of course, there is no problem in reusing little independent pieces, if any exist)? Because you can easily fall into all the traps of both approaches if project management has not mastered the technical side well enough...
Maybe it is a bug that only affects 1 out of every 10,000 customers. But if you get enough of those it can start to add up. Keeping track of them allows you to go to management with the data to support spending a sprint on bugfixes and code maintenance.
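Here is one way to show how those 1-in-10,000 bugs add up, as a sketch that assumes every bug hits customers independently and with the same probability (real bugs rarely behave that neatly). The function name is made up.

```python
P_SINGLE_BUG = 1 / 10_000  # each bug affects 1 out of every 10,000 customers


def fraction_hit_by_any(bug_count: int, p: float = P_SINGLE_BUG) -> float:
    """Fraction of customers hit by at least one of `bug_count` bugs,
    ASSUMING the bugs strike independently: 1 - (1 - p)^n."""
    return 1 - (1 - p) ** bug_count


for n in (1, 10, 100, 1000):
    print(f"{n:>5} open bugs -> {fraction_hit_by_any(n):.2%} of customers affected")
```

A table like this, built from the tracked bug count, is the kind of data that supports asking for a maintenance sprint.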
Which of these two is more important?
- 0.1% of my users lose their data irrecoverably
- 30% of my users get an error page and have to refresh
The article never waded into this at all, which is disappointing. I don't feel like I learned anything.
There are some categories of bugs that must always be fixed, regardless of how infrequently users run into them - security, privacy, accessibility, data loss.
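A minimal triage sketch that combines the two points above: the must-fix categories come first no matter how few users are affected, and everything else is ordered by how many users hit it. The `Bug` type, the "annoyance" category, and the sort rule are assumptions for illustration, not anyone's actual process.

```python
from dataclasses import dataclass

# Categories named above as always worth fixing.
MUST_FIX = {"security", "privacy", "accessibility", "data loss"}


@dataclass(frozen=True)
class Bug:
    title: str
    category: str
    affected_fraction: float  # share of users who hit the bug


def triage(bugs: list[Bug]) -> list[Bug]:
    """Must-fix categories first, then by share of users affected (descending)."""
    return sorted(bugs, key=lambda b: (b.category not in MUST_FIX, -b.affected_fraction))


bugs = [
    Bug("error page, user has to refresh", "annoyance", 0.30),
    Bug("data lost irrecoverably", "data loss", 0.001),
]
for bug in triage(bugs):
    print(f"{bug.affected_fraction:>6.1%}  {bug.category:<10} {bug.title}")
```

A plain frequency ranking would put the refresh bug on top; the category check is what keeps the rare data-loss bug from being buried.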
There are also cases where many low-impact bugs all share a common root cause - the value of fixing any one bug is low, but the combined value of fixing all current occurrences and preventing all future ones is high. Enforced static analysis tools (like Error Prone for Java) and libraries/frameworks with safety checks (autoescaping template languages, polyfills, etc.) are a great way to address these long-tail bugs. I generally write a new compiler error after encountering the same bug class three times.
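As an illustration of turning a recurring bug class into an enforced check, here is a small sketch using Python's standard `ast` module. The bug class (mutable default arguments) and the function name are chosen only as an example; in Java the equivalent would be a custom check in a tool like Error Prone.

```python
import ast


def find_mutable_defaults(source: str) -> list[tuple[str, int]]:
    """Return (function name, line number) for every list/dict/set literal
    used as a default argument value in `source`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Positional defaults plus keyword-only defaults (None = no default).
            defaults = node.args.defaults + [
                d for d in node.args.kw_defaults if d is not None
            ]
            for default in defaults:
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    findings.append((node.name, default.lineno))
    return findings


SAMPLE = '''
def add_tag(tag, tags=[]):
    tags.append(tag)
    return tags

def safe_add_tag(tag, tags=None):
    return [tag] if tags is None else tags + [tag]
'''

for name, line in find_mutable_defaults(SAMPLE):
    print(f"line {line}: mutable default argument in {name}()")
```

Run in CI and made to fail the build on any finding, a check like this prevents every future instance of the bug class instead of fixing them one at a time.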
* Not all are worth fixing, that is, the (financial) upside of their being fixed is too low compared to the effort required.
* And it's okay, that is, it's something we have to accept and live with, though it's not really nice or satisfying.
Myth of Five Nines - why high availability is overrated https://www.iheavy.com/2012/04/01/the-myth-of-five-nines-why...