
0 points · chriskrycho · 2y ago · 0 comments

The trick is that in many cases the value delivered is invisible and unmeasurable. How do you quantify “time saved by not having bugs”? But that is what great maintenance does. Or, the same for “time saved by a really well-designed API that makes it easy to do the right thing and hard or impossible to do the wrong thing”? Again: not measurable! “Just put a number on it” is the kind of facile response I consistently get from too many folks in management when trying to have these kinds of discussions, and the annoying-but-inescapable reality is that it is not always possible to provide a monetary number on the value of this sort of work. Despite that value often very likely netting out in the millions or more every year!


5 comments · 1 top-level

Archelaos · 2y ago · 4 in thread

> ... it is not always possible to provide a monetary number on the value of this sort of work. Despite that value often very likely netting out in the millions or more every year!

Hm. You first state that it is not possible to provide a monetary number, then you state that it is very likely netting out in the millions, which is providing a monetary estimate.

tomjakubowski · 2y ago

If the value of one thing is somewhere in the range of $1-$100, then its value is hard to quantify.

But if you have one million of those things, you can still say "it's very likely I have value in the millions of dollars or more here".

The same logic applies here. All that has to be true for "we have millions here" to be plausible is that (1) the value of each individual, unquantified contribution is positive and probably >$1, and (2) there are probably millions of such contributions. You don't have to be able to quantify any individual contribution with any precision.
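To make that bound concrete, here is a minimal sketch of the arithmetic. The function name `total_value_bounds` is hypothetical, and the $1-$100 range and the count of one million are simply the illustrative figures from this comment, not measurements.

```python
def total_value_bounds(count: int, low: int, high: int) -> tuple[int, int]:
    """Bound the total value of `count` items, each worth between `low` and `high`.

    No individual item is ever valued precisely; only the range is used.
    """
    return count * low, count * high


# Illustrative figures from the comment: one million items worth $1-$100 each.
low_total, high_total = total_value_bounds(1_000_000, 1, 100)
print(f"Total is between ${low_total:,} and ${high_total:,}")
```

The per-item uncertainty spans two orders of magnitude, yet the lower bound on the total is already a million dollars.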

lovich · 2y ago

Then put on the imprecise number? If you are looking for precision in estimates for the impact of projects you worked on, the vast majority of hiring managers reading resumes won’t care. They’re already going to be mentally sorting impact into broad buckets.

chriskrycho (OP) · 2y ago

This is a totally reasonable response! So let me elaborate a little on how these things can be true at the same time.

1. Imagine a scenario where there are two versions of an API: one is bug-prone, the other is “correct by construction”—you literally cannot call it the wrong way.

2. Assume that some percentage of the “invalid” calls to the bug-prone API end up going wrong in production and taking 3 developers an hour to resolve. (This kind of level-of-effort is not at all unusual in my experience dealing with on-call at both a mid-sized startup and at the scale of LinkedIn!) Let’s call it 10% to pick a reasonably small number: only 1 out of 10 bad invocations of this API puts us here.[1]

3. Assume the API is fundamental to some key library (a JS framework you use, for example), so the calls are proportional to the size of the code base. Again, pick a fairly low number: 1 mistaken call every 10,000 lines of code. If we are looking at LinkedIn’s front-end, that puts us on the order of well over 10 of these that actively cause this problem (over a million lines of code with a 0.01% “hit” rate and a 10% “blows up” rate).

4. Further take an average developer compensation of $150,000/year. (This is low for big tech, but again, it gives us a useful baseline.) This is ~$75/hour.

Put those together, and you’re talking about 10 incidents × 3 developers × 1 hour/incident × ~$75/hour/developer = $2,250. That’s one repeated bug over the lifetime of the program in question.[2] That excludes the other potential business costs there: what happens if that also impacts revenue in some way, say because it prevents sales, or means lost ad revenue, or results in an SLA violation?

Add that up across the whole surface area of a codebase—dozens and dozens of bugs, across however many users and lines of code—and you’re talking real money. A million dollars is just 450 of those kinds of bugs with similar “blast radius” and occurrence rate. This is the kind of rough mental math that leads me to talk about “netting out in the millions” benefit-wise. Thus far you could imagine “putting a number on it”.
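For anyone who wants to check the rough mental math, here is a minimal sketch that recomputes it from the inputs as stated in steps 2 to 4 (one mistaken call per 10,000 lines, a million lines, a 10% blow-up rate, 3 developers for 1 hour, $150,000/year). The 2,000 working hours per year used to get to ~$75/hour is my assumption, and all variable names are hypothetical.

```python
import math

# Inputs taken from the numbered steps above.
LINES_OF_CODE = 1_000_000
LINES_PER_MISTAKEN_CALL = 10_000   # 1 mistaken call every 10,000 lines
MISTAKES_PER_INCIDENT = 10         # 10% of mistaken calls blow up in production
DEVS_PER_INCIDENT = 3
HOURS_PER_INCIDENT = 1
ANNUAL_COMPENSATION = 150_000
WORK_HOURS_PER_YEAR = 2_000        # assumption: roughly 50 weeks x 40 hours

hourly_rate = ANNUAL_COMPENSATION // WORK_HOURS_PER_YEAR   # 75
mistaken_calls = LINES_OF_CODE // LINES_PER_MISTAKEN_CALL  # 100
incidents = mistaken_calls // MISTAKES_PER_INCIDENT        # 10

# Developer time only; excludes lost revenue, SLA penalties, and so on.
cost_per_bug = incidents * DEVS_PER_INCIDENT * HOURS_PER_INCIDENT * hourly_rate

# How many bugs of this size add up to a million dollars?
bugs_per_million = math.ceil(1_000_000 / cost_per_bug)

print(f"incidents per bug: {incidents}")
print(f"cost per bug: ${cost_per_bug:,}")
print(f"bugs needed to reach $1M: {bugs_per_million}")
```

With these inputs each repeated bug costs $2,250 in developer time, and about 445 such bugs reach a million dollars, which lines up with the "450" figure.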

Where it goes wrong is: with the good version of that API, the bug never happens. There is nothing to measure, because our reasoning has to deal entirely in counterfactuals: “What would it have cost us if we had a bug in this particular part of the framework?” But you can do that ad infinitum.

More or less every part of a library can be more or less buggy, more or less easy to maintain, more or less amenable to scaling up to meet the needs of an application which uses it, more or less capable of adding new capabilities without requiring you to rewrite it, etc. The part that is impossible to measure is the benefit of all the “right” decisions along the way: the bugs you never saw, indeed never even had to think about because the API just made them impossible in the first place.

Nor can you measure “this API is easy to use and never breaks my flow” vs. “I spend at least a minute looking up the details every time I have to use it… and whoops, now I’m on Reddit because I switched to my browser from my code editor”. Nor can you measure the impact of “This API makes me angry” vs. “This API makes me actively happy” on velocity. The closest you get are proxy measures like NSAT surveys which tell you how developers feel overall and interviews where you can ask them what their papercuts are; but neither can be translated into dollar values in a meaningful way. And “putting on the imprecise number” (as a sibling comment down the thread suggests) is impossible for these kinds of things: there is no number.

[1]: Lest you think I am gaming this, I have real APIs we really deal with in mind which are so error prone that we deal with bugs like this from that specific API at least once a month.

[2]: Off the top of my head, I can think of half a dozen APIs we use very actively in production which have these kinds of problems. I have eliminated a fair number of them in my tenure, but demonstrating the impact is… well, see above.

Archelaos · 2y ago

I am still not convinced that your examples show that it would be unreasonable to estimate their monetary impact.

Since a company is an organic whole, every functional part of it would net out in the millions if considered in isolation. However, such a perspective is usually without practical significance, as it is not linked to concrete business-relevant scenarios. If there is no risk[1] that something that functions properly could fail, then the costs associated with such a failure never occurring are irrelevant.

I would also concede that many things cannot be estimated accurately or might be very hard to estimate. But in my experience, the really difficult decisions are those that relate to new big and complex things, such as what technology to use for an innovative product. Evaluating whether it is worth improving a specific detail of an existing application is most of the time far less difficult.

Let me give you an example from my current work: It is a business application to process customer enquiries that result in an offer for a tailored product. In a specific scenario we know that we can process 5 enquiries per hour. The goal is to process 6, an increase of 20%. There are about 6,000 enquiries per month, meaning a saving of 200 staff-hours per month. The hourly costs for software development are about 4 times the costs for the staff using the application. That means that for every 50 hours it would take me to reach that goal, the break-even point would move by one month. I estimated that I could reach the goal by putting between 100 and 150 hours into it. This precision was enough to get the green light from the management.[2] And management does not really care how I reach that goal in detail (by improving the performance of the database, by reworking the user interface, by using better templates as a basis for the tailored products, ...). And even if my estimates were off by a factor of 2 or 3, it would still be worthwhile to attempt the improvement.
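The break-even reasoning above can be sketched in a few lines. The function name `break_even_months` and its parameters are hypothetical; the default numbers are the ones from this example, and costs are expressed in staff-hour equivalents rather than currency.

```python
def break_even_months(
    dev_hours: float,
    enquiries_per_month: float = 6_000,
    current_rate: float = 5,        # enquiries processed per staff-hour today
    target_rate: float = 6,         # enquiries per staff-hour after the change
    dev_cost_ratio: float = 4,      # one developer hour costs ~4 staff hours
) -> float:
    """Months until the staff time saved pays back the development effort."""
    hours_now = enquiries_per_month / current_rate
    hours_after = enquiries_per_month / target_rate
    staff_hours_saved_per_month = hours_now - hours_after
    dev_cost_in_staff_hours = dev_hours * dev_cost_ratio
    return dev_cost_in_staff_hours / staff_hours_saved_per_month


for hours in (100, 150, 450):  # the estimate range, plus a 3x overrun
    print(f"{hours} dev hours -> break-even after {break_even_months(hours):.1f} months")
```

At 100 to 150 development hours the change pays for itself in two to three months, and even a threefold overrun on the upper estimate breaks even in nine. Passing a different `enquiries_per_month` shows how strongly the result depends on volume.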

Regarding your case about the quality of an API, I cannot see a fundamental difference from the case I just described. Set some time and/or quality goals for the improvement of the API and attach a reasonable price point to everything. Then see whether it makes economic sense at all, whether something else promises a better return on investment, or if this is the best thing to do now.

Finally, I would like to emphasise that the correct thing to do is never purely a question of technology. Notice, for example, how the assessment of the case I described above changes with the number of enquiries per month. Were there only 1,000 enquiries per month, the break-even point would be 6 times further away, which means that there would probably be many other areas more worthwhile for the company to invest its money in (and not all in IT).

[1] More precisely, the risk is seen as marginal or irrelevantly small, or if it occurs there is no way to manage it anyway (a meteorite hits the factory), or circumstances are so fundamentally changed that the entire business model is called into question.

[2] Actually it was the other way round: The management came up with the idea to improve the process by 20% and already had some suggestions for how to do it. Then I looked at it and gave my rough estimates and my own suggestions for what could be done.
