Yes, it’s inefficient. Yes, some people want that!
If she had 0, she ran the risk of turning customers away and losing money. Any more than 1 is excess waste. Having just 1 meant she’d served every possible customer and only “wasted” 1 slice.
Eg at Google (this was ten years ago or so), we could always spend leftover networking capacity on syncing a tiny bit faster and more often between our data centres. And that would improve users' experience slightly, but it's also not something that builds up a backlog.
At a factory, you could always have some idle workers sweep the floor a bit more often. (Just a silly example, but there are probably some tasks like that?)
For example, you can assign priorities to the loads on your systems, so that you can shed lower priority loads to create some slack for emergencies, without having to run your system idle during lulls.
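To make the priority idea concrete, here is a minimal sketch of load shedding. The load names, priorities, and capacity numbers are all made up for illustration; a real scheduler would be more involved.

```python
from dataclasses import dataclass


@dataclass
class Load:
    name: str
    priority: int  # lower number = more important (assumed convention)
    demand: int    # capacity units requested


def shed_loads(loads, capacity):
    """Admit loads in priority order while they fit; shed the rest."""
    admitted, shed = [], []
    remaining = capacity
    for load in sorted(loads, key=lambda l: l.priority):
        if load.demand <= remaining:
            admitted.append(load)
            remaining -= load.demand
        else:
            shed.append(load)
    return admitted, shed


# Hypothetical workload: nothing sits idle during a lull,
# because low-priority work soaks up the spare capacity.
loads = [
    Load("user traffic", priority=0, demand=50),
    Load("batch analytics", priority=2, demand=30),
    Load("replication sync", priority=1, demand=20),
]

normal_admitted, normal_shed = shed_loads(loads, capacity=100)
# Emergency: capacity drops, so the lowest-priority load is shed.
emergency_admitted, emergency_shed = shed_loads(loads, capacity=70)

print([l.name for l in normal_shed])
print([l.name for l in emergency_shed])
```

The slack here is not idle hardware; it is the work you are willing to drop.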
I get what the article is trying to say, but they shouldn't write off optimisation as easily as that.
So you’re fixing the micro economics of the queue but not the macro. Queues still suck when they fill up, even if they fill with last minute jobs.
Eg if you are running video conferencing software, and all of a sudden you are having bandwidth problems, you typically first want to drop some finer details in the video, and then you want to drop the audio feed.
In any case, if you dropped something, you leave it dropped, instead of picking it back up again a few seconds later. People don't care about past frames.
(However, queuing instead of outright dropping can still make sense in this scenario, for any information that's younger than what human reaction times can perceive.)
Similarly in your scenario, you'd want to explicitly communicate to people what the expectations are. Perhaps you give out deep discounts for tasks that can be dropped (that's what eg some electricity providers do), or you can give people 'insurance' where they get some monetary compensation if their task gets dropped. (You'd want to be careful how you design such a scheme, to avoid perverse incentives. But it's all doable.)
> So you’re fixing the micro economics of the queue but not the macro. Queues still suck when they fill up, even if they fill with last minute jobs.
I don't know, I had pretty positive experiences so far when eg I got bumped off a flight due to overbooking. The airline offered decent compensation.
Overbooking and bumping people off _improves_ the macro situation: despite the occasional compensation you have to pay when, unexpectedly, everyone who booked actually shows up, overbooking still makes the airline extra money, and via competition this is transformed into lower ticket prices. Many people love lower airfares, and have shown a strong revealed preference for putting up with a lot of the stuff that eg Ryanair pulls, as long as they get cheap tickets.
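A back-of-the-envelope version of the overbooking argument, as a rough sketch. Every number here (100 seats, 90% show-up rate, $100 tickets, $300 compensation per bumped passenger) is an assumption picked for illustration, and passengers are assumed to show up independently.

```python
from math import comb


def expected_net_revenue(tickets_sold, seats, p_show, price, compensation):
    """Ticket revenue minus expected compensation for bumped passengers.

    Show-ups are modelled as Binomial(tickets_sold, p_show).
    """
    expected_bumped = 0.0
    for shows in range(seats + 1, tickets_sold + 1):
        prob = (
            comb(tickets_sold, shows)
            * p_show ** shows
            * (1 - p_show) ** (tickets_sold - shows)
        )
        expected_bumped += prob * (shows - seats)
    return tickets_sold * price - compensation * expected_bumped


# Hypothetical flight: sell exactly 100 tickets vs. overbook by 5.
no_overbooking = expected_net_revenue(100, 100, 0.9, 100.0, 300.0)
overbooking = expected_net_revenue(105, 100, 0.9, 100.0, 300.0)

print(no_overbooking, overbooking)
```

Under these made-up numbers the extra ticket revenue outweighs the expected compensation. Whether that holds for a real airline depends on the real show-up rates and payouts.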
There’s no room to absorb shocks. We saw a drastic version of this during the COVID-19-induced supply chain collapse. Car manufacturers had built manufacturing so close to 100% just-in-time that they couldn’t absorb chip shortages, and it took them years to get back up.
It also leaves no room for experimentation. Any experiment can only happen outside the system, not from within it.
1. Firms compete
2. Firms either increase their efficiency or die
3. Efficient firms are more susceptible to shocks
4. Firm shutdown and closures are themselves shocks
5. Eventually the system reaches a critical point where the aggregate susceptibility is higher than the aggregate of shocks that will be generated by shutdowns and closures
6. Any external shock will cause a cascade
There's essentially a "commons" where firms trade susceptibility for efficiency. Or in other words, susceptibility is pooled while the rewards for efficiency are separate.
A species will specialise for a niche, and outcompete a generalist. But when conditions change, the generalist can adapt and the specialist suffers.
Something you personally (in your head) believe to be a general law, or rule, or truth (canon). It's roughly synonymous with "mental model".
A cannon is a weapon.
1. Firms compete
2. Some firms get ahead
3. Accrued advantages to being ahead amplify
4. A small number of firms dominate
5. New competition is bought or crushed
6. Dominant firms become less efficient in competition-free environment
There is an odd corollary, which is that capitalistic systems which reward efficiency gains and put downward pressure to incentivize efficiency, deal with the resilience problem by creating entirely new subsystems rather than having more robust subsystems, which is fundamentally inefficient.
Is what you’re saying that capitalism breaks down resilience problems into efficiency problems?
I think that’s an extremely motivating line of thinking, but I’ll have to do some head scratching to figure out exactly what to make of it. On one hand, I think capitalism is really good at resilience problems (efficient markets breed resilience, there’s always an incentive to solve a market inefficiency), on the other (or perhaps in light of that) I’m not so sure those two concepts are so dialectically opposed.
I don't know what you mean by reverse.
Besides, angels can't really balance on pinheads.
Among his notable accomplishments, he and coauthors mathematically characterized the propagation of signals through deep neural networks via techniques from physics and statistics (mean field and free probability theory), leading to arguably some of the most profound yet under-appreciated theoretical and experimental results in ML in the past decade. For example see “dynamical isometry” [1] and the evolution of those ideas which were instrumental in achieving convergence in very deep transformer models [2].
After reading this post and the examples given, in my eyes there is no question that this guy has an extraordinary intuition for optimization, spanning beyond the boundaries of ML and across the fabric of modern society.
We ought to recognize his technical background and raise this discussion above quibbles about semantics and definitions.
Let’s address the heart of his message, the very human and empathetic call to action that stands in the shadow of rapid technological progress:
> If you are a scientist looking for research ideas which are pro-social, and have the potential to create a whole new field, you should consider building formal (mathematical) bridges between results on overfitting in machine learning, and problems in economics, political science, management science, operations research, and elsewhere.
[1] Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
http://proceedings.mlr.press/v80/xiao18a/xiao18a.pdf
[2] ReZero is All You Need: Fast Convergence at Large Depth
https://books.google.co.uk/books/about/Tracts_N_50_Antidote_...
As for society itself being robust, it’s a much harder property. Being robust is nice but no one actually wants to live in a metered society where there are insufficient resources - they’d generally rather kill for resources greedily and let others fail without helping them. That’s why socialized healthcare struggles - while it guarantees a minimum of care for everybody, the care provided has longer wait times and most people are not willing to wait their turn.
Healthcare is more complicated. It can never work as an efficient free market since nobody goes comparison shopping for the hospital with the best value-for-money when they have a car crash. That's why socialized healthcare achieves much better results per dollar spent. But it's often hamstrung by attempts at efficiency.
I think a better societal example is disaster relief: helping people back up after they have been hit by a hurricane is the humane thing to do, but how much is that encouraging people to settle in high risk areas with insufficient precautions?
(This is quite unlike the common view that businesses inevitably grow to take over the world.)
I.e. business is much like a living organism.
Problems set in when the government bails out failing businesses.
Even worse are government "businesses". They are not allowed to fail, and the inefficiencies, parasites, corruption, grow and grow. When can you remember a government agency being abolished? Eventually, the government will collapse.
And the owners could have sold when the business was propped up by unknown fragility.
Human lives are too short for these kinds of feedback loops to be all that effective.
Maximizing efficiency in the short-term is not the same as maximizing survival in the long term.
[0] https://www.sciencedaily.com/releases/2017/09/170908205356.h...
This is something that Nassim Taleb and the people working on https://realworldrisk.com/ have been saying for decades already.
Highly optimized systems take full advantage of their environment and rely on a high degree of predictability in order to avoid redundant operations.
These systems minimize the free energy in the system, and so very little free energy is available to counteract new forces introduced to the environment which act on the system.
You'll find parallels in countless domains, since the very basis for learning and stabilization of a system revolves around becoming more or less sensitive to a given stimulus. Examples could be attention, supply chain economics, institutions, etc.
The only solution I see is for the FDA to include supply reliability in its determination of whether a system is acceptable.
>capital concentration increases
>expectations for what capital owners can do with money increase
>expectations exceed available capital
>investment returns must increase (race to the top)
>cooperation among capital owners must increase to get better returns
>capital owning group begins to self-select and become less diverse, if this wasn't already caused by the background/personality required to accrue capital
>investment theory converges on a handful of "winning" ventures
>because this is where capital is flowing, workers are forced to divert to these ventures
>competition increases, hyperspecialization increases
>expertise in and sophistication of other areas begins to decline, causing quality decline, garnering less investment; feedback loop
-----
*debt cannibalizes future productivity
-----
)diversity in capital ownership and management increases likelihood of diversity in investment venture target
)increased competition, increased likelihood that ventures will cover needs, decreased likelihood of overweighting in one area/overproduction
)solution: capital redistribution. Perhaps globally
It does both, eg. if the environment is stable then fitness is correlated with efficiency, if the environment is unstable then it's robustness.
It's tempting to minimize waste, but excess capacity is required to adapt if things are evolving quickly.
His main thesis is that very high performance (which he defines as efficacy towards a known goal plus efficiency) and very high robustness (the ability to withstand large fluctuations in the system) are physically incompatible.
…what about humans? We’re far more efficacious than any other animal, and far more capable of behavioral adaptation. Plus, isn’t “physically impossible” a computer science argument, not a biological one? Unless we’re using the OG “physis”==“nature”, I guess
I know I'd tolerate a digital experience of far lower fidelity (fewer pixels, for instance, or even giving up GUIs altogether) if I could get it in a way that doesn't break every time some far away person farts near a cloud console: A trade of performance for robustness.
Why do you think this?
Translation to laymen: ML is being analogized to the mathematical structure of signaling between entities and institutions in society.
Mathematician proposes that a problem that plagues one (overfitting in ML, the phenomenon by which a neural network's ability to generalize is negatively impacted by overtraining, so the functions it can emulate are tightly coupled to the training data) must plague the other.
In short, there must be a breakdown point at which overdevelopment of societal systems or signaling between them makes things simply worse.
I personally think all one need do is look at what would happen if every system were perfectly complied with to see we may already be well beyond that breakpoint in several industrial verticals.
Deep Network | x_{i+1} = F(x_i)
Residual Network | x_{i+1} = x_i + F(x_i)
Deep Network + Norm | x_{i+1} = Norm(F(x_i))
Residual Network + Pre-Norm | x_{i+1} = x_i + F(Norm(x_i))
Residual Network + Post-Norm | x_{i+1} = Norm(x_i + F(x_i))
ReZero | x_{i+1} = x_i + α_i F(x_i)
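For anyone who prefers code to notation, the six update rules above can be sketched like this. `F` is a stand-in layer (an elementwise tanh, picked arbitrarily), and `norm` is a simplified layer norm with no learned scale or shift; both are assumptions for illustration, not what any particular paper or library uses.

```python
import math


def F(x):
    # Stand-in for a layer's transformation (hypothetical: elementwise tanh).
    return [math.tanh(v) for v in x]


def norm(x, eps=1e-6):
    # Simplified layer norm: zero mean, unit variance, no learned parameters.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def add(a, b):
    return [u + v for u, v in zip(a, b)]


def deep(x):
    return F(x)


def residual(x):
    return add(x, F(x))


def deep_norm(x):
    return norm(F(x))


def pre_norm(x):
    return add(x, F(norm(x)))


def post_norm(x):
    return norm(add(x, F(x)))


def rezero(x, alpha):
    # alpha is a per-layer scalar; in a real model it would be learned.
    return add(x, [alpha * v for v in F(x)])


x = [0.5, -1.0, 2.0]
print(rezero(x, alpha=0.0))  # with alpha = 0 the layer is the identity
print(residual(x))           # same as rezero(x, alpha=1.0)
```

As I understand the ReZero paper, α starts at zero, so every layer begins as the identity map and the signal passes through unchanged at initialization.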
However, I haven't actually seen this used in practice.
The papers we have on Gemma and Llama all still seem to be using layer norms. Am I missing something?
With the idea that there is some subset of logic that sits below economics that is provable and exact. That is a powerful idea worth pursuing!
It's certainly an interesting perspective on the development of complex systems. The idea that an economy can be somehow overfitted to its own incentives and constraints I don't think is entirely new, cf the Beer Game. But as a general concept, it's certainly not something that usually finds its way into policy discussion, beyond some very specific talk about reshoring of certain critical industries.
However, I think the most important benefit of this perspective is going to be providing yet another counterargument against the Austrian economics death cult.
There was also something about lower state expenditures (...taxes...) giving better results for the people - that's the one that seems to be very popular with rich people for some reason. Go figure.
That, in my view, is a far too reductionist view of the problem. The problem isn't just about measurement, it's about human behavior. Unlike particles, humans will actively seek to exploit any control system you've set up. This problem goes much deeper than just not being able to measure "peace, love, puppies" well. There's a similar adage called Campbell's law [0] that I think captures this better than the classic formulation of Goodhart's law:
The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.
The mitigants proposed (regularization, early stopping) address this indirectly at best and at worst may introduce new quirks that can be exploited through undesired behavior.
But that’s only possible because the control system doesn’t exactly (and only) control what we want it to control. The control system is only an imperfect proxy for what we really want, in a very similar way as the measure in Goodhart’s law.
Another variation of that is the law of unintended consequences [0]. There is probably a generalized computational or complex-systems version of it that we haven’t discovered yet.
[0] https://www.sas.upenn.edu/~haroldfs/540/handouts/french/unin...
Start working with a nice, clean, fully relevant system, end up modelling that plus the whole range of adversarial perturbations from agents of pretty high complexity.
Well, agents will. If you created a genetic algorithm for an AI agent whose reward function was the amount of dead cobras it got from Delhi, I feel like you'd quickly find that the best performing agent was the one that started breeding cobras. In the human case and in the AI case the reward function has been hacked. In the AI case we decide that the reward function wasn't designed well, but in the human case we decide that the agents are sneaky petes who have a low moral character and "exploited" the system.
1. "Agents want to retain or increase their agency"
2. "Agents will subvert rules that decrease their agency"
3. "Agents seek resources to increase their agency"
This field needs to be studied, I think I need to apply for a grant (3rd law says so).
Which one is useful or descriptive will depend on the specific example.
Optimizing ML VS Optimizing a social media algorithm VS using standardized testing to optimize education systems.
There is no perfect abstraction that applies to these different scenarios precisely. We don't need that precision. We just need the subsequent intuition about where these things will go wrong.
There do happen to be citations for this question but I doubt any really clears an "indisputable evidence" standard. That's the nature of the field. Even if the whole discussion was evidence based and dotted with citations, we'd still be working with a lot of intuition and speculation.
The flexibility necessary to succeed in a real world requires a certain level of inefficiency.
My point is that error correction codes have a precise mathematical definition and have been deeply studied. Maybe there is a general principle at work in the wider world, and it is amenable to a precise proof and analysis? (My guess is that mileage may be made by applying Information Theory, as used to analyse error correcting codes.)
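As a toy illustration of the trade, here is the simplest error correcting code I know of, a 3x repetition code with majority-vote decoding. It is deliberately "inefficient" (three bits on the wire per message bit), and that redundancy is what lets it survive a flipped bit. The message and the position of the flip are arbitrary choices for the example.

```python
def encode(bits, n=3):
    """Repeat every bit n times (rate 1/n)."""
    return [b for b in bits for _ in range(n)]


def decode(coded, n=3):
    """Majority vote over each group of n received bits."""
    return [
        1 if sum(coded[i:i + n]) > n // 2 else 0
        for i in range(0, len(coded), n)
    ]


message = [1, 0, 1, 1]
sent = encode(message)      # 12 bits on the wire for a 4-bit message

received = sent.copy()
received[4] ^= 1            # a "shock": one bit flipped in transit

decoded = decode(received)
print(decoded == message)
```

A maximally efficient encoding (just send the 4 bits) would have delivered a wrong message after the same flip. Real codes such as Hamming codes get the same protection with much less overhead, which is roughly where the information theory comes in.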
All of the examples involve a bad proxy metric, or the flawed assumption that spending less improves the ratio of price to performance.
to quote wikipedia quoting Sickles, R., and Zelenyuk, V. (2019). "Measurement of Productivity and Efficiency: Theory and Practice". Cambridge: Cambridge University Press.
Offering that criticism without clarifying what efficiency measures in your opinion doesn't allow us to follow your viewpoint without us just taking your word for it. Needless to say this isn't considered good style in a discourse.
A 100 percent "efficient" system can be one that is overfitted to certain metrics, and it is the typical deadly sin of management to confuse metrics with reality and miss that their great numbers hollow out anything that makes a system work well and reliably. Because guess what: having 1 critical employee and working them like a mule is good when things work, but bad when they suddenly don't, because that second employee you thought was fat that could be cut was your fallback. In that case your metric of efficiency was slightly increased while another, less easy to quantify (and therefore often non-existent) metric of resilience went down significantly. This means if your goal was having an efficient and resilient company, but your metric only measured the former, guess what.
Same is true in engineering, where you can optimize your system so much to fit your expected problem that one slight deviation within the problem now stops the whole thing from working altogether (F1 racing car when part of the track turns out to be a sucky dirt road). Highly optimized systems are highly optimized towards one particular situation and thus less flexible.
Or in biology, where everybody ought to know that mixed woods are more resilient to storms and pests, while having great side effects for the health of the ecosystem, yet in pure economic terms it is easy to convince yourself the added efficiency of a monoculture is worth it economically, because all you look at is revenue, while ignoring multiple other metrics that impact reality.
It eventually becomes a bad proxy metric.
What it means is the objective can't be static - for example once satiated, you need to pick a different one to keep improving globally. Or do something else that moves the goalpost.
A corollary would be some other relation that can be deduced as a result of P implies Q, not simply a restatement of P implies Q.
(Using the discrete math definition of imply, not the colloquial definition of imply).
To a normal person, there are a lot of good proxy indicators of fitness. You could train sprinting. You could hop up and down. Squat. Clean and jerk, etc.
Running faster, hopping higher, squatting heavier... all indicators of increasing fitness... and success of your fitness training.
Two points:
1 - The more general your training methodology, the more meaningful the indicators. Ie, if your fitness measure is "can I push a car uphill," and your training method is sprinting and swimming... pushing a heavier car is a really strong indicator of success. If your training method is "practice pushing a car," then an equivalent improvement does not indicate equivalent improvement in fitness.
2 - As an athlete (say clean and jerk) becomes more specialized... improvements in performance become less indicative of general fitness. Going from zero to "recreational weightlifter" involves getting generally stronger and musclier. Going from college to olympic level... that typically involves highly specialized fitness attributes that don't cross into other endeavors.
Another metaphor might be "base vs peak" fitness, from sports. Accidentally training for (unsustainable) peak performance is another over-optimization pitfall. It can happen when someone blindly follows "line go up." Illusory optimizations are actually just trapping you in a local maximum.
I think there are a lot of analogies here to biology, but also ML optimization and social phenomena.
Not sure these are the best examples. I don't know of anyone who can C&J more than their body weight for multiple repetitions who isn't also an absolute terminator at most other meaningful aspects of human fitness.
Human body is one machine. Hormonal responses are global. Endurance/strength is a spectrum but the whole body goes along for the ride.
> Hormonal responses are global. Endurance/strength is a spectrum but the whole body goes along for the ride.
This is true, and that is why most exercise is a general good for most people, and has similar physiological effects. However, at some point "specialization" (term of art), kicks in. At that point, a bigger clean and jerk no longer equates to a longer shot put.
Fwiw... This isn't a point about exercise or how to exercise. Most people aren't that specialized or advanced in a sport and the ones who are have coaches. My point is that the phenomenon speculated to be broad in this post applies (I suspect) to physiology. Probably quite broadly. It's just easy to think about it in terms of sports because "training & optimization" directly apply.
This isn’t just nit picking exercises here. There are some measures to optimize for that lead to broader performance. They tend to be more complex and test all components of the system.
If you’re curious about GDP: if my car breaks and I get it fixed, that adds to GDP.
If a parent stays home to raise kids, that lowers GDP. If I clean my own house that lowers GDP. Etc.
Unemployment is another crude metric. Are these jobs people want or do they feel forced to work bad jobs?
The general discourse (news, politicians, forums, etc.) over a couple of measures will always be highly simplifying. The discourse over thousands of measures will be too complex to communicate easily.
I hope that at some point most people will acknowledge implicitly that the fewer the measures, the more probable it is that they are a simplification that hides stuff. (ex: "X is a billionaire, means he's smart"; "country X has high GDP means it's better than country Y with less GDP" and so forth).
But the larger the number of measures, the more free variables you have. Which makes it easier to overfit, either accidentally or maliciously.
Here's a rough outline for one proposed alternative to capitalism and the failed central planning alternatives of the past:
https://jacobin.com/2019/03/sam-gindin-socialist-planning-mo...
Some relevant snippets:
> Though planning and worker control are the cornerstones of socialism, overly ambitious planning (the Soviet case) and overly autonomous workplaces (the Yugoslav case) have both failed as models of socialism. Nor do moderate reforms to those models, whether imagined or applied, inspire. With all-encompassing planning neither effective nor desirable, and decentralization to workplace collectives resulting in structures too economically fragmented to identify the social interest and too politically fragmented to influence the plan, the challenge is: what transformations in the state, the plan, workplaces, and the relations among them might solve this quandary?
> The operating units of both capitalism and socialism are workplaces. Under capitalism, these are part of competing units of capital, the primary structures that give capitalism its name. With socialism’s exclusion of such private units of self-expansion, the workplace collectives are instead embedded in pragmatically constituted “sectors,” defined loosely in terms of common technologies, outputs, services, or simply past history. These sectors are, in effect, the most important units of economic planning and have generally been housed within state ministries or departments such as Mining, Machinery, Health Care, Education, or Transportation Services. These powerful ministries consolidate the centralized power of the state and its central planning board. Whether or not this institutional setup tries to favor workers’ needs, it doesn’t bring the worker control championed by socialists. Adding liberal political freedoms (transparency, free press, freedom of association, habeas corpus, contested elections) would certainly be positive; it might even be argued that liberal institutions should flourish best on the egalitarian soil of socialism. But as in capitalism, such liberal freedoms are too thin to check centralized economic power. As for workplace collectives, they are too fragmented to fill the void. Moreover, as noted earlier, directives from above or competitive market pressures limit substantive worker control even within the collectives.
> A radical innovation this invites is the devolution of the ministries’ planning authority and capacities out of the state and into civil society. The former ministries would then be reorganized as “sectoral councils” — structures constitutionally sanctioned but standing outside the state and governed by worker representatives elected from each workplace in the respective sector. The central planning board would still allocate funds to each sector according to national priorities, but the consolidation of workplace power at sectoral levels would have two dramatic consequences. First, unlike liberal reforms or pressures from fragmented workplaces, such a shift in the balance of power between the state and workers (the plan and worker collectives) carries the material potential for workers to modify if not curb the power that the social oligarchy has by virtue of its material influence over the planning apparatus, from information gathering through to implementation as well as the privileges they gain for themselves. Second, the sectoral councils would have the capacity, and authority from the workplaces in their jurisdiction, to deal with the “market problem” in ways more consistent with socialism.
> Key here is a particular balance between incentives, which increase inequality, and an egalitarian bias in investment. As noted earlier, the surpluses earned by each workplace collective can be used to increase their communal or individual consumption, but those surpluses cannot be used for reinvestment. Nationwide priorities are established at the level of the central plan through democratic processes and pressures (more on this later) and these are translated into investment allocations by sector. The sector councils then distribute funds for investment among the workplace collectives they oversee. But unlike market-based decisions, the dominant criteria are not to favor those workplaces that have been most productive, serving to reproduce permanent and growing disparities among workplaces. Rather, the investment strategy is based on bringing the productivity of goods or services of the weaker collectives closer to the best performers (as well as other social criteria like absorbing new entrants into the workforce and supporting development in certain communities or regions).
...
> No one paid greater economic homage to capitalism than the authors of The Communist Manifesto, marveling that capitalism “accomplished wonders far surpassing Egyptian pyramids, Roman aqueducts, and Gothic cathedrals.” Yet far from seeing this as representing the pinnacle of history, Marx and Engels identified this as speaking to a new and broader possibility: capitalism was “the first to show what man’s activity can bring about.” The task was to build on this potential by explicitly socializing and reorganizing the productive forces.
> In contrast, for Hayek and his earlier mentor von Mises, capitalism was the teleological climax of society, the historical end point of humanity’s tendency to barter. Hayek considered it a truism that without private property and no labor and capital markets, there would be no way of accessing the latent knowledge of the population, and without pervasive access to such information, any economy would sputter, drift, and waste talent and resources. Von Mises, after his argument that socialism was essentially impossible was decisively swept aside, turned his focus on capitalism’s genius for entrepreneurship and the dynamic efficiency and constant innovation that it brought.
> Despite Hayek’s claims, it is in fact capitalism that systematically blocks the sharing of information. A corollary of private property and profit maximization is that information is a competitive asset that must be hidden from others. For socialism, on the other hand, the active sharing of information is essential to its functioning, something institutionalized in the responsibilities of the sectoral councils. Further, the myopic individualism of Hayek’s position ignores, as Hilary Wainwright has so powerfully argued, the wisdom that comes from informal collective dialogue, often occurring outside of markets in discussions and debates among groups and movements addressing their work and communities.
Around a decade ago, the store installed anti-theft cages.
At first they only kept high-dollar items in the cages. It was a bit of an inconvenience, but not so bad. If a customer is dropping $200+ on some fancy power tool, he or she likely doesn't mind waiting five minutes.
But a few years later, there was a change - almost certainly a 'data-driven' change: suddenly there was no discernible logic to which items they caged and which they left uncaged. Now a $500 diagnostics tool is as likely to sit open on a shelf, as a $5 light bulb to be kept under lock and key.
Presumably the change is a result of sorting a database by 'shrinkage' - they lock up the items that cumulatively lose the hardware store the most money, due to theft.
But the result is (a) the store atmosphere reads as "so profit-driven they don't trust the customers not to steal a box of toothpicks" and (b) it's often not worth it for customers to shop there due to the waiting around for an attendant to unlock the cage.
I doubt the optimization helped their bottom-line, even if it has prevented the theft of some $3 bars of soap.
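The "sort the database by shrinkage" guess above can be sketched in a few lines. The inventory, prices, and theft counts are entirely made up; the only point is that ranking by cumulative loss (unit price times units stolen) can put a $5 item in the cage and leave a $500 item on the open shelf.

```python
# (name, unit_price_usd, units_stolen_per_year) - all hypothetical numbers
items = [
    ("diagnostics tool", 500.0, 2),
    ("light bulb", 5.0, 400),
    ("razor blades", 25.0, 120),
    ("riding lawnmower", 2000.0, 0),
]


def shrinkage(item):
    """Cumulative yearly loss to theft for one item."""
    _name, unit_price, units_stolen = item
    return unit_price * units_stolen


# Cage the top-N items by cumulative loss (N = 2 is an arbitrary cutoff).
ranked = sorted(items, key=shrinkage, reverse=True)
caged = [name for name, _, _ in ranked[:2]]

print(caged)
```

Note what the metric leaves out: nothing in this ranking accounts for sales lost because customers got tired of waiting for an attendant.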
(1) locate the aisle with the item I want to verify it is still in stock
(2) spend 5 minutes wandering around the aisles looking for a free attendant
(3) give up, press the 'service' button
(4) wait at the service desk for another 5 minutes until an attendant arrives to unlock the cage and hand me my $10 item
(5) have attendant inform me that he must immediately frogmarch me to the nearest self-checkout station, even though I had further items I wanted to buy. Otherwise, I would have a chance to secret the $10 item in my pocket, or to dash for the door?
(6) self-checkout and walk through the parking lot to place the item in my vehicle
(7) walk back to the store to buy my other items
(8) realize one of my other items is also locked in a cage
(9) go back to step (1) and repeat for a second item
I realize, as I type this, that the store has turned into the American equivalent of the British 'catalog merchant' except, in the UK, a catalog store gives you a numbered ticket, so the customer has, at least, a streamlined procedure and knows what to expect.
> I doubt the optimization helped their bottom-line
These seem to be in direct contradiction, unless you really think people have stopped going there because of it, to such an extent it outweighs the thefts. Especially when, if they stop going there, the competing local hardware superstore is probably doing the exact same thing. And remember, retail margins aren't usually huge -- for every item stolen, how many more do they need to sell to recoup the loss? Even if some people go to Amazon instead, it can still be worth it to avoid the theft.
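To put rough numbers on the "how many more do they need to sell" question, here is a back-of-the-envelope sketch. It assumes the loss from a theft is the item's cost and that each extra sale contributes price times margin; the function name and the margin values are assumptions chosen for illustration, not figures from any retailer.

```python
def sales_to_recoup(unit_price: float, margin: float) -> float:
    """Extra unit sales needed to cover one stolen unit.

    Assumes the loss is the unit's cost, price * (1 - margin), and that
    each additional sale contributes price * margin in profit.
    """
    if not 0 < margin < 1:
        raise ValueError("margin must be strictly between 0 and 1")
    unit_cost = unit_price * (1 - margin)
    profit_per_sale = unit_price * margin
    return unit_cost / profit_per_sale


# Illustrative margins only.
for margin in (0.05, 0.25, 0.50):
    extra = sales_to_recoup(10.0, margin)
    print(f"margin {margin:.0%}: {extra:.1f} extra sales per stolen unit")
```

Under these assumptions the answer depends only on the margin, as (1 - margin) / margin, so thin margins make each theft disproportionately expensive.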
It's much more likely that it has indeed had the biggest impact on reducing theft, and that your "discernible logic" simply doesn't have experience with these things -- that theft often isn't about item value, but rather about reliable resellability. A single expensive niche power tool takes a long time to resell; laundry detergent and razor blades can be unloaded in quantity the same day. People go through detergent and razors a lot faster than light bulbs.
I understand you dislike the inconvenience. But I really think you should be blaming the thieves or the factors behind theft, not the stores.
It is possible for a business to make money without customers actually liking the company: hey, it works for some of the FA*NG companies!
That said, there is something that feels 'off' about management obsessing over shrinkage to the point that the shopping experience begins to suck. It's not a truckstop or a drug store in a bad area... it's a hardware superstore.
With too much data, some manager can fixate on $3 screwdriver thievery and not think about the bigger picture: like shoppers finding the store to be a pain in the ass, and therefore no longer an attractive place to buy expensive riding-lawnmowers and floor jacks.
A store can quantify lower sales figures, but it may not be obvious that the lower sales were related to the choice of 'caged vs uncaged' inventory.
But again, I do not know. I only suspect.
I think that's where your misunderstanding might be.
Hardware superstores are extremely attractive targets for huge quantities of theft. You just seem to personally not be aware of it, and are under the mistaken impression that theft is something that happens mainly in truckstops in bad areas.
I think your suspicions are simply based on incorrect assumptions, both around the types of items that thieves steal, as well as the kinds of stores thieves target.
I was looking at Thinkpads and was somewhat shocked to see they started doing that too!
Free market is actually less efficient than direct control, but it is correspondingly more robust. This is evidenced in the big companies, which also sometimes try to control things in the name of "efficiency" and end up being quite inefficient. And also small companies, which are often competing and duplicating efforts.
The optimum (I hesitate to call it that because it's not well-defined; it is in some sense a society's choice) seems to be somewhere in the middle - you need a decent amount of central direction (almost all private companies have that) and redundancy (provided by investment funds on the free market).
(As an aside, despite being a democratic socialist, I don't believe democracy matters that much for economic development, but it is desirable from a moral perspective. You can have a lot of economic development under authoritarian rule; there are examples on both sides, as most private companies are also actually small authoritarian fiefdoms.)
- https://amp.theguardian.com/books/2010/aug/08/red-plenty-fra...
- https://chris-said.io/2016/05/11/optimizing-things-in-the-us...
Centralization does that, in general, not just in those countries.
There is a reason octopuses have sub-brains in their arms, and that some of our reflexes are controlled from neurons in the spine and not from all the way up in the brain, and why small army units have some autonomy.
If you want to make an optimal decision, you need to make it in a centralized fashion, in some form. But that also gives you a single point of failure.
Well, a lot of it was corruption. A sufficient level of corruption can destroy almost any system, even if it had a well-meaning leader at the top.
Do we have any reasonable datasets for before-and-after corruption levels in the FSU, or would that be a project which would* need another 3-5 decades to be viable?
* in the absence of sufficient well-placed cabbage?
EDIT: circumstantial, but chin-scratch-worthy: https://en.wikipedia.org/wiki/Glasnost#Opposition (Romania's post-communist transition was exceptionally** violent)
** here I count the stans as having suffered from preexisting violence
The trick is as always to find out the XY problem. What they really need may be way easier for you to implement than what they actually asked for.
If you are in the business of selling any product or service, then it's great that finding a way to make it cheaper also generates more demand for you.
If it turned out that the term actually started in tailoring before statistics really got its feet under it (which I absolutely cannot say that it did, just that trying to extrapolate backwards that sounds like a reasonable guess) then it wouldn't speak poorly of you if you hadn't also known that.
The reason I have this quibble is because the author says things like
>you should consider building formal (mathematical) bridges between results on overfitting in machine learning, and problems in economics, political science, management science, operations research, and elsewhere
If we are appropriately modest and acknowledge the fact that overfitting is well-studied by statisticians (although, obviously not in the context of deep neural networks), it seems kind of ridiculous to make statements like, economists and political scientists should consider using statistics?
I wonder why the author called it that when this seems to me clearly derived from Ross Ashby's Law of Requisite Variety [1], predating Goodhart by 20 years. As I see it, it is not even necessary to read more meaning into Goodhart than there actually is. Requisite Variety is sufficient. Going by his resume, I strongly assume the author knows this.
Russell Ackoff, building on countless others, put into two sentences what others needed two volumes for:
“The behaviour of a system is never equal to the behaviour of its parts. - It is the product of their interactions.”
Especially Systems Theory in its second manifestation (Maturana, Luhmann, von Förster, Glasersfeld - and Ackoff) is extremely powerful, deep and, for reasons beyond me, totally overlooked.
Have to say tho, most MBAs I encountered had sadly never heard of Cybernetics or Systems Theory. :-(
> Proxy: Capitalism
> Strong version of Goodhart's law leads to: Massive wealth disparities (with incomes ranging from hundreds of dollars per year to hundreds of dollars per second), with more than a billion people living in poverty
Please, show me a point in all human history when we had less than 90% of the global population living in poverty, pre-capitalism. Yes, there are 1 billion people (out of 8 billion) living in poverty today. But there were 2 billion (of 4.5 billion total) living in poverty as recently as 1980 (https://www.weforum.org/agenda/2016/01/poverty-the-past-pres...).
Poverty has been steadily going down (https://www.weforum.org/agenda/2016/01/poverty-the-past-pres...) for as long as we have had data. The first countries to get rid of recurrent famines were the same ones that first adopted capitalism. The same countries where the population started having higher expectations than living another day.
Paraphrasing Churchill about democracy, "[capitalism] is the worst economic system except for all other systems that have been tried from time to time".
Everything is about technology—stop letting economists drag you into stupid, poorly formed debates using undefined terms like “capitalism.”
The rise of capitalism (and there indeed is such a thing) meant the rise of systematic investment in capital. The idea of risking an investment in a machine that might have a ROI was new, and the idea that the extra money earned with that investment could be used to invest even more was radically new.
It's rich to claim that "China and Russia also ended famine by 50s and 70s". Each of them had a massive, historic famine as late as 1932 and 1960, and the famine was directly caused by their economic system, showing central planning to be a failure: it seems to work until the planners make some error. When that error happens, sooner or later, the failure is catastrophic and generalized. In fact, China partially abandoned central planning to rise from poverty in the 70s you mention as the end of China's famines, when Deng Xiaoping said that a socialist state could use the market economy without being capitalist (https://en.wikipedia.org/wiki/Socialist_market_economy). The new system means that China is politically socialist, but economically at least partially capitalist. For example, it meant that farms were no longer owned by the state. Politically they did some word juggling with "socialism", "market" and "capitalism", but in practice they just adopted capitalism as the main capital-allocating force. Central planning was abandoned in the early 90s, and they adopted Western-style planning (fiscal and monetary policies, with some industrial policies).
You play dirty when comparing China and USSR with "many [unnamed] countries in the 3rd world". We are trying to discover the best way to build a society. You choose your best example, I choose mine, and let's compare. It's a low blow that you can choose your best example and pit it against my worst (and probably not even representative). It highlights that you are not after the truth. You just hate capitalism and need to manipulate reality to "win" internet arguments.
You are making a strawman argument here. You claim that capitalism doesn't innovate or take risks, something that nobody claims! Here goes an example: oil cracking was invented in Imperial Russia in 1891. But due to a lack of capitalist institutions, there was no use for the gasoline: it was a dangerous refinery waste product. For 20 years they had an invention in a box that served no one. In 1913, the process was re-invented in the US and was immediately put to work toward humanity's advancement: gasoline was needed for cars. One invention calls for the other, all of them guided by profit seeking.
Capitalism doesn't claim to be the source of advances, inventions or discoveries. It claims to be the best known way to put those advances, regardless of who made them, to use. It claims to be the best capital allocator in existence.
This stems from the optimal load of self-balancing trees.
A little bit of slack is always useful to deal with the unforeseen.
And even a lot of slack is useful (though not always, as it is costly) as it enables you to do things that a dedicated resource cannot do.
On the other hand, no slack at all (so running at above 70%) makes a system inflexible and unresilient.
I would argue for this in any circumstance, be it military, be it public transit, be it funding, be it allocation of resources for a particular task.
1 - e^(-1) ~= 0.6321
As e^x is a commonly occurring curve and at that point its derivative goes below 1, meaning from that point on it's diminishing returns.
But you do not get good art by early stopping, you do not get it by injecting noise, you do not get it by regularization. All these do help and are essential to our modeling processes, but we are still quite far. We have better proxies than FID but they all have major problems and none even come close (even when combined).
We've gotten very good at AI art but we've still got a long way to go. Everyone can take a photo, but not everyone is a photographer and it takes great skill and expertise to take such masterpieces. Yet there are masters of the craft. Sure, AI might be better than you at art but that doesn't mean it's close to a master. As unintuitive as this sounds. This is because skill isn't linear. The details start to dominate as you become an expert. A few things might be necessary to be good, but a million things need be considered in mastery. Because mastery is the art of subtlety. But this article, it sounds like everything is a nail. We don't have the methods yet and my fear is that we don't want to look (there are of course many pursuing this course, but it is very unpopular and not well received. Scale is all you need is quite exciting, but lacking sufficient complexity, which even Sutton admits to be necessary). It's my fear that we get too caught up in excitement that we become blind to our limitations. Because it's knowing those limitations that is what gives us direction to improve upon. When every critique is seen as spoiling the fun of the party, we'll never be able to have anything better. I'm not trying to stop the party, in fact, I'm worried it'll stop.
In the process, acting somewhat like a generalization of the problem it describes: overly precise and narrow approaches to "improve" ineffable qualities. But the author seems to understand that - he comments on the absurdity of some direct transfers of ML methods to real world problems. I think he just added a bunch of not necessarily well solvable, but particularly suffering from "overfitting", example problems. It's a food for thought article, not a grand proposal.
Evolution also picked it up as "satiation" - eating ice cream feels good, however you can't keep eating 1 per minute; same with pretty much everything.
In art it means not hijacking everything for some local maximum.
But also fair point. I think we should all have a contingency plan in case of death, regardless of where our stuff is hosted. Self-hosted stuff indeed becomes a ticking time bomb after death. Even on 3rd party services, it's apparently a nightmare trying to get access to a deceased person's Google account, where Google Sites may live, etc.
"Ahh, if only I hyperoptimize all aspects of my existence, then I will achieve inner peace. I just need to be more efficient with my time and goals. Just one more meditation. One more gratitude exercise. If only I could be consistent with my habits, then I would be happy."
I've come to see these things as a hindrance to true emotional processing, which is what I think many of us actually need. Or at least it's what I need - maybe I'm just projecting onto everyone else.
Hell, even this settling for happiness as a side-product is a result of the judgement that this is the best we can do regarding the goal of happiness.
1: Healthcare efficiency is measured by "completed tasks" by primary care doctors, the apparatus is optimized for them handling simple cases and they thus often do some superficial checking and either send one home with some statistically correct medicine (aspirin/antibiotics) or punt away cases to a specialized doctor if it appears to be something more complicated.
The problem is that since there are now fewer of them (efficient), they've more or less become assembly line workers and have totally lost the personal "touch" with patients that would give them an indication of when something is wrong. Thus cancers, etc. are very often diagnosed too late, so even if specialized cancer care is better, it's often too late to do anything anyhow.
2: The railway system was privatized. Considering the amount of cargo shipped, it's probably been a huge success, but the system is plagued by delays because there are too few gaps in the system to allow late trains to speed up or to even do more than basic maintenance (leading to bigger issues).
>When a company grows big enough, they want to replicate their initial success. They all thought about the "process" of how the first success was created. So they replicate those "process" across the company. And before very long the people confused that the process was the content.
And you can fit that from small companies to the world's largest government. Most of them forgot about their content.
Perhaps the answer—as hippy sounding as it is—is to reduce the control of the system outright. Instead of adding more measures, more controls, which are susceptible to the prejudices of control, we let the system fall where it may.
This, to me, is a classic post of an academic understanding the failures of a system (and people like themselves in control of said system) but then not allowing the mitigation mechanisms of alternate systems to take its place.
This is one of the reasons I come to HN: to view the prime instigators of big-M Modern failure and their inability to recognize their contributions to that problem.
That's not right. The primary task of management is alignment.
Fair enough.. at least they think they can add value by improving efficiency.
> In machine learning (ML), overfitting is a pervasive phenomenon. We want to train an ML model to achieve some goal. We can't directly fit the model to the goal, so we instead train the model using some proxy which is similar to the goal
One of the pernicious aspects of overfitting is that it occurs even if you can perfectly represent your goal via a training metric. In fact it's even worse sometimes, as an incorrect training metric can indirectly help regularise the outcome.
The author seems to be discussing optimizing for the wrong metric. That's not a problem of too much efficiency.
Excessive efficiency problems are different. They come from optimizing real output at the expense of robustness. Just-in-time systems have that flaw. Price/performance is great until there's some disruption, then it's terrible for a while.
Overfitting is another real problem, but again, a different one. Overfitting is when you try to model something with too complex a model and end up just encoding the original data in the model, which then has no predictive power.
Optimizing for the wrong metric, and what to do about it, is an important issue. This note calls out that problem but then goes off in another direction.
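A minimal sketch of that definition, using only the standard library. Everything here is an assumption made for illustration: the underlying pattern `true_signal`, the noise level, the sample sizes, and the choice of a nearest-neighbour lookup as the "too complex" model that simply encodes its training data.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def true_signal(x: float) -> float:
    return 2.0 * x  # hypothetical underlying pattern


def sample(n: int) -> list[tuple[float, float]]:
    """Draw n noisy observations of the underlying pattern."""
    points = []
    for _ in range(n):
        x = random.uniform(0.0, 1.0)
        points.append((x, true_signal(x) + random.gauss(0.0, 0.5)))
    return points


def fit_line(data):
    """Ordinary least-squares line: a deliberately simple model."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(data):
    """1-nearest-neighbour lookup: the model is just the stored training data."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]


def mse(model, data) -> float:
    """Mean squared error of a model on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


train, test = sample(30), sample(1000)
for name, fit in (("line", fit_line), ("memorizer", fit_memorizer)):
    model = fit(train)
    print(f"{name:9s} train MSE {mse(model, train):.3f}  test MSE {mse(model, test):.3f}")
```

The memorizer reproduces its training points exactly, so its training error is zero by construction, while on fresh samples it reproduces the training noise as well. The simple line cannot hit zero on the training set but typically holds up better on new data.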
All metrics are wrong, some metrics are useful. Finding the useful one and then recognising when it ceases to be useful is the hard problem.
If we squint a little, focus on close/far-away instead of same/distinct and s/metric/model/g (because usage of a metric implies a model), we can see how close these things can be.
Optimizing for the wrong metric - becomes “using a wrong model”.
Excessive efficiency - is partially “using a wrong model”, or maybe “good model != perfect model”. We start with a good enough model, but after a certain threshold we get to experience the difference between “good enough” and “perfect” (apparently we care about redundancy, but it was not part of our model; so we were using a wrong model)
Overfitting is “finding the wrong model” (I wanted a model for the whole population, got a model only for a sample)
..or if we squint even more and go meta.. overfitting is part of “good model != perfect (meta)model” of modeling. (using sample data is good enough, but not perfect)
P.S. I liked the article. Choice of the title - not so much.
P.P.S. Simplicity of a model is part of meta-model.
Invented the first generative diffusion model in 2015. https://arxiv.org/abs/1503.03585
I'll also quibble with the example of obesity: the proxy isn't nutrient-rich food, but rather the evaluation function of human taste buds (e.g. sugar detection). The problem is the abundance of food that is very nutrient-poor but stimulating to taste buds. If the food that's widely available were nutrient-rich, it's questionable whether we would have an obesity epidemic.
Carbohydrate abundance was likely important in moving people out of hunger and poverty, but excesses of the same kind of diet are reflected in obesity.
My guess is that calorie-per-gram-per-dollar of carbohydrates is still lower than fat and protein.
The female peacock is using the male peacock’s tail as a proxy for fitness - with beautiful consequences, but the males with the largest, showiest tails are clearly less fit, and more prone to predation.
Or put another way: someone who wins the Olympic 100m sprint while hopping on one leg is a better runner than everyone else in the race by a wide margin.
The female peacock is using the male peacock’s tail as a proxy for fitness - with beautiful consequences, but the males with the largest, showiest tails are clearly less fit
In short, efficiency is fragile. If you want your thing to be stronger after a shock (instead of falling apart), you must design it to be antifragile.
Note: it's hard to build antifragile physical things or software, but processes and organizations are easier. ML models can be antifragile if they're constantly updating.
Not a goal for me, and not for evolution. Survival, health, prosperity, thriving and complexity rate higher. Not everyone makes it.
"HI! My name is Tracy! I'm going to be your server this evening!" as she flawlessly writes her name upside down in crayon on the paper tablecloth. Woah. I think this place needs to re-calibrate their flair.
I have been thinking of Goodhart's law a lot, but realized I had been leaning toward focusing on human reaction to the metric as the cause of it; but this reminded me it's actually fundamentally about the fact that any metric itself is inherently not an exact representation of the quality you wish to represent.
And that this may, as OP argues, make Goodhart's law fundamental to pretty much any metric used as a target, independently of how well-intentioned any actors are. It's not a result of human laziness or greed or competing interests; it's an epistemic (?) result of the necessary imperfection of metric validity.
This makes some of OP's more contentious "Inject noise into the system" and "early stopping" ideas more interesting even for social targets.
"The more our social systems break due to the strong version of Goodhart's law, the less we will be able to take the concerted rational action required to fix them."
Well, that's terrifying.
Randomly chosen deliberative bodies could keep some of the stupid polarization in check, especially if your chances of being chosen twice into the same body are infinitesimal.
https://en.wikipedia.org/wiki/Sortition
We tend to consider "democracy" as fundamentally equivalent to "free and fair elections", but sortition would be another democratic mechanism that could complement our voting systems. Arguably more democratic, as you need money and a support structure to have a chance to win an election.
An overemphasis on grades isn't from wanting to educate the population; obesity isn't from prioritizing nutrient-rich food; and increased inequality isn't from wanting to distribute resources based on the needs of society.
Living a well-lived life through culture, cooking, or exercise doesn't make you more susceptible to sensationalism, addiction, or gambling. It's a lack of stimulus that makes you reach for those things.
You can argue that academia enables rankings, industrial food production enables producing empty calories, and economic development enables greater inequality. But that isn't causation.
It also isn't a side effect when significant resources specifically go into promoting education as a private matter best used to educate the elite, that businesses aren't responsible for the externalities they cause, and that resources should be privately controlled.
In many ways, it is far easier to have more public education, heavily tax substances like sugar, and redistribute wealth than it is to do anything else. That just isn't the goal. It used to be hard to get a good education, good food, and a good standard of living. And it still is. For the same reasons.
However: 1) This exact scenario will likely never materialize 2) You have no good quantification of the scenario anyway due to noise/biases in measurements.
So now you have optimized for something very specific, and nature throws you something slightly different and you are completely screwed because your optimized solution is not flexible at all.
That is why a more “suboptimal” approach is typically better and why our stupid brains outperform super fancy computers and algorithms in planning.
Perhaps it is interesting to read his blogpost "Machine Learning has a validity problem" alongside this article.
[1] https://www.incontrolpodcast.com/
[2] https://archives.argmin.net/2022/03/15/external-validity/
The subtle difference between the two being exactly what the author describes: Goodhart's law states that metrics eventually don't work, Campbell's law states that, worse still, eventually they tend to backfire.
As just one example to make this point more concrete (LOL), the article mentions uncritically that "more complex ecosystems are more stable", but over half a century ago in 1973 Robert May wrote a book called "Stability and Complexity in Model Ecosystems" [2] explaining (very accessibly!) how this is untrue for the easiest ideas of "complex" and "stable". In more human terms, some ideas of "complex" & "stable" can lead you astray, as has been appearing in the relatively nice HN commentary on this article here.
Perhaps less shallowly, things go off the rails fast once you have both multiple metrics (meaning no "objective Objective") and competing & intelligent agents (meaning the system itself has a kind of intrinsic complexity, often swept under the rug by a simplistic thinking that "people are all the same"). I think this whole topic folds itself into "Humanity Complete" (after NP-complete.. a kind of infectious cluster of Wicked Problems [3]) like trust/delegation do [4].
[1] https://en.wikipedia.org/wiki/Self-organized_criticality
[2] https://press.princeton.edu/books/paperback/9780691088617/st...
> we provide necessary and sufficient conditions under which indefinitely optimizing for any incomplete proxy objective leads to arbitrarily low overall utility
> Our main result identifies conditions such that any misalignment is costly: starting from any initial state, optimizing any fixed incomplete proxy eventually leads the principal to be arbitrarily worse off.
[0]: https://proceedings.neurips.cc/paper/2020/hash/b607ba543ad05...
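A toy illustration of the flavour of that result. This is not the paper's model: the fixed `BUDGET`, the two attributes, and the particular utility function are all assumptions chosen to keep the example small.

```python
import math

BUDGET = 10.0  # hypothetical fixed resource split between two attributes


def true_utility(a: float, b: float) -> float:
    """What the principal actually cares about: it needs both attributes."""
    return math.sqrt(a) * math.sqrt(b)


def proxy(a: float, b: float) -> float:
    """Incomplete proxy: only attribute a is measured."""
    return a


def allocate(share_to_a: float) -> tuple[float, float]:
    """Split the budget, giving the stated share to attribute a."""
    a = BUDGET * share_to_a
    return a, BUDGET - a


for share in (0.5, 0.7, 0.9, 0.99, 1.0):
    a, b = allocate(share)
    print(f"share to a={share:.2f}  proxy={proxy(a, b):5.2f}  "
          f"true utility={true_utility(a, b):.2f}")
```

In this sketch the proxy keeps rising as more of the budget is pushed into the measured attribute, while the true utility peaks at an even split and falls to zero once the unmeasured attribute is starved.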
There's a reason why the best-selling self-help book for several decades now is the book by Stephen Covey entitled "The 7 Habits of Highly Effective People", not "Efficient People".
https://www.lesswrong.com/posts/yLLkWMDbC9ZNKbjDG/slack
Also, can't recall it but a long time ago I read a piece about how scheduling a system to 60% of its max capacity is generally about right, to allow for expected but unexpected variations (also makes me think of the concept of stochastic process control and how we can figure out the level of expected unexpected variations, which could give us an even better sense of what %-of-capacity to run a system at)
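One way to see why a figure in that range might be "about right" is the textbook M/M/1 queue. That model is an assumption brought in here for illustration (random arrivals, a single server, exponential service times), not something the piece I'm remembering necessarily used. In it, the mean number of jobs in the system is rho / (1 - rho) for utilization rho.

```python
def mm1_mean_in_system(utilization: float) -> float:
    """Mean number of jobs in an M/M/1 queue (waiting plus in service).

    Uses the standard result L = rho / (1 - rho), valid for 0 <= rho < 1.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization)


for rho in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99):
    print(f"utilization {rho:.0%}: mean jobs in system {mm1_mean_in_system(rho):.1f}")
```

Under this model the backlog grows slowly at moderate utilization and then blows up as utilization approaches 100%, which is the usual argument for leaving headroom. Real systems with burstier or more regular arrivals would shift the numbers.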
I don’t know if this phenomenon is aptly characterized as “too much efficiency”.
> when a measure becomes a target, it ceases to be a good measure
If you ask someone "could you give me an example" you will see that in the example the measure that becomes a target is already a proxy. Even in the example that the author presents, the school that cares a lot about testing its students... How does the school test its students? With exams. But that's already a proxy for testing students' knowledge...
But overall excellent article.
The "examples abound, in politics, economics, health, science, and many other fields" isn't a relationship between efficiency and outcome, but rather measuring and efficiency, or measuring and outcome. I think a better analogy is Heisenberg's uncertainty principle – the more you measure the more you (negatively) affect the environment you're measuring.
Robust systems minimize fault points. Efficient systems come at the cost of robustness, and vice versa given a fixed definition of what is being conserved, i.e. costs or energy.
For example, a four cylinder engine that gets 15mpg will have a longer life than one that gets 30mpg, given the same cost.
https://herbertlui.net/slack-tom-demarco-summary/
“People under time pressure don’t think faster!”
Head and hands need a mediator. The mediator between head and hands must be the heart! - movie "Metropolis"
The ideal choice would be a random number generator, but lacking that, he would want to inject the greatest dose of entropy available into the system.
Proxy: minimizing execution time of hot loops
Strong version Goodhart's: applications get incredibly bloated and unresponsive
The basic problem is stupid simple. Optimizing a process for one specific output necessarily un-optimizes for everything else.
Right now much of commerce and labor in the United States is over-optimized for humans because tech businesses are optimizing for specific outcomes (productivity, revenue, etc) in a way that ignores the negative impacts on the humans involved.
The optimizations always turn into human goals, eg my manager needs to optimize for productivity if they want a bonus (or not get optimized out themselves), which means they need to measure or estimate or judge or guess each of their employees’ productivity, and stupid MBA shit like Jack Welch’s “fire the lowest 10% every year” results in horrible human outcomes.
Sure there are people who need to be fired, but making it an optimization exercise enshittified it.
Same for customer service. Amazon wants to optimize revenue. Customer service and returns are expensive. Return too many things? You’re fired as a customer.
Call your mobile provider's customer service too often? Fired.
Plus let’s not staff customer service with people empowered to do, well, service. Let’s let IVRs and hold times keep the volumes low.
All anecdotes but you’ve experienced something similar often enough to know it is the rule, not the exception, and it’s all due to over-optimization.
---
> Goal: Healthy population
> Proxy: Access to nutrient-rich food
> Strong version of Goodhart's law leads to: Obesity epidemic
I'm not sure I believe this one. Exactly whose target is "access to nutrient-rich food" and how would removing that target fix the US obesity epidemic? Is "nutrient-rich" a euphemism for high-calorie? My understanding is that there are plenty of places with high-nutrient food but different norms and much better health (e.g. Japan).
We can and do measure population health (including obesity); this isn't a proxy for an unmeasurable thing.
---
> Goal: Leaders that act in the best interests of the population
> Proxy: Leaders that have the most support in the population
> Strong version of Goodhart's law leads to: Leaders whose expertise and passions center narrowly around manipulating public opinion at the expense of social outcomes
Is this really a case of "overfitting from too much data"? Or is this just a case of "some things are hard to predict?" Or even, "it's hard to give politicians incentives." It'd be interesting if we gave presidents huge prizes if the country was better 20 years after they left office.
---
> Goal: An informed, thoughtful, and involved populace
> Proxy: The ease with which people can share and find ideas
> Strong version of Goodhart's law leads to: Filter bubbles, conspiracy theories, parasitic memes, escalated tribalism
Is "the goal" really a thoughtful populace? Because every individual's goal is pleasure, and the companies' goals are selling ads. So I don't know who's working on that goal.
This whole thesis easily tips over into a semantic gobbledygook, as efficiency is not a property of the larger world, but an utter contrivance of thought.
Focus on anything to the exclusion of everything else and things go wrong. How has the obviousness of this observation turned into a breakthrough? AI is the perfect nexus for such a discovery: trying to optimize a system when you don't understand how it works naturally has pitfalls.
So what can it mean to try to mathematically formalize a misunderstanding? Maybe there's a true breakthrough lurking near this topic: that all understanding is incomplete, so look for guiding principles of approximation?
The author is right to call out the forest for the trees.
—
Web Design: The First 100 Years
https://idlewords.com/talks/web_design_first_100_years.htm
How the SR71 Blackbird Works
That said, the post is still valuable and would work much better with a framing closer to "some analogies between statistical analysis and public policy" -- the rest of the post (all the political recommendations) is honestly really solid, even if I don't see a lot of the particular examples' connections to their analogous ML approaches. The creativity is impressive, and overall I think it's a productive, thought-provoking exercise. Thanks for posting OP!
Now, for any fellow pedants, the philosophical critique:
more efficient centralized tracking of student progress by standardized testing
The bad part of standardized testing isn't at all that it's "too efficient", it's that it doesn't measure all the educational outcomes we desire. That's just regular ol' flawed metrics.
This same counterintuitive relationship between efficiency and outcome occurs in machine learning, where it is called overfitting.
Again, overfitting isn't an example of a model being too efficacious, much less too efficient (which IMO is, in technical contexts, a measure of speed/resource consumption and not related to accuracy in the first place). Overfitting on your dataset just means that you built a (virtual/non-actual) model that doesn't express the underlying (virtual) pattern you're concerned with, but rather a subset of that pattern. That's not even a problem necessarily, if you know what subset you've expressed -- words like "under"/"too close" come into play when it's a random or otherwise meaningless subset.
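To make that concrete, here's a minimal sketch with a toy setup of my own (nothing here is from the article): the "underlying pattern" is assumed to be y = 2x + 1 plus Gaussian noise, `fit_line` is an ordinary least-squares line, and `fit_memorizer` is a 1-nearest-neighbour lookup that just replays the training sample. All the names and numbers are made up for illustration.

```python
import random

def make_data(xs, rng, noise=1.0):
    # Assumed underlying pattern: y = 2x + 1, observed with Gaussian noise.
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, noise)) for x in xs]

def fit_line(data):
    # Ordinary least squares for a single input variable.
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(data):
    # 1-nearest-neighbour: return the label of the closest training point.
    # It reproduces the training sample exactly, noise included.
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def mse(model, data):
    # Mean squared error of a model over (x, y) pairs.
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_data([i * 0.1 for i in range(200)], rng)
test = make_data([i * 0.1 + 0.05 for i in range(200)], rng)

line = fit_line(train)
memorizer = fit_memorizer(train)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(f"{name:9s} train MSE = {mse(model, train):.3f}  "
          f"test MSE = {mse(model, test):.3f}")
```

The memorizer gets zero error on the training sample because it expresses that sample rather than the pattern behind it, and it does worse than the plain line on fresh points drawn from the same pattern. Nothing about it is "too efficient"; it just models the wrong thing.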
I'm not allowed to train my model on the test dataset though (that would be cheating), so I instead train the model on a proxy dataset, called the training dataset.
I'd say that both the training and test sets are actualized expressions of your targeted virtual pattern. 100% training accuracy means little if it breaks in online, real-world use.
When a measure becomes a target, if it is effectively optimized, then the thing it is designed to measure will grow worse.
I'd take this as proof that what we're really talking about here is efficacy, not efficiency. This is cute and much better than the opening/title, but my critique above tells me that this is just a wordy rephrasing of "different things have differences". That certainly backs up their claim that the proposed law is universal, at least!
These ones:
Goal: Educate children well Proxy: Measure student and school performance on standardized tests Strong version of Goodhart's law leads to: Schools narrowly focus on teaching students to answer questions like those on the test, at the expense of the underlying skills the test is intended to measure
--- Goal: Rapid progress in science Proxy: Pay researchers a cash bonus for every publication Strong version of Goodhart's law leads to: Publication of incorrect or incremental results, collusion between reviewers and authors, research paper mills
--- Goal: A well-lived life Proxy: Maximize the reward pathway in the brain Strong version of Goodhart's law leads to: Substance addiction, gambling addiction, days lost to doomscrolling Twitter
--- Goal: Healthy population Proxy: Access to nutrient-rich food Strong version of Goodhart's law leads to: Obesity epidemic
--- Goal: Leaders that act in the best interests of the population Proxy: Leaders that have the most support in the population Strong version of Goodhart's law leads to: Leaders whose expertise and passions center narrowly around manipulating public opinion at the expense of social outcomes
--- Goal: An informed, thoughtful, and involved populace Proxy: The ease with which people can share and find ideas Strong version of Goodhart's law leads to: Filter bubbles, conspiracy theories, parasitic memes, escalated tribalism
--- Goal: Distribution of labor and resources based upon the needs of society Proxy: Capitalism Strong version of Goodhart's law leads to: Massive wealth disparities (with incomes ranging from hundreds of dollars per year to hundreds of dollars per second), with more than a billion people living in poverty
---
I will start:
Goal: Leaders that act in the best interests of the population
Good proxy: Mandate that local leaders can only send their kids to the schools in their precinct. They can only take their families to the hospitals in their precincts.
1. That the only input to the system is cost/money (or proxies of that, like compensated human time). Put another way: That the model you're working with is perfectly liquid, and you don't need to worry about fundamental supply constraints.
2. That the loss is truly loss, and there aren't knock-on effects from that loss, which might range from being generally beneficial and good to actually being somewhat responsible for the output metric, in which case your model is measuring the wrong thing.
3. That the output metric correctly and holistically proxies for the real-world outcomes you desire.
Using the example from the article on standardized testing: A school administration might make an efficiency argument by comparing dollars spent to standardized test scores.
* Dollars aren't the only input to this system, however; two other major ones are the quality of teachers and the home life of the students. Increasing the spend of the system might do nothing to standardized test scores if these two qualities can't also be improved (you might make the argument that increasing dollars attracts better teachers, and there's some truth to this, but generally (even in tech) these two things just aren't strongly correlated; many organizations have forgotten what it even means to be "good at your job" and how to screen for quality in interviews. When organizations lose that, no amount of money can generate good hires because the litmus test doing the hiring is bad).
* "Loss" in this system might be an increase in funding without a proportional increase in test scores, which does not account for spending money on extracurriculars like music, art, and sports; all generally desirable things we believe money should be spent on (isn't it interesting that we call these things "extra"curriculars?).
* Even if a school administration can apply this model to increase test scores, increasing test scores might not be an outcome anyone really wants. As the article says, all that guarantees is a generation of great test-takers. Increasing college acceptance rates? We've guaranteed a generation of debtors and bad degrees. Turns out, it's impossible to proxy for the real-world thing you want in a way that can be measured on a societal level.
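A rough sketch of the first bullet, under assumptions that are entirely mine: each input is scaled to 0..1, the score is capped by whichever input is weakest, and the $20,000 saturation point and the 0.6/0.7 quality values are invented for illustration. `predicted_score` and `points_per_thousand_dollars` are hypothetical names, not anything a real administration computes.

```python
def predicted_score(dollars_per_student, teacher_quality, home_life):
    # Hypothetical model: funding helps only up to the level the other
    # inputs allow. Funding is assumed to saturate at $20,000 per student.
    funding = min(dollars_per_student / 20_000, 1.0)
    return 100 * min(funding, teacher_quality, home_life)

def points_per_thousand_dollars(dollars_per_student, teacher_quality, home_life):
    # The naive "efficiency" metric: test-score points bought per $1k spent.
    score = predicted_score(dollars_per_student, teacher_quality, home_life)
    return score / (dollars_per_student / 1_000)

for dollars in (8_000, 12_000, 16_000):
    score = predicted_score(dollars, teacher_quality=0.6, home_life=0.7)
    eff = points_per_thousand_dollars(dollars, teacher_quality=0.6, home_life=0.7)
    print(f"${dollars:>6,}: score = {score:.0f}, points per $1k = {eff:.2f}")
```

In this toy model, once teacher quality becomes the binding input, extra dollars leave the score flat and the dollars-to-score ratio simply gets worse, which an efficiency argument would read as waste even though money was never the constraint.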
All of this is really just a symptom of the "financialization of everything", which has been talked about endlessly. Particularly relevant to this discussion: society has broadly forgotten what the word "service" means. We assume that public transit in your city must be a capitalistic enterprise with its own efficiency metric that must be internally positive, because the broader positive efficiency impact that the public transit network has on the people and businesses in the city, and thus on municipal tax income, is too complex to account for within a more unified economic model.