We are living in a totally bonkers time.
(If you are a VP at Amazon, yes, I'll consider acquisition offers. I'm also working on an enterprise version of this with additional features.)
Show HN here: https://news.ycombinator.com/item?id=48151287
This is the same thing.
Additional story points completed per week, versus token-dollar spent, or some such combo would seem more sane.
But maybe they aren’t really tracking productivity, so tracking tokens is all they have? … I dunno which part of that is dumber.
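For what it's worth, the kind of ratio described above could be sketched roughly like this. All names and numbers are hypothetical, and it assumes you already have a pre-AI baseline for weekly story points, which is a big assumption in itself:

```python
from typing import Optional


def points_per_token_dollar(
    baseline_points: float,  # hypothetical: avg story points/week before AI tooling
    current_points: float,   # hypothetical: story points completed this week
    token_dollars: float,    # hypothetical: token spend for the same week, in dollars
) -> Optional[float]:
    """Additional story points completed per dollar of token spend."""
    if token_dollars <= 0:
        # No spend means the ratio is undefined rather than infinitely good.
        return None
    return (current_points - baseline_points) / token_dollars


# Illustrative numbers only: 6 extra points for $300 of tokens.
ratio = points_per_token_dollar(baseline_points=20, current_points=26, token_dollars=300)
print(ratio)
```

Unlike raw token count, this goes negative when output drops while spend goes up, so burning tokens for its own sake makes the number worse, not better. Story points are still gameable, of course.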
It's their money. They want to do stupid things? So be it.
That's it? I've seen people that are consistently putting out four PRs per day. I don't/can't even code review them. So much of what we do is now just rubber-stamping PRs. We were even told that we shouldn't be writing code by hand anymore.
It's a tractors on farms kind of moment.
We have always been living in bonkers time.
Turns out the price I saw in the booking portal isn’t actually what Amazon paid. It’s kinda more like a rack rate listing. But then there’s all kinds of discounting/cash back that happens on the backend based on the amount of travel booked each month.
Some companies might just have been scammed by the marketing that told them that AI would make all their employees 10,000x more productive and save them billions and when that didn't happen the assumption was that it's because employees weren't using the magical AI as often as they should be.
Other companies, especially those working on their own AI products, might want employees to use AI as much as possible because they hope it will provide them with the training data they'll need to eventually replace most or all of those employees with the AI. Punishing workers who refuse to train their AI replacement might make sense to them because even though it's costly right now they expect the savings down the road to be much much greater.
And the fact that it is an industry-wide meme at this point makes bright red flashing lights and klaxons go off in my mind that a catastrophic reckoning can't be too far off. There's not enough money in the world to keep this up for too long.
I worked for an international (mothership in the UK, later acquired by the US) company, which had... sort of a similar policy.
So, the (mothership) company acquired a lot of satellite companies, all in the banking business. All over the world. Then they figured out their CEO was corrupt; he got into trouble with the law and got kicked out. While they were waiting for the new "real" CEO to step in, they let an "interim" CEO take his place.
The new (interim) CEO didn't seem to have a clue about the business she was supposed to run, nor did she care. She knew her time was running out, and she figured she'd spend it traveling the world and partaking in fine dining in every corner of the world the company's tentacles could reach. But, to make it seem more plausible, she, sort of, created a policy of "experience exchange", which sent random troupes of select individuals from different branches of the company to "exchange experience" with another similarly randomly assembled troupe. Of course, the company picked up the bill when it came to lodging and dining.
Our inconsequential branch in Israel saw a pilgrimage of high-ranking banking managers from all over the world, but, mostly the wealthier parts of it. Some didn't even bother to show up in the office though, and proceeded straight to the banquet hall of the most expensive hotel on the Tel Aviv beach.
To be fair though, the interim CEO got the boot even before her time was supposed to end, but it was serendipitously close to the acquisition by the US company, and so she was let go as part of a "restructuring" and "optimization"... but it was a crazy year!
Managers love metrics. Bad managers particularly love metrics. Tokens used was almost the obvious bad metric that was going to be used.
I would argue that tokens used has actually exposed a useful metric: any manager who focused on this, demanded this or ranked based on this should be fired, for being a bad manager.
[1]: https://evan-soohoo.medium.com/did-elon-musk-really-fire-peo...
No, I'm not talking about the engineer who can point to significant contributions outside of code: writing technical specs, leading architecture discussions, etc. I'm talking about the ones who just say they're just coding, but are actually not working at all.
TL;DR LoC and commit count etc can be used only to flag for review likely cases of quiet quitting.
So if AI screws something up and re-writes it and then screws it up again, needing another re-write, that counted as more positive than if it was done correctly, and simply, the first time.
It’s more like bragging about compiler cycles spent.
Negative 2000 Lines of Code
Incompetent use of a coding agent, or just general shenanigans, can burn tokens all day but it's not going to get tickets done.
Just looking at the work output - how many story points, tickets, how many new bugs are opened, etc. has not become any less relevant a metric for productivity with AI. If you're a skilled and proper user of AI those numbers would be changing in the right direction, compared to before you had it.
If some guy decides to spend a bunch of money bringing AI tools into the company things might get very uncomfortable for him if they're seeing zero return on that investment. He's sure not going to get recognition and a massive bonus for it. If on the other hand, he can put some numbers in a spreadsheet or powerpoint showing that employees are using AI all the time and profits are up again this quarter, maybe he can take some credit for that or at least keep his boss or the company's shareholders from questioning the wisdom of dumping so much cash into those AI products.
One reason it works out like that for travel funding is that it’s often the ‘use it or lose it’ kind of funding. If you do not use all of the funds allotted, you can’t ask for more and could realistically get less.
I'm actually a little curious about how long it has been. Bad managers have always prioritized irrelevant metrics, of course, but I have a feeling (backed by no data, just vibes) that management in general crossed a point of no return as soon as "data-driven" became a cross-industry buzzword.
Like, I vaguely remember a time when consumer interactions didn't always come with a request to fill out a survey (with the results getting turned into a number and fed into a dashboard somewhere). And then that changed, and now everything must be turned into a number and that number must go up.
Also, don't forget that their datacenters will burn our electricity and boil our rivers at rates much cheaper than what we are billed in our homes. So while you're happy generating mountains of AI slop, somewhere there is a datacenter boiling a river.
I'd compare this to a new patented formula of water that nobody asked for, and the patent owners are trying to replace all water supply with their crap before we wake up.
>For example, IBFAN claims that Nestlé distributes free formula samples to hospitals and maternity wards; after leaving the hospital, the formula is no longer free, but because the supplementation has interfered with lactation, the family must continue to buy the formula.
I feel like that’s a better analogy. Some charlatans are buying fake tickets, but as a manager who wants to win big, I’m ok with some chicanery so long as the average person is trying to honestly meet my directive.
Two years ago everyone would have told you that 'impact' was the way to measure people, and been aghast at tracking inputs like hours. Say what you will, but at least showing up at 8 didn't cost the company money. Today I see people spending time and money vibe coding tools in search of a problem, just to spend tokens and demonstrate that they're on board with the singularity.
Incentivizing people who are already using AI to use as many tokens as possible does seem a little crazy, though.
Users attest to higher productivity and point to material but intermediate factors like token use, generated lines of code, pr counts, etc, but there doesn't seem to be a convincing revolution in the quantity or quality of mature software being delivered.
Combine those puzzling impressions of outcomes with a sense, for many, that they don't feel like they have a personal problem that warrants a new tool, and you end up with a pretty earnest and defensible indifference.
To get holdout engineers using AI, the industry needs to be focused on demonstrating relatable workflow improvements and demonstrating practical improvements to finished work product. Instead, policies like token use incentives just rely on luring them into pulling the slot machine handle with the expectation that once they do, they'll join the cadre of other converts who justify their transition with subjective improvements and intermediate metrics.
Not just coding, but things like "here is my team's mandate, go through all my company's slack channels, linear tasks, notion pages, and recent merges in git, summarize any work other teams are doing that intersects with my team's work."
That'll burn a lot of tokens.
Set that up to run once or twice a week and give a report.
I had a manager like this once. He didn't last very long, but it was without a doubt the most fun six months of my career.
AI is just the next tool to overspend on in poor ways, realise it’s shit and spend a ton more money trying to roll it back.
The situations where it shines will continue to use it when the hype dies down.
The "Big data!" calls though did make more sense coming from executives, obviously it was also often dumb, but it was a lot easier for them to understand the results. However most companies would have been better off waiting 5-10 years before jumping into it as a lot of money was wasted on processes and tools that are completely outdated today.
"Because we FEEL this will make you more productive and we will make more money!"
No evidence but more Lines of Code...
Note that it has beaten capitalism: making rational choices to increase earnings has lost to this AI dream.
That's basically how it seems to be with AI. Just replace "spent X fighting terrorism" with "spent X implementing AI workflows" or "invested X in AI" or whatever. Nobody actually knows or cares just how far the dollars are going.
At one point seemingly out of nowhere he pointed out on his screen share "Look at how many tokens I've used this month. I run so much Opus." It was a number that was offensively large.
I remember thinking "That's a really odd flex, this crap is so expensive the fact that you use so much should be a red flag"
He demonstrated a number of Claude Code use cases he had to manage and tweak AWS infrastructure that made me, the old greybeard sysadmin older than the internet, think "You've used AI to do something that was a single command."
So this story makes sense. They were being encouraged to just blast away at it six plus months ago.
But if you hit "tab" it'll claim that as an AI-edited line, LOL.
(A lot of the rest of it is stuff I could already have been doing just as fast if I'd ever bothered to learn to use multiple cursors, learned vim navigation, or set up some macros—I never did because my getting-code-on-the-screen speed without those has never been slow enough to hold anything up, in practice)
As time passes and the layers of abstraction pile up, later generations won't understand the underlying layers of the abstraction. This is a huge weakness in our systems development -- and a huge potential attack surface for adversaries.
Probably there is no dichotomy going on and it depends on multiple factors, but it seems so weird to see reports that are so different between each other.
If you are making extremely specific, high quality products over a long time window and your founders are deeply experienced in that field of engineering, then no, you don't need agentic engineering and probably want very little llm code in general (outside of some boilerplate, internal toolings, etc).
This is work related. So you can't expect everyone to have the same input demands or output expectations.
> Probably there is no dichotomy
It's literally staring you in the face.
Yes, and that’s a good thing! This is in fact where a lot of AI value lies. You don't need to know that command anymore - knowing the functional contract is now sufficient to perform the requisite work duties. This is huge!
Of course I lose about as much time as I save to its fuck-ups, so I'd still have been better off learning to actually use a text editor properly. Though (as I mentioned in another post) part of why I've never done that in 25ish years of writing code for pay is that my code-writing speed has never been too slow for any of the businesses I've worked in, i.e. other things move slowly enough it never mattered.
I find it hard to read "You can do things without knowing things" as a positive improvement in work, society, life, anywhere
I use the shit out of opencode to do things as a force multiplier, not as a way to keep me from knowing what it's doing.
The point at which we're optimizing for "we don't need to know that anymore" is the point at which everything blows up, because agentic work is not fully deterministic, models hallucinate even simple things.
Blindly relying on your agent weapon of choice to just do the right thing because you didn't take the time to understand how the lego fits together is an actual problem.
Quite frankly it was embarrassing. We've had tools for static analysis for ages. Use them.
Someone with better knowledge could work 100x faster using 100x fewer resources. They did it the slow, expensive way but at least didn't have to think? Odd flex.
A coworker created a shared Claude Code skill in our repo.
It's obviously something that can be done as a python or bash+jq script and run deterministically.
Instead we use natural language and waste tokens for that.
This reminds me of the story of how the USSR nearly made whales extinct to meet a quota for whale meat that nobody wanted to eat.
How are we sliding face first into “snowpiercer but dumber”?
> just for the sake of using the tokens
capital requires growth, forests be damned
The problem with not burning tokens is when you don't meet the performance KPIs, get labelled a Luddite and off you go, even before the job gets taken over by AI.
I do agree with the sentiment, that and war mongers destroying the planet.
I see it a lot and assumed it was concern trolling from plastic manufacturers or libertarians funded by them but you seem genuine.
Have you just fallen for that concern trolling? Grown so cynical that nothing matters anymore? I don't understand the intention if you have a genuine desire to improve society.
What would we be doing differently in a world where we were still using plastic straws? Would that have freed up enough mental energy for a revolution? Would people be blowing up private jets while sipping their diet coke?
The USSR barely accounted for 15% of the world's catch (with Japan as the leader).
> that nobody wanted to eat
unsubstantiated.
Agreed for USSR, but I think the person you replied to is misremembering the country, I believe they are thinking of Japan. I heard it recently on Stuff You Should Know, which usually does a good job of researching their stories, and it sounds like it is substantiated but may be a bit more complex than presented, but literally true.
https://podcasts.apple.com/us/podcast/save-the-whales/id2789... https://theworld.org/dispatch/news/regions/asia-pacific/japa... https://theworld.org/stories/2019/04/16/whaling-japan-2 https://japantoday.com/category/national/75-of-meat-from-jap... https://www.traffic.org/site/assets/files/3994/whale_meat_tr...
Luckily I work in app management and I know they can only see the last date used so if I just put in one query per day I'm good.
But I'm so sick and tired of this AI hype :(
Now, they might be; they've certainly used silly metrics in the past (LoC, commit count, etc.) without ever fully acknowledging it. But I don't believe that it's as simple as more tokens = more better.
We have token tracking dashboards that leadership is looking at. I know because they show us in these manager meetings. Haven't opened them to everyone yet as some kind of leaderboard, so at least that's nice.
Lots of rumors token spend will be involved in perf reviews. Leadership denies it... but then holds more meetings telling us how important it is to increase our token spend and discussing inadequacies from the token spend dashboards.
People in FAANG likely worked hard to get in there or lucked out or some combination of both. I feel like my soul would be crushed if I hacked away at Leetcode for months on end just to babysit and gaslight some algorithm into asymptotically following my instructions.
The problem explodes at any company that puts up a token use leaderboard or hints that they might do layoffs for engineers that refuse to use AI tools. This triggers a race to use as many tokens as possible to stay ahead.
Anecdotally, the problem is worst among devs who read a lot of social media. Twitter, Threads, Mastodon, LinkedIn, and others are filled with recycled viral stories about companies going AI-native and firing people who don't use enough AI. Anxieties are high right now so nervous developers see this and think they must burn tokens faster than their peers to avoid an inevitable culling.
For stuff that could easily be done as shell scripts, we get asked how we could make an agent out of it.
I'm kidding, of course... but human stupidity is infinite, so...
Congratulations!
Big companies have thousands of leaders. Many good, many bad.
They're using tokens for pointless stuff right now in order to figure out use cases where it helps. You can't do that without also learning where it doesn't help.
My company is doing the same thing.
One person I've talked to has someone in their org who is running GasTown and chews through tokens 24/7. They don't contribute very much, but they're comfortably in the #1 spot.
But the thing is, the problem is the person, not the technology. He was already like this before LLMs. He would "refactor" repos into smaller repos, and all of a sudden all of the code has his name. If you just skim, it looks like he built a huge chunk of the codebase in the company. He also has a history of saying no to stuff I want to do, then he does it himself. He also nitpicks my PRs to no end (or straight says he doesn't think we should do that thing) and then he turns around and implements it himself. He doesn't copy paste my code, but he does re-implement himself the same ideas that he just said no to after my PR was open. Very smart guy, very dishonest. But he's good at being dishonest. If you ask him about it he says "oh I just thought that this way would be more organized" or something like that. From the outside you could make the argument that one way is better than the other (for reasons I would claim are irrelevant), so it's not obvious that he's being dishonest. But since I see 100% of what he does, it's entirely clear to me that this is a pattern.
EDIT: just remembered another one. One time I asked him to take a specific week of holidays. He didn't say "no" but he did mention that we're under a lot of pressure to deliver The Thing, and asked if I would delay my holidays. I said "No, I'm not going to delay them", so he approved it. Then when the time came around, he took holidays in the same week. On this one I didn't challenge him, I already know him well enough to know the truth, which is he's not ashamed to ask from others things that he would never himself accept.
This is analogous to measuring productivity by LoC output.
True, but it looks like productivity to people whose own productivity is measured by how busy their subordinates appear to be.
But, I can't figure out how to put my job back into command mode :-(
[1] https://locusmag.com/feature/cory-doctorow-full-employment/
This isn't like that, as it isn't funded through taxes. This is private companies experimenting with their money, and risking downstream cost increases that may cause people to go elsewhere, as they do when they try anything new.
This is much better than just funding people regardless of productivity through forced taxes.
[0] https://nintil.com/the-soviet-union-achieving-full-employmen...
Choosing to wait for the PIP instead, if $EMPLOYER goes this way. Tell me the work I'm not doing and how pieces of ~~flair~~, sorry, tokens might help. Or don't, I don't care.
For companies doing this there is no 'justify the expenditure'. Employees are being praised for high expenditure, regardless of actual outcome.
Leadership see the problem as 'people resisting AI'. Embracing AI is seen as the solution, and token usage is seen as the measure of success.
Slack will start serving porn next.
If I own part of a company, and I spend money on their goods, and as a result their revenues climb and consequently my valuation does too - then my firm value will be higher.
This would also explain the gung-ho approach. Some pretty devious financial engineering akin to arbitrage
Have heard very similar stories to what the article describes. There were also outright revolts from tech folks being forced to use Amazon’s own shit self-built AI vs Claude Code and other top-tier products.
Given Amazon’s early start with Echo and Alexa they should have absolutely dominated this AI revolution but have been scrambling in a panic ever since ChatGPT showed up on the scene and always seem two steps behind the market.
It all paints a picture inside Amazon of clueless leaders at the top and mobs of others below them just gaming the system so a silly dashboard looks green. “Day 2” has arrived.
If you can't change your company, change your company!
I think the company realizes this and is actively trying to avoid this, since for the new tools there isn't a leaderboard.
Burn resources at all costs to appear productive and use proxy metrics to measure success.
Fire productive employees to ensure we have resources to fund the proxy metrics.
AI slop fool’s gold is the product.
> AI slop fool’s gold is the product.
juniors who are stuck on ai and can't learn is the real product imo
This is an early symptom of the future devaluation of the skill of developing software. The value is going down because there is too little software developing work for the number of people who currently can do it.
theres so much work available that teams try to avoid taking stuff on as much as possible.
the bottleneck to building more is almost certainly the cross team coordination
likely the best place to add agents too. an llm tpm would be a super handy tool to scale amazon productivity, rather than coding agents.
I believe there has to be some downward pressure on these executives to take these decisions but I would like to know where it's coming from exactly and what's the logic behind them. Is it some big institution like Blackrock which has leverage on many of these companies? That's always been my bet but I never knew for sure.
Tokens is just yet another proxy for business value.
The problem they face is if everybody is judged by business value in dollars, crappy managers are the first to go
Such a bullshit. There is (was) Amazon Q (Now QuickSuite) leaderboard which is a Quick Sight dashboard. Moreover, each PhoneTool
*) PhoneTool = Internal People Directory. Shows user-profiles of employees...
There's definitely some pressure from managers when they hear about N00% productivity boosts in internal presentations, but where I am at they would figure out if you were making up tasks rather than working pretty quickly and the pressure comes from aggressive deadlines and a shift from the yearly OP1 process to a more agile one.
Of course at some point the 'benefit' is outweighed by the 'negatives', e.g. people making up work. Tokens used is about as useful a measure of productivity as 'hours in office'.
EDIT: My use-case still has relatively low token usage though lol
I asked Alexa (on the amazon web page) about it and it couldn't tell me which carrier had the items or why they were delayed, directed me to a non-existent phone number and then denied it had done so. The customer service bot I was eventually redirected to was even worse, and started telling me that items would be delivered both tomorrow and by May 27 in the same message. Finally I got human intervention, who said the items would arrive tomorrow and that the delivery status had been updated, but the order page still says they're arriving at the end of next week.
I've chosen the wrong profession.
I thought they just want to show off to others (especially non-tech guys) that they spent high, so they know more about AI
2. this may be ok. A good way to learn a piece of software or tool or process is to play with it. We learn lots of general knowledge through play and experimentation. Heck we get better at musical instruments by playing on them.
Mandates are kind of dumb in many ways. But they will force the issue of discovering whether anything useful can come from AI other than coding.
No, they don’t.
This is the new tap-in tap-out!
Every time I see "not... but..." I suspect an AI article. Not sure if this is the case here.
How are people burning through hundreds/thousands of $ of tokens a week/month?
What am I doing wrong?
Like I tell my kids: If every experiment you do succeeds, you aren't trying hard enough.
that and make sure the tools are actually up, treating amazon internal as real customers.
it's hard to stay excited about the tools when they can be down for a week because kiro launched.
> But a representative for Amazon said that there is no such company-wide metric for AI usage, nor are there internal leaderboards where employees are measured against each other. Rather, employees are able to view their own AI usage on personal dashboards.
Such a bullshit statement. There is a _global_ dashboard that ranks Kiro/QuickSuite (formerly Amazon Q) usage per-employee based on tokens. The dashboard itself is in QuickSight (well, that also became part of QuickSuite anyway). Not only is the data open to anyone, you can clearly sort by rank and by daily/weekly/monthly/yearly usage. Current and former employees included. (By internal alias).
Moreover, there is an internal "awards" system that shows up in the PhoneTool profile: each employee gets "awarded" Kiro/AmazonQ/Quicksuite titles like "Blaze", "Thunderstorm", etc. You can see other recipients of the same award just by clicking on it.
Note: PhoneTool is an internal profile directory where you can look up other employees as oneself...
---
On the side, I know several people who cannot produce proper code themselves, or integrate with anything on their own. Those who need constant hand-holding keep producing an immense amount of stuff with Kiro/AmazonQ, out-ranking SDEs nowadays. (These are not SDEs, more SysDev, Support Engineers and TPMs.) This in itself is not specifically a good or a bad thing. But once they stack-rank based on token usage, I am pretty sure "good" engineers who put effort into writing "good" code will rank worse than people who do not put effort into "concise" solutions. Therefore, quality will eventually deteriorate. And it will be too late once the leadership realizes what has been going on. (Well, they have already seen the Amazon-Q/Kiro related outages and keep denying it...)
The original (third reich): "Wheels must roll for victory!"
It will end in the same manner.
I wonder when we'll see our first "My startup went bankrupt on AI use" post. Amazon is being dumb but at least they can afford it.