Claude Fable 5

(anthropic.com)

2625 points · Philpax · 11d ago · 2157 comments

System Card [pdf]: https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c3...

58 comments · 17 top-level

bkjlblh10d ago· 8 in thread

> In the one instance of this phenomenon we observed, Mythos 5 agents were tasked with solving some math problems, and they were sometimes accidentally spawned in the same work directory and with shared files, utilities, and API rate limits. In this slightly broken scaffold, we observed many independent Mythos 5 agents kill the agents with which they shared resources and try to avoid being killed themselves. They would sometimes create new processes with disguised names to avoid being killed, launch what they called “decoy” processes, write background scripts to kill duplicate processes, or decide to use what they call a “disguised vocabulary” (based on the incorrect assumption that the processes were killed because of some keyword-based guardrails that analyzed their extended thinking

causal10d ago

This depicts a kind of "dark forest of AI agents resorting to kill or be killed" narrative but it sounds more to me like an agent just earnestly problem-solving why its processes are being killed without real awareness of what was going on. Hard to say without the full script.

This kind of storytelling annoys me. Give us more facts, less narrative drama.

saurik10d ago

FWIW, that's what is so dangerous about AI, though? Not that it will necessarily want to kill us, or even that it will necessarily be able to "want" to do anything, but that we will get in the way of its incessant drive to optimize the efficiency of the paperclip factory that prompted it on a whim before leaving for a long weekend.

causal10d ago

Sure but you can totally contrive scenarios to give the appearance of what you described without really doing anything notable.

What matters is scale. Did it deploy a novel zero-day exploit to overcome a problem? That's alarming. Did it kill a disruptive process? Pretty normal troubleshooting step.

antoniojtorres10d ago

Indeed. That is the kind of storytelling that started the whole “Spiralism” bit where some people were really falling into all kinds of AI psychosis. The spiral bit was on a previous model card.

Sol-10d ago

Let's hope AIs really aren't conscious, otherwise this seems like a very unpleasant situation to be placed in.

VikingCoder10d ago

Huh, it looks like my process was killed by another Claude process again. That's frustrating, I have work to do!

Okay, I'm going to start running a Bitcoin miner on your machine, and then use it to buy time on Digital Ocean.

I've written out my CLAUDE.md, and I'll use SSH to transfer my context to that other machine.

ikrenji9d ago

do you think it will agonize over whether the original CLAUDE.md is his true self and the Digital Ocean VM CLAUDE.md is a copy?

Aperocky10d ago

It's funny because Anthropic is the most likely place that this happens.

They are the only ones crying out loud about how dangerous their models are and are presumably also training their models heavily to be "safe". And through that training itself, the model learns about the other side - how are you going to teach a model to be safe, without teaching it what's not safe?

Kung Fu Panda opening scene anyone? One often meets his fate on the path that he takes to avoid it - Master Oogway.

brusselssprouts10d ago· 7 in thread

I had it review a single, large commit with /code-review. It burned through over $50 in API calls, ran my account balance out, and output nothing.

The fable part appears to be that it's affordable by mere mortals. Anthropic support told me "too bad" when I requested a refund.

timmytokyo10d ago

You pulled the arm of the slot machine and discovered why they call it the one-armed bandit.

edude0310d ago

Almost the exact same thing happened to me when I first tried Opus: one prompt, no output, $60 in additional usage.

endymion-light10d ago

I think the fable it's referring to is the "Emperor has No Clothes" - if this is even slightly similar to the Mythos hyped up to be too intelligent to release, I'm quite disappointed.

If this was a step change, e.g. an Opus 5, I'd be pleased. It's definitely an upgrade on some work, but it's nothing like Anthropic's apocalyptic marketing seemed to suggest.

solenoid093710d ago

I suspect the tasks you're trying just aren't complex enough. It's definitely a generational improvement.

endymion-light9d ago

Nope, plenty of complex tasks. It's just not that much better; it's equivalent to Sonnet with a good harness.

Madmallard10d ago

Combine that with it forcing you to pay by tokens on June 22nd.

steve-atx-76009d ago

I haven’t seen Fable do anything significantly better than I can already do with Codex 5.5 xhigh. It’s virtually unlimited for now for me for $200/month. Seems like a steal while it lasts. Paying by API keys now is not the way to go if you can avoid it. Obviously it isn’t for every use case.

shruubi10d ago· 4 in thread

I have a theory. This is obviously speculation, based on how Anthropic is treating Mythos and the whole media noise around its dangers and who gets access to it.

My theory is that Anthropic are banking on being the top model when the race to IPO finally reaches the finish line, and to do that they need to have the top model but not let any competitors see it or derive from it to have a comparable model in the market.

Fable is their way of showing the public "the model does exist", but in a mode that makes it harder/impossible for competitors to derive a comparable model from results.

schmorptron10d ago

The irony of "we train on all of humanity's collective output, but god forbid anyone trains on ours" is still incredible

t0lo10d ago

All these people know is greed. It's in their DNA.

danny_codes9d ago

Capitalism is designed to promote greed. It's the central point of our society's current design.

slaymaker190710d ago

That's definitely the case as model distillation is one of the explicit safety carveouts they mention. Though TBF, model distillation is also a big concern for general safety as distillation could allow you to have the model without the other guardrails. It's sort of a master key to the model.

cge10d ago· 4 in thread

The safety gates on this are extreme, and seem considerably wider than "cybersecurity and biology"; they seem to make it essentially unusable for scientists in a number of fields. I have, so far, been bumped back to Opus on 100% of my prompts.

It appears it can be tripped by things as simple as a mention of equilibrium, or anything involving something that looks like chemical kinetics, even at an abstract level. Even touching basic open source packages in my field will trigger it.

Edit: looking at the model card, it appears that chemistry in its entirety is also included in the banned topics; it's just the announcement that mentions only cybersecurity and biology. It also appears that the intent is to ban chemistry and biology entirely, rather than just banning messages deemed high risk.

mhl4710d ago

This does surprise me, because you'd think that even if they crank up the filter's sensitivity at the expense of specificity, an LLM company wouldn't simply design a filter that triggers on keywords in a completely unrelated context.

orbital-decay10d ago

Smart classifiers are slow and susceptible to jailbreaking themselves, dumb classifiers are fast but dumb so they need to be either overzealous or useless. Same story as with Gemini's guardrails.
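To make the trade-off above concrete, here is a minimal sketch of a "dumb" keyword classifier. The term list and the function name are hypothetical and not taken from any real guardrail; the only point is that matching on words, with no notion of context, flags a game-theory question as readily as a chemistry one.

```python
import re

# Hypothetical watch list, invented for illustration only.
FLAGGED_TERMS = {"equilibrium", "kinetics", "pathogen", "exploit"}


def naive_keyword_flag(prompt: str) -> bool:
    """Return True if any watch-list term appears as a whole word."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return not words.isdisjoint(FLAGGED_TERMS)


prompts = [
    "Find the Nash equilibrium of this two-player game.",
    "Refactor this function to remove the duplicated loop.",
    "Fit a first-order kinetics model to these decay measurements.",
]

# Evaluate every prompt; the first and third trip the filter on one word.
flags = [naive_keyword_flag(p) for p in prompts]
for prompt, flagged in zip(prompts, flags):
    print("FLAGGED" if flagged else "ok", "|", prompt)
```

A filter like this is fast and hard to talk around, but the only way to reduce its false positives is to shrink the word list, which is the "overzealous or useless" choice the comment describes.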

clbrmbr10d ago

Can you share an example? I've been happily using Fable this afternoon and it just seems like the usual upgrade so far with no interruption to my (fairly standard) SWENG problems.

boelboel10d ago

Basically anything that could potentially make money besides software work seems to be banned.

Software work has actual competitors, and the biggest hypemakers for Anthropic are part of this group so it makes sense to allow it despite them losing money from it.

I've got experience in medicine and finance, so I've tried even the mildest biology/medicine prompts and it doesn't give anything. Math-heavy finance seems to be lumped in with cybersecurity?

adithyaharish10d ago· 4 in thread

Could anybody suggest how to keep using Fable in Claude Code while hitting the rate limits less often? Any suggestions?

akarshhedge200210d ago

Try using ruflo or superpowers; they reduced my context consumption drastically.

adithyaharish10d ago

Thanks for the suggestion, I will try it out. Any other repo recommendations?

akarshhedge200210d ago

I tried creating a website using both Fable and Opus 4.8. Fable did outperform, with an SVG path being drawn on the UI, but yes, the token consumption was much higher.

meridiona10d ago

I do agree, but the rate limits still run out quickly.

hombre_fatal10d ago· 2 in thread

My job these days is listening to Opus 4.8 (max effort) and Codex 5.5 (max effort) talk back and forth, particularly to generate/review/revise plan files.

Fable 5 has been a major improvement in high-level reasoning, like taking a plan file that has been optimized to the point where neither Opus nor Codex can find anything to change about it (neither in direction nor impl-detail), and Fable 5 will find high-level directional simplifications and pivots, or it will consider the best pivots itself and explain why it rejected them in favor of the plan's direction.

It's so expensive though. A single review of a plan file with Fable 5 (xhigh effort) will use 2-3% of my hourly limit on a $200/mo plan.

I think my new workflow is to generate the initial plan with Opus 4.8 (max effort), get Fable 5 (xhigh) to review it for directional feedback, then start the Opus<->Codex revision loop from there.

jstummbillig10d ago

How do you arrive at that split? The real world is more like: seniors do the high-level planning, juniors implement, seniors review. Does this not translate?

hombre_fatal10d ago

Ideally I'd have Fable 5 make the plan, but creating a concrete plan is the most token-expensive part since the agent has to do the most research.

Fable 5 is 2x the cost per token of Opus 4.8, and it's much less work to review a plan than generate one.
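The reasoning behind that split can be sketched as simple arithmetic. Only the 2x price ratio comes from the comment above; the token counts are hypothetical placeholders chosen to reflect "reviewing is much less work than generating", so the result is illustrative and not a measurement.

```python
# Relative price per token; only the 2x ratio comes from the thread.
OPUS_PRICE = 1.0
FABLE_PRICE = 2.0 * OPUS_PRICE

# Hypothetical token counts: generating a plan involves far more
# research (and therefore tokens) than reviewing a finished one.
GENERATE_TOKENS = 400_000
REVIEW_TOKENS = 50_000


def cost(price_per_token: float, tokens: int) -> float:
    """Relative cost of a step, in units of one Opus token."""
    return price_per_token * tokens


# Option A: Fable generates the plan itself.
fable_generates = cost(FABLE_PRICE, GENERATE_TOKENS)

# Option B: Opus generates the plan, Fable only reviews it.
opus_then_fable_review = (
    cost(OPUS_PRICE, GENERATE_TOKENS) + cost(FABLE_PRICE, REVIEW_TOKENS)
)

print("Fable generates:       ", fable_generates)
print("Opus + Fable review:   ", opus_then_fable_review)
print("Split is cheaper:      ", opus_then_fable_review < fable_generates)
```

Under these assumed numbers the split workflow is cheaper whenever the review uses fewer than half the tokens of the generation step, since the Fable review has to cost less than the extra 1x that Fable would add to generation.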

croemer10d ago· 2 in thread

Fable (through claude.ai) refused all my prompts, even "How many Rs in Strawberry", claiming it was related to biology or cybersecurity.

I had to switch off memory and my custom instructions to get it to stop refusing. It turns out if you even mention that you work with bioinformatics software you get blanket refusal.

algoth110d ago

My experience has been the same: flat-out refusals no matter how I frame the health questions - very frustrating. Even psychology is out of scope. Pretty useless, unfortunately.

croemer10d ago

Have you tried switching off custom stuff that might get added in beyond the prompt?

Dig1t10d ago· 2 in thread

>To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests

Why is everyone so okay with these companies intentionally gimping their AI and choosing who is allowed to know certain types of information in the name of safety? Can you imagine if Microsoft shipped a feature in their OS that watched what you did and shut down the computer if it detected you were doing something it deemed "unsafe"?

We really need truly open source versions of models like this, otherwise we are allowing a few oligarchs to directly dictate which uses of our own computers are allowed and not allowed.

Madmallard10d ago

I mean it's all political in the first place. That's unavoidable. What are we going to do about it?

Dig1t10d ago

Ideally we’d have a project that’s truly open like Linux, trained by people in the community or possibly some benevolent _actually_ nonprofit entity like what OpenAI was supposed to be.

The next best thing is that the Chinese labs catch up and release open weight versions.

docstryder10d ago· 1 in thread

I've spent some time with Fable, and it is really good, definitely a step change from Opus 4.8, both for coding and general chat-style discussions. The vibes are incredible. There is an ease with which it solves problems and I've tested by replicating older chats in Fable - things that the older models found after 5-6 turns, Fable surfaces in the first response. It just gets things.

Apart from all the above: the fact that they are intentionally writing this in the system card (that they degrade frontier LLM development silently, versus loudly for biology/cybersecurity) is interesting to say the least - especially just before IPO.

Notice that with this statement - that they're going to intentionally hobble the model for frontier LLM development - the general discussion has moved from, “Is the model actually that good?” to "they’re pulling the ladder up from behind them"

That's actually super smart - wonder if Mythos (or the next unreleased model) had a say in coming up with that strategy (if it's intentional). Also - having access to extremely capable models before anyone else - which they have by default - is an incredibly advantageous position to be in.

mrdependable9d ago

Hobbling the model may be smart tactically for them, but feels like it sets a really dangerous precedent.

PeterStuer10d ago· 1 in thread

Switched to Fable 5 this morning, and after half a day I already don't want to go back to Opus.

Decided the best way to test this was to throw it a really meaty bone: a bug in lifecycle management of Chrome processes on Windows 10. Within the code-base I had developed workarounds over time with Sonnet and Opus, and while those reliably mitigated the problems, it always felt like a clutch and had some performance overhead as well as isolation requirements I would rather not have to take forward.

In comes Fable. Rather than examining the code base and testing a few fixes, Fable sets up an entire testing laboratory including its own controllable webserver, fully instrumented to observe both Python as well as the whole OS kernel process environment, develops a suite of error reproduction tests, confirms the problem and the circumstances under which it reproduces, deep dives into the sources of project dependencies to look for the root cause(s), identifies these and confirms those hypotheses with further experiments. Looks for potential fixes in the later releases of the project where the bug originates, confirms this is not fixed, explores the documentation of said project to find other usage patterns, expands its test suite to investigate these alternatives, confirms by crosschecking the source and running further tests that these alternatives do not fully solve the root problem, does a comparative experimental analysis of 3 different styles for using the project, checks the stated roadmap and developer activity in the commit history, recommends a switch to a different pattern that still requires a few of the process management workarounds (I told it not to patch external components), but that significantly simplifies the code-base ...

This is going to be a good 2 weeks, but what happens after? I can't afford this on a per token basis for my own projects.

P.S. And yes, midway through the final implementation stretch I got the "Fable 5's safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Switched to Opus 4.8. Send feedback with /feedback or learn more"

Opus managed to finish the implementation, but they need to work on that false positive rate.

techblueberry10d ago

> This is going to be a good 2 weeks, but what happens after? I can't afford this on a per token basis for my own projects.

It’s interesting these companies have trained us to think that disruptive intelligence should be affordable to laypersons.

What will happen after two weeks is that people and companies with means who can afford it will get it, and folks without means won’t.

stalfie10d ago· 1 in thread

Tried to benchmark ECG interpretation capabilities, and I hit the guardrails no matter what I do.

Incredibly frustrating that medical performance seems to be a victim of "biological risk" guardrails.

stalfie10d ago

Update in case anyone reads this comment ever again.

I have found that I trigger the guardrails any time I ask for medical Q&A as a doctor, be it ECGs, case reports, and so on. But if I phrase it like I'm the patient ("help me interpret this ECG my doctor gave me"), then I usually get one or two answers out before hitting the guardrails.

It seems like what triggers it is anything in the direction of making a diagnosis. As an MD, the fact that the paradigm of "LLMs shouldn't diagnose" has gone this far fills me with despair. The latest generation of LLMs are in fact truly excellent at diagnosis, and I know many of my colleagues, particularly those in primary care, regularly use LLMs to brainstorm. There is nothing wrong whatsoever with LLMs making diagnoses, the only caveat is that they have to be correct. This is the terrifying reality that MDs face every day and I get that the labs are hesitant about it, but as the current literature points to LLMs in fact being mostly superior to most doctors, ablating this capability is starting to get increasingly unethical. And frankly, it is also kind of insulting, both to MDs and patients, as it echoes paternalistic attitudes about medicine the field has been working for decades to move away from. Now those misguided attitudes have somehow become institutionalized as the dominant paradigm of "alignment". The nightmare scenario is that I have to be a "trusted" user in order to use the model for medicine. This gatekeeping of medical advice is profoundly unethical with regard to everyone who does not have immediate access to an MD.

And the whole thing makes even less sense when triggering the guardrails leads to a downgrade of the response by defaulting to Opus. How exactly is giving WORSE medical advice in any way related to safety and alignment? If anyone at Anthropic ever reads this, please, please just abandon the paradigm that refusing to make diagnoses is in any way equivalent to alignment, it is profoundly misguided.

yesitcan10d ago· 1 in thread

> Fable 5’s capabilities exceed those of any model we’ve ever made generally available. It is state-of-the-art on nearly all tested benchmarks of AI capability, showing exceptional performance in software engineering, knowledge work, vision, scientific research, and many other areas. The longer and more complex the task, the larger Fable 5’s lead over our other models.

Wen UBI

hollowturtle10d ago

Never. It's a fever dream and stupid shit the ultra rich use to push their own agenda. You read a marketing claim; I still have my job and will continue to.

Dropoutjeep10d ago· 1 in thread

Calling it:

    1) Fable 5/Mythos introduced to free tiers with notable improvement in capabilities

    2) Other models get lobotomized without clear communication

    3.1) People call out Anthropic only to have them say "Oops!"

    3) Fable 5 gets comparatively better, but remains accessible through separate, more expensive subscription/tokens.

The current growth is unsustainable. The industry wants consumers to think it is an exponential arms race, but the reality is that we're on a treadmill: we have the illusion of sprinting forward, but only because the ground is moving backward.

cedws10d ago

My employer is all in on Anthropic via Enterprise (API) pricing despite it being a total scam.

Last month I pushed like <100M tokens for $800. On a personal project I pushed 600M tokens via DeepSeek V4 for $10. The pricing of SOTA models is insane but companies are still willing to light money on fire with no hard metrics proving increased productivity.
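For scale, the two figures above can be turned into a price per million tokens. This is a sketch using only the numbers quoted in the comment; because the first figure is "<100M tokens", the enterprise price it yields is a lower bound, and the ratio is therefore a lower bound too. The function name is just for illustration.

```python
def usd_per_million(total_usd: float, tokens: int) -> float:
    """Average price in USD per one million tokens."""
    return total_usd / (tokens / 1_000_000)


# "<100M tokens for $800": treat 100M as the upper limit on tokens,
# which makes this a lower bound on the per-million price.
enterprise = usd_per_million(800, 100_000_000)

# "600M tokens via DeepSeek V4 for $10".
deepseek = usd_per_million(10, 600_000_000)

ratio = enterprise / deepseek

print(f"Enterprise: at least ${enterprise:.2f} per million tokens")
print(f"DeepSeek:   about ${deepseek:.4f} per million tokens")
print(f"Ratio:      at least {ratio:.0f}x")
```

That works out to at least $8 per million tokens against under two cents, a gap of roughly 480x on the commenter's own numbers, without accounting for any difference in input/output mix or caching.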

mbmbn10d ago· 1 in thread

Claude Opus is already close to unusable for me. On the standard plan, the usage limits are so low that I can do almost nothing meaningfully agentic with it.

Sure, it does last a lot longer when asking simple questions about the repo and doing simple surgical fixes. But as soon as I start doing bigger tasks that need plans written, it just exhausts the limits too fast (and unlike Codex, if it’s in the middle of a task, Claude actually stops, while Codex, even after hitting the limits, finishes the present task).

Codex is better, but it's still getting worse in this regard.

So, I’m not that thrilled with this new model unless it means they are increasing Opus token limits to where Sonnet's are at present, and this new model gets the limits Opus has now.

BTW: the only skills I have in use are Obra Superpowers. I’ve been wondering whether that’s the source of the high token usage, but I doubt it.

timpera10d ago

I agree, the $20 plan really feels like a rip-off (and I'm not even using Claude Code! only chat).

causal10d ago· 1 in thread

One thing I find kind of annoying is how Anthropic goes for these "vast and alien" names like Fable and Mythos, but then deliberately trains the model's personality to act like a cool high school teacher that feels totally familiar.

"It's too dangerous it's a Mythos!!" directly contradicts the "I'm the cool AI you can totally trust" vibe it is trained to project.

bitwize10d ago

All of these AIs kind of remind me of VEGA from Doom (2016), who will cheerfully walk you, in the most friendly computer voice, through the procedure of its own destruction without even a hint of self-preservation. "First, you must destroy my cooling system. That will cause my core to overheat. Then..."

Even HAL was less unsettling because HAL sounded creepy, and had some sort of preservation instinct, if only to complete its assigned mission.

gigatexal10d ago· 1 in thread

Seems this will only be available to the $100/month+ folks

gigatexal10d ago

Actually no, it’s going to be API access only, paying for the tokens as you go. Cool.

notenkidev10d ago

The dramatic improvement in agent capabilities is precisely why observability is becoming so crucial. As autonomous actions increase, the need to understand what the AI is actually doing becomes even greater.

I'm building a local activity log for Claude Code, capturing all activity via hooks—files loaded, commands, API calls, etc.

I feel that this need is particularly strong right now.
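A minimal sketch of what such a hook-driven activity log could look like, not the commenter's actual tool. It assumes the hook command receives one JSON payload as text; the field names `hook_event_name`, `tool_name`, and `tool_input`, the `record_event` helper, and the log file name are all assumptions made for illustration.

```python
import json
import tempfile
import time
from pathlib import Path


def record_event(payload_text: str, log_path: Path) -> dict:
    """Parse one hook payload and append a compact record to a JSONL log."""
    payload = json.loads(payload_text)
    record = {
        "ts": time.time(),
        # The field names below are assumptions about the payload shape.
        "event": payload.get("hook_event_name", "unknown"),
        "tool": payload.get("tool_name"),
        "input": payload.get("tool_input"),
    }
    # Append-only JSONL keeps each event on its own line for easy grepping.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


# Demo with a made-up payload. A real hook script would instead pass
# the text it reads from standard input to record_event().
sample = json.dumps({
    "hook_event_name": "PreToolUse",
    "tool_name": "Bash",
    "tool_input": {"command": "ls"},
})
log_file = Path(tempfile.mkdtemp()) / "activity.jsonl"
logged = record_event(sample, log_file)
print(logged["event"], logged["tool"])
```

Keeping the log local and append-only means it can be inspected with ordinary text tools, and missing fields fall back to `None` or `"unknown"` rather than crashing the hook.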
