Source? (Even if rumor)
> Meta’s new foundational A.I. model, which the company has been working on for months, has fallen short of the performance of leading A.I. models from rivals like Google, OpenAI and Anthropic on internal tests for reasoning, coding and writing, said the people, who were not authorized to speak publicly about confidential matters.
> The model, code-named Avocado, outperformed Meta’s previous A.I. model and did better than Google’s Gemini 2.5 model from March, two of the people said. But it has not performed as strongly as Gemini 3.0 from November, they said.
> They added that the leaders of Meta’s A.I. division had instead discussed temporarily licensing Gemini to power the company’s A.I. products, though no decisions have been reached.
https://www.nytimes.com/2026/03/12/technology/meta-avocado-a...
OpenAI has the mindshare, but they're going to have to decide whether to allocate their limited compute to free users or go all in trying to keep up with Anthropic in enterprise.
Maybe better phrasing is “HCI paradigm”, but that somehow manages to say everything and nothing.
(Of course at that point it involves memory and context management and so on, so you're testing the harness as well as the model.)
It doesn't though
People like to hate on Meta regardless of whether it's justified or not. Not saying it isn't justified, just that it's many people's default bias.
This problem will be solved shortly with better AI (if it hasn't essentially been solved already).
No more humans in the loop, much lower costs for social media manipulation. Welcome to the future!
I also had a poke around with the tools exposed on https://meta.ai/ - they're pretty cool, there's a Code Interpreter Python container thing now and they also have an image analysis tool called "container.visual_grounding" which is a lot of fun.
But GLM-5.1 has the best NORTH VIRGINIA OPOSSUM ON AN E-SCOOTER: https://simonwillison.net/2026/Apr/7/glm-51/
I guess I will have to wait. I hope at least soon it will be available on Openrouter. Overall, I am really excited to try it out.
So many different companies are going to have similarly powerful AI that there will be no moat around it and it will be cheap. They will never earn their investment back.
That said, there's nothing like the real thing.
The risk is something like the railroad bubble and the dotcom bubble: over-investment, circular revenue, and a timeline that doesn't work.
Or, maybe it'll work out.
This, as it turned out, was not true for railroads - more and more railroads isn't a good thing.
The real dilemma facing the model producers is that all this money invested in a general model, targeting general intelligence, is a disaster, and the investment in existing assets is essentially a write-off. On top of that, if this is true, you've got data centres full of compute that aren't being used.
They find an arbitrary intelligence cutoff point between Opus and Mythos, label it "acceptable risk", and then the labs coordinate to gradually nudge that line forward and hope the internet doesn't break?
If they somehow do fail, then the output of that process will be fantastic open weight models (and hopefully some leaks). I want to say those will pay dividends for decades... but a better prediction is that they will be obsolete within three months ;)
Please take a moment to step outside the tech bubble. Neither my neighbor (a hair stylist) nor the carpenter fixing up her kitchen cabinets is "using" AI. They might get Gemini text when googling something, though they often scroll past it because they don't trust it. And they get lots of fake videos when scrolling YouTube, which increasingly annoys them. The only times they are in touch with AI is when it's forced upon them, and otherwise they are living a pretty good life without any of this.
There is no objective evidence of anything you’ve said. It isn’t even clear if AI has contributed positively to global economic growth. It reminds me a lot of the late 90s and the dot-com mania. Slapping a domain on a commercial would make your stock go up even if there was no substance to any of it.
The real shame is this mania drowns out serious, practical use cases because when the bubble collapses, the market will throw the baby out with the bathwater.
You're in a bubble.
https://www.helpnetsecurity.com/2026/04/07/google-llm-conten...
And further down the line in chips, which is why Elon is building a fab now.
There are plenty of capable models on HuggingFace, yet I have no way of running them.
If the average user gets convinced they could run LLMs for cheap at home, you cannot trap users in your walled garden anymore.
I was saying this for years about Tesla’s FSD - they finally had to give in and drop the price to stay competitive.
SpaceX is an engineering masterpiece with how they revolutionized the space industry.
At least he says he's doing that. It doesn't really make sense, since you're not going to achieve an advanced node from a standing start in a practical time frame and at a practical cost.
Sounds like more Musk flavored vapor.
They already announced a partnership with Intel.
Major analytical errors in its responses to several of my technical questions.
Otherwise you're doomed to "sample size of one" level of relevance.
>"Ask Meta AI..." placeholder.
>Colourful blue Send button.
>Eager to try, entering question... hitting Send.
>Log in or create an account to access.
>15 seconds of loading time
>Continue with Facebook or Instagram
Typical Meta move, throwing a dark pattern at you from the beginning instead of just letting you try it.
Won't even bother to continue, somehow OpenAI got this right.
I just posed the identical prompt/document to Muse Spark and it knocked it out of the park, extracted and displayed the pertinent pages from a multi-page PDF inline in the chat and rendered a correct answer.
This may be a one-off or a lucky start, but given the incredible result out of the gate I'm optimistic and will continue testing in parallel against other models before potentially making it my primary daily driver, excluding coding, where the harnesses of Claude Code and Codex are still needed (although hopefully they release something in this space too).
That being said Meta has the most adversarial data-usage policies I've seen among LLM providers so that's unfortunate for handling anything sensitive, but it also stands to reason that they have a long term advantage with such a massive proprietary data set. I'd prefer to also have a paid plan like the other services that allows me to keep my data out of training, rather than a free service and my usage being monetized in other ways.
While it's true Llama 4 sucked, I still can't help feeling they have lost ground compared to where they would have been if they had maintained that strategy. Thanks to Llama, they were considered a peer of the other frontier model providers. Now they are not even in the conversation. It would take an incredible shift in performance to make me even consider using their new model. They may have a model, but the other providers have been busy building whole ecosystems around their tech, which Meta has none of.
Maybe they could dump $1b into OpenCode or something and reignite the open ecosystem play with an open harness. They need something to get back in the conversation, if that's where they want to be. Otherwise, it will just be another closed, hidden proprietary AI model driving user facing Meta apps, but which nobody else cares about.
Meta hasn’t fully caught up, but they came close and I think can solidly claim to be a frontier lab again. I’d call it a 3.5 horse race right now, and hopefully their next model improves. More model competition is good!
Poor Grok 4.2 should probably be dropped from the table.
Unfortunately with LLMs everything depends on your use case, domain, and the context you give it. I also use Grok daily for health questions, as the other models are too afraid to give input on medical matters.
Do they mean "the chain of thought is visible to the user" (i.e. not hidden like ChatGPT's), or "the medium of the chain of thought is not text, but visuals" (i.e. thinking in images)?
I'd guess the former, since it wouldn't be economical to generate transient images just for thinking. But I'm not sure why they'd highlight that in that case. If it were the second thing, that'd be extremely interesting: the first model not to think in text.
To a first approximation, all "chain of thought" means is that instead of having to prompt the model to discuss everything in text and then decide at the end[1], it now sort of does that automatically, so you don't need to prompt it.
[1] Which used to bring about very substantial improvements in performance on some tasks
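That "discuss everything in text, then decide" pattern was just a prompting convention. A minimal sketch of what it looked like in practice (the wording here is illustrative, not any lab's actual template):

```python
# Minimal sketch of manual chain-of-thought prompting: before reasoning
# models, you appended an instruction asking the model to write out its
# reasoning first and commit to a final answer only at the end.
def with_cot(question: str) -> str:
    return (
        f"{question}\n\n"
        "Think through this step by step, writing out your reasoning, "
        "and give your final answer only on the last line."
    )

prompt = with_cot("Which number is bigger, 9.11 or 9.9?")
print(prompt)
```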
Finding it a little tricky to evaluate, because the harness is unfortunately very, very bad (e.g. search is awful). Can't wait to try this in some real external services where we can see how it performs for real.
Definitely getting ordinarily high-quality results, overall. But it's hard to test agentic behavior, and even prose quality, when just working off the default chat interface.
One thing that stands out is that _for_ the quality, it feels very, very fast. Perhaps it's just only very lightly loaded right now, but regardless, it's lovely to feel.
I'm quite impressed with the tone overall. It definitely feels much more like Opus than it does, like, GPT or Grok in the sense that the style is conversational, natural and enjoyable.
How does one get their hands on these models? They are not open-source, right? I go to meta.ai, but it's just a chat interface - no equivalent to Codex or Claude Code? Can you use this through OpenCode? Is Meta charging for model access, or is the gathering of chat data a sufficiently large tithe?
from Facebook Newsroom: https://about.fb.com/news/2026/04/introducing-muse-spark-met...
Note: I'm expressing some skepticism here largely due to how recent rollouts from Meta flopped. Sincerely hoping that they do better this time around!
- Hacker News Guidelines https://news.ycombinator.com/newsguidelines.html
While working on a web-based graphics editor, I've noticed that users upload a lot of PNG assets with this problem. I've never tracked down the cause... is there a popular raster image editor which recently switched to dithered rendering of gradients?
...and so it's stuck, two decades on haha
The result for that specific image is 500 KB, an 85% decrease in size.
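For anyone curious why dithered gradients bloat PNGs so much: PNG's filtering stage feeds into DEFLATE (zlib), and high-frequency dither noise defeats both the match-finding and the entropy coding. A toy stdlib-only sketch, using random-threshold dithering as a stand-in for whatever editor produced those assets:

```python
# Toy demonstration of why dithered gradients compress badly. PNG's
# compressor is DEFLATE (zlib), so comparing zlib output sizes of a
# smooth gradient vs a dithered one approximates the effect.
import random
import zlib

W, H = 256, 256

# Smooth 8-bit horizontal gradient: every row is identical and highly
# predictable, so zlib shrinks it dramatically.
smooth = bytes(x for _ in range(H) for x in range(W))

# Random-threshold dithering to two levels: the high-frequency noise it
# introduces is exactly what defeats DEFLATE.
rng = random.Random(0)
dithered = bytes(
    255 if x > rng.randrange(256) else 0
    for _ in range(H) for x in range(W)
)

smooth_size = len(zlib.compress(smooth, 9))
dithered_size = len(zlib.compress(dithered, 9))
print(smooth_size, dithered_size)  # the dithered buffer compresses far worse
```

Error-diffusion dithering (what most editors actually use) behaves similarly: the noise carries real information, so no lossless compressor can make it small again.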
(But today is not that day.)
This article is about Meta, not about the user. Who signs off on these? Is the intended audience other people at Meta, not the user?
They want to 1) attract talent, 2) tell wall street they can play in this space as well, 3) help employees feel the company is moving in the right direction.
A frontier LLM doesn't apply to their core consumer products.
Not sure what this is now.
Well, the original Llama did kick off the era of open source LLMs. Most early open source LLMs were based on the Llama architecture. And look where we are now: OSS models are very close to the frontier.
It may not have benefited Meta, but it commoditized LLMs.
For those reading fast, this isn't a reference to xAI's Grok; this is Groq.com - with its custom inference chip, and offerings like https://groq.com/blog/introducing-llama-3-groq-tool-use-mode... and https://console.groq.com/landing/llama-api
You are right though. Meta could have been in lockstep releasing ChatGPT features into some chat bot on Facebook.com but instead it seemed like their FAIR arm was hell bent on commoditising this stuff by publishing their research models before the Chinese companies took the lead in that.
It’s hard for me to be mad at FAIR, even though I generally disagree with the outcomes that Meta produces for its users.
Especially looking at these numbers after Claude Mythos, it feels like either Anthropic has some secret sauce, or everyone else is dumber compared to the talent Anthropic has.
I think it’s unrealistic to expect them to come back from that pit to the top in one year, but I wouldn’t rule them out getting there with more time. That’s a possible future. They have the money and Zuckerberg’s drive at the helm. It can go a long way.
If they actually matched Opus 4.6 on such a short timeline, it would have been mighty impressive. (Keep in mind this is a new lab and they are prohibited from doing distills.)
Meta's performance process is essentially "show good numbers or you're out." So guess what people do when they don't have good numbers? They fudge them. Happens all across the company.
Their whole "training the LLM to be a person" technique probably contributes to its pleasant conversational behavior, and making its refusals less annoying (GPT 5.2+ got obnoxiously aligned), and also a bit to its greater autonomy.
Overall they don't have any real moat, but they are more focused than their competition (and their marketing team is slaying).
For example, Claude has a "turn evil in response to reinforced reward hacking" behavior which is a fairly uniquely Claude thing (as far as I've seen anyhow), and very likely the result of that attempt to imbue personhood.
Might as well not release anything.
Yup, it's called test-time compute. Mythos is described as plenty slower than Opus, enough to seriously annoy users trying to use it for quick-feedback-loop agentic work. It is most properly compared with GPT Pro, Gemini DeepThink or this latest model's "Contemplating" mode. Otherwise you're just not comparing like for like.
Why can't others easily replicate it?
Many labs aren't able to keep up with the frontier, xAI, Mistral
Do we have data to substantiate that claim?
It's not just about LLMs, it's about being able to model consumers and markets and psychology and so on. Meta is also big in the manipulation side of things, any sort of cynical technological exploitation of humans you can imagine but that is technically legal, they're doing it for profit.
I can think of at least two reasons. Price and customizability. If they train their own models on their own data, they potentially have a better model at a better price, and they're not at the mercy of Anthropic's decisions when they decide to raise prices. Additionally, if you use someone else's model, you use it the way they create it and permit you to use it. In a couple years, who has any idea how these models are used. Arguably, a company the size of Meta should be in control of their AI models.
1) Meta was doing this at scale before OpenAI
2) decent ML is critical to categorising content at scale; the more accurate and fast the categorisation, the finer the recommendations can be (i.e. instead of "woman, outside" as tags for a video: woman, age, hair colour, location, subjects in view, main subject of video, video style). Doing that as fast as possible with as little energy as possible is mission critical
3) The Llama leak basically evaporated the moat around OpenAI, who _could_ have become a competitor
4) for the AR stuff, all of these models (and visual models) are required to make the platform work. They also need complete ownership so that it can be distilled to make it run on tiny hardware
5) dick swinging
6) they genuinely want to become an industrial behemoth, so robots, hardware, etc. are now all in scope.
Secondly though, I think it has to do with the fact Meta is big enough to worry about vertical integration and full control of their business.
The whole reason they've been trying to make AR/VR happen for over a decade now is the assumption of a worst-case and a best-case scenario. The worst case is that Apple and Google want them gone. This isn't as far-fetched as it seems: Google has historically been Meta's biggest competitor and even tried to release its own social network back when Meta was threatening them. If either pulls Meta's apps from their respective stores, it'd be an immense blow to Meta; their whole trillion-dollar business depends on competitors' platforms.
Meta tried making inroads into the phone business but failed; it is a very crowded market after all. So they changed their strategy. Instead of playing catch-up, they'd invent "the next iPhone" and be first to a brand new market. This is the best-case scenario: they invent a new platform where they can be dominant from day 1 and stop depending on competitors' hardware, not only removing that risk factor but also unlocking a new market they can control.
AI ties into all this because it appears to be key for this next platform to happen. You will communicate with these smart glasses via voice, hand gestures, or subtle movements that a model will have to interpret. The features that could make them stand out as more than just a screen on your face are all AI-related: object detection, world understanding, context awareness, etc. If all this were done via a third party, Meta would effectively be back at square one: a provider could easily yank away its model access, or sell out to a competitor. Meta would again be at the mercy of others.
Compared to other big-tech players, I think it's easy to see how Meta is in a riskier position. There's little Google or Microsoft can do to kill the iPhone. There's little Apple or Google can do to kill Amazon's online store. There's little Amazon or Apple can do to kill Microsoft's business deals. Google and Meta are primarily in the business of capturing people's data, attention, and selling ads, and both Google and Apple could do quite some damage to Meta. Beyond expanding it, it's important for them to invest in ways to protect their money-printing machine.
Or any quality control (people missing posts)
Or banning the people who should be banned while leaving everyone else alone
This is Zuck: https://news.ycombinator.com/item?id=4151433 or https://news.ycombinator.com/item?id=10791198
But he has to do it anyway; otherwise Meta can be disrupted easily.
Google and Apple have hardware and distribution channels for their products
Amazon has the marketplace and cloud
Microsoft has enterprise and cloud
Meta is always looking for ways to stay afloat
If Spark beats Opus 4.6, why is Meta wasting money on Opus internally?
Just speculation; I have no real knowledge about it.
What could have been interesting has been reduced to simply another subpar LLM release.
"this is step one. bigger models are already in development with infrastructure scaling to match. private api preview open to select partners today, with plans to open-source future versions. incredibly proud of the MSL team. excited for what’s to come!"
The goal of public companies is generally to generate profit for their investors.
Love to see it. Cheers!
I Googled it and found absolutely nothing.
Well, to be honest, I got 100% of websites containing the French word "boîtier" (box) with a typo.
Even on Google Scholar, the closest match is "BioTiER (Biological Training in Education and Research) Scholars Program", which is at least 10 years old and has nothing to do with that.
Is that an AI-generated image with an AI-generated name that has no physical existence?
I suspect it is because they also refactored Meta AI entirely to use Next.js instead of the normal stack they use for literally everything else. Not sure why they would do this, but I guess it works (...or maybe not) for them.
If Meta wants to be seen as a cutting edge massive lab they need to come across as one instead of looking like a school project version of a frontier model.
(I'm not using it as I'm not agreeing to their ad terms).
I’m trying to decide if I find the doublespeak a bit offensive or not.
It nailed all the ChatGPT meme gotchas (walk to the carwash, Alice 50 brothers, upside down cup, R's in strawberry, which number is bigger, 9.11 or 9.9?)
I guess all that money poaching OpenAI / Anthropic talent went somewhere...
Now, would I use "Meta Muse Code" or "Muse CoWork" if I have to get a Facebook account for all of my developers? Maybe not.
Would I use it via an API key? I might, depends on the pricing!
We spent time yesterday arguing through an architecture decision. Today I ask the Agent to help implement it - it knows nothing about any of that. You’re effectively starting over.
Feels like the real problem isn’t intelligence, it’s continuity. And most benchmarks don’t even touch that.
I tried multiple riddles, graphs, and questions I know some LLMs fail at, but this one seems to do well. But I still don't have much trust in Meta after the scandal of them fiddling with their previous models to look good.
The same is true with any other model, unless otherwise stated.
In the next few days, we'll see who Meta has paid to promote this model on social media.
Also, I think people aren't used to the fact that using these models requires meta.ai or the Meta AI app.
The impressive part is multimodality, very plausible since there's less focus there by other labs (especially Anthropic)
> Think longer to solve harder problems
> Compress
> Think longer again
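A hypothetical sketch of that loop, just to show the control flow. Here `llm` and `compress` are stubs I made up, standing in for a real model call and a real "summarize your own reasoning" step:

```python
# Hypothetical control-flow sketch of a "think longer -> compress ->
# think longer again" loop. `llm` and `compress` are stubs; only the
# loop structure is the point.
def llm(prompt: str) -> str:
    # Stub: a real implementation would call a model endpoint here.
    return f"(thoughts on: {prompt[:30]})"

def compress(notes: str, budget: int) -> str:
    # Stub: a real implementation would ask the model to summarize its
    # own reasoning; here we just truncate to a character budget.
    return notes[:budget]

def think_compress_think(problem: str, rounds: int = 3, budget: int = 120) -> str:
    context = problem
    for _ in range(rounds):
        thoughts = llm(context)                                # think longer
        context = compress(context + "\n" + thoughts, budget)  # compress
    return context

final_context = think_compress_think("some hard problem")
print(final_context)
```

The point of the compress step is that the working context stays bounded no matter how many thinking rounds you run, at the cost of losing detail each cycle.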
I don't like that I need to log in to my FB/Instagram account to access this.
I doubt it's better than Opus, and even if it were, it's not worth the privacy concerns.
So does cooperation in any framework that values public good over pure obedience to an inherently-abusive late stage capitalism. I know that's passé in a world where the US government no longer believes in funding science, and yet.
Competition is also inherently wasteful. And if you're talking about wasting a few K or a few Mil here or there, fine, whatever. But here we're talking about waste on the order of trillions of dollars at the end of the day.
Not my loss; I'll keep using DeepSeek then. Wake me up when my country is no longer on the wrong/right side of history.
Edit: nvm I can't read, regular benchmarks against SOTA are there
Besides, I'm old enough to recall that Meta trained a version of Llama 4 specifically for LM Arena Elo benchmaxxing and PR purposes, and then proceeded to release a different version of Llama 4.
Maybe they need to mine more Libra coin first? Or is it Diem now? Is that even still part of Meta?
I'm sure this new AI is super intelligent and super awesome and will be writing all the code, making all the blog posts, and generating all our youtube shorts in 6 months.
yeah, the metaverse got abandoned. Also: Meta was the only one to try the concept for the past umpteen years, even though everyone in the industry goes ga-ga over virtual reality worlds and workplaces at every opportunity. It's literally Meta and Linden Lab (which has been on life support for 10+ years).
The alternative is: no one does it and nothing gets abandoned, which the industry has shown itself to be exceedingly good at w.r.t. VR for the past 40+ years.
To be clear: I have no faith in Meta as a company; my problem lies in kicking an entity because it attempted something different. I don't think that's productive, and it produces things like the past AI winters, because groups get afraid of ever touching experimental concepts again lest they incur the wrath of the shareholder.
We keep seeing things being overhyped, with not much thought behind them. Meta is particularly bad about it. They changed their name for the hype of their VR product when VR was still niche and had a long way to go, and still does. They couldn't even figure out legs for launch.
Now they have a 'superintelligence'? Yeah, that sounds like just the latest in a line of bullshit. Why would this be different?