Source? (Even if rumor)
> Meta’s new foundational A.I. model, which the company has been working on for months, has fallen short of the performance of leading A.I. models from rivals like Google, OpenAI and Anthropic on internal tests for reasoning, coding and writing, said the people, who were not authorized to speak publicly about confidential matters.
> The model, code-named Avocado, outperformed Meta’s previous A.I. model and did better than Google’s Gemini 2.5 model from March, two of the people said. But it has not performed as strongly as Gemini 3.0 from November, they said.
> They added that the leaders of Meta’s A.I. division had instead discussed temporarily licensing Gemini to power the company’s A.I. products, though no decisions have been reached.
https://www.nytimes.com/2026/03/12/technology/meta-avocado-a...
OpenAI has the mindshare, but they're going to have to decide whether to allocate their limited compute to free users or go all in trying to keep up with Anthropic in enterprise.
Maybe better phrasing is “HCI paradigm”, but that somehow manages to say everything and nothing.
(Of course at that point it involves memory and context management and so on, so you're testing the harness as well as the model.)
It doesn't though
People like to hate on Meta regardless of whether it's justified or not. Not saying it isn't justified, just that it's many people's default bias.
This problem will be solved shortly with better AI (if it hasn't essentially been solved already).
No more humans in the loop, much lower costs for social media manipulation. Welcome to the future!
I also had a poke around with the tools exposed on https://meta.ai/ - they're pretty cool, there's a Code Interpreter Python container thing now and they also have an image analysis tool called "container.visual_grounding" which is a lot of fun.
But GLM-5.1 has the best NORTH VIRGINIA OPOSSUM ON AN E-SCOOTER: https://simonwillison.net/2026/Apr/7/glm-51/
I guess I will have to wait. I hope at least soon it will be available on Openrouter. Overall, I am really excited to try it out.
So many different companies are going to have similarly powerful AI that there will be no moat around it and it will be cheap. They will never earn their investment back.
That said, there's nothing like the real thing.
The risk is something like the railroad bubble and the dotcom bubble: over-investment, circular revenue, and a timeline that doesn't work.
Or, maybe it'll work out.
This, as it turned out, was not true for railroads - more and more railroads isn't a good thing.
The real dilemma facing the model producers is that all this money invested in a general model, targeting general intelligence, is a disaster, and the investment in existing assets is essentially a write-off. On top of that, if this is true, you've got data centres full of compute that aren't being used.
They find an arbitrary intelligence cutoff point between Opus and Mythos, label it "acceptable risk", and then the labs coordinate to gradually nudge that line forward and hope the internet doesn't break?
If they somehow do fail, then the output of that process will be fantastic open weight models (and hopefully some leaks). I want to say those will pay dividends for decades... but a better prediction is that they will be obsolete within three months ;)
Please take a moment to step outside the tech bubble. Neither my neighbor (a hair stylist) nor the carpenter fixing up her kitchen cabinets is "using" AI. They might get Gemini text when googling something, though they often scroll past it because they don't trust it. And they get lots of fake videos when scrolling YouTube, which increasingly annoys them. The only times they are in touch with AI is when it's forced upon them, and otherwise they are living a pretty good life without any of this.
There is no objective evidence of anything you’ve said. It isn’t even clear if AI has contributed positively to global economic growth. It reminds me a lot of the late 90s and the dot-com mania. Slapping a domain on a commercial would make your stock go up even if there was no substance to any of it.
The real shame is this mania drowns out serious, practical use cases because when the bubble collapses, the market will throw the baby out with the bathwater.
You're in a bubble.
https://www.helpnetsecurity.com/2026/04/07/google-llm-conten...
And further down the line in chips, which is why Elon is building a fab now.
There are plenty of capable models on HuggingFace, yet I have no way of running them.
If the average user gets convinced they could run LLMs for cheap at home, you cannot trap users in your walled garden anymore.
I was saying this for years about Tesla’s FSD - they finally had to give in and drop the price to stay competitive.
SpaceX is an engineering masterpiece with how they revolutionized the space industry.
At least he says he's doing that. It doesn't really make sense, since you're not going to achieve an advanced node from a standing start in a practical time frame and at a practical cost.
Sounds like more Musk flavored vapor.
They already announced a partnership with Intel.
Major analytical errors in its responses to several of my technical questions.
Otherwise you're doomed to "sample size of one" level of relevance.
>"Ask Meta AI..." placeholder.
>Colourful blue Send button.
>Eager to try, entering question... hitting Send.
>Log in or create an account to access.
>15 seconds of loading time
>Continue with Facebook or Instagram
Typical Meta move, throwing a dark pattern at you from the beginning instead of just letting you try it.
Won't even bother to continue, somehow OpenAI got this right.
I just posed the identical prompt/document to Muse Spark and it knocked it out of the park, extracted and displayed the pertinent pages from a multi-page PDF inline in the chat and rendered a correct answer.
This may be a one-off or a lucky start, but given the incredible result out of the gate I'm optimistic and will continue testing in parallel against other models before potentially making it my primary daily driver, excluding coding, where the harnesses of Claude Code and Codex are still needed (although hopefully they release something in this space too).
That being said Meta has the most adversarial data-usage policies I've seen among LLM providers so that's unfortunate for handling anything sensitive, but it also stands to reason that they have a long term advantage with such a massive proprietary data set. I'd prefer to also have a paid plan like the other services that allows me to keep my data out of training, rather than a free service and my usage being monetized in other ways.
While it's true Llama 4 sucked, I still can't help feeling they have lost ground compared to where they would have been if they had maintained that strategy. Thanks to Llama, they were considered a peer of the other frontier model providers. Now they are not even in the conversation. It would take an incredible shift in performance to make me even consider using their new model. They may have a model, but the other providers have been busy building whole ecosystems around their tech, which Meta has none of.
Maybe they could dump $1b into OpenCode or something and reignite the open ecosystem play with an open harness. They need something to get back in the conversation, if that's where they want to be. Otherwise, it will just be another closed, hidden proprietary AI model driving user facing Meta apps, but which nobody else cares about.
Meta hasn’t fully caught up, but they came close and I think can solidly claim to be a frontier lab again. I’d call it a 3.5 horse race right now, and hopefully their next model improves. More model competition is good!
Poor Grok 4.2 should probably be dropped from the table.
Unfortunately with LLMs everything depends on your use case, domain, and the context you give it. I also use Grok daily for health questions, as the other models are too afraid to give input on medical matters.
Do they mean "the chain of thought is visible to the user" (i.e. not hidden like ChatGPT's), or "the medium of the chain of thought is not text, but visuals" (i.e. thinking in images)?
I'd guess the former, since it wouldn't be economical to generate transient images just for thinking. But I'm not sure why they'd highlight that in that case. If it were the second thing, that'd be extremely interesting: the first model not to think in text.
To a first approximation, all "chain of thought" means is that instead of having to prompt the model to discuss everything in text and then decide at the end[1], it now sort of does that automatically, so you don't need to prompt it.
[1] Which used to bring about very substantial improvements in performance on some tasks
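That "discuss everything in text, then decide" pattern was just a prompting convention. A minimal sketch of what it looked like in practice (the wording here is illustrative, not any lab's actual template):

```python
# Minimal sketch of manual chain-of-thought prompting: before reasoning
# models, you appended an instruction asking the model to write out its
# reasoning first and commit to a final answer only at the end.
def with_cot(question: str) -> str:
    return (
        f"{question}\n\n"
        "Think through this step by step, writing out your reasoning, "
        "and give your final answer only on the last line."
    )

prompt = with_cot("Which number is bigger, 9.11 or 9.9?")
print(prompt)
```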
Finding it a little tricky to evaluate, because the harness is unfortunately very, very bad (e.g. search is awful). Can't wait to try this in some real external services where we can see how it performs for real.
Definitely getting ordinarily high-quality results, overall. But it's hard to test agentic behavior, and even prose quality, when just working off the default chat interface.
One thing that stands out is that _for_ the quality, it feels very, very fast. Perhaps it's just only very lightly loaded right now, but regardless, it's lovely to feel.
I'm quite impressed with the tone overall. It definitely feels much more like Opus than it does, like, GPT or Grok in the sense that the style is conversational, natural and enjoyable.
How does one get their hands on these models? They are not open-source, right? I go to meta.ai, but it's just a chat interface - no equivalent to Codex or Claude Code? Can you use this through OpenCode? Is Meta charging for model access, or is the gathering of chat data a sufficiently large tithe?
from Facebook Newsroom: https://about.fb.com/news/2026/04/introducing-muse-spark-met...
Note: I'm expressing some skepticism here largely due to how recent rollouts from Meta flopped. Sincerely hoping that they do better this time around!
- Hacker News Guidelines https://news.ycombinator.com/newsguidelines.html
While working on a web-based graphics editor, I've noticed that users upload a lot of PNG assets with this problem. I've never tracked down the cause... is there a popular raster image editor which recently switched to dithered rendering of gradients?
...and so it's stuck, two decades on haha
The result for that specific image is 500 KB, an 85% decrease in size.
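For anyone curious why dithered gradients bloat PNGs so much: PNG's filtering stage feeds into DEFLATE (zlib), and high-frequency dither noise defeats both the match-finding and the entropy coding. A toy stdlib-only sketch, using random-threshold dithering as a stand-in for whatever editor produced those assets:

```python
# Toy demonstration of why dithered gradients compress badly. PNG's
# compressor is DEFLATE (zlib), so comparing zlib output sizes of a
# smooth gradient vs a dithered one approximates the effect.
import random
import zlib

W, H = 256, 256

# Smooth 8-bit horizontal gradient: every row is identical and highly
# predictable, so zlib shrinks it dramatically.
smooth = bytes(x for _ in range(H) for x in range(W))

# Random-threshold dithering to two levels: the high-frequency noise it
# introduces is exactly what defeats DEFLATE.
rng = random.Random(0)
dithered = bytes(
    255 if x > rng.randrange(256) else 0
    for _ in range(H) for x in range(W)
)

smooth_size = len(zlib.compress(smooth, 9))
dithered_size = len(zlib.compress(dithered, 9))
print(smooth_size, dithered_size)  # the dithered buffer compresses far worse
```

Error-diffusion dithering (what most editors actually use) behaves similarly: the noise carries real information, so no lossless compressor can make it small again.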
(But today is not that day.)
This article is about Meta, not about the user. Who signs off on these? Is the intended audience other people at Meta, not the user?
They want to 1) attract talent, 2) tell wall street they can play in this space as well, 3) help employees feel the company is moving in the right direction.
A frontier LLM doesn't apply to their core consumer products.
Not sure what this is now.
Well, the original Llama did kick off the era of open source LLMs. Most early open source LLMs were based on the Llama architecture. And look where we are now: OSS models are very close to the frontier.
It may not have benefited Meta, but it commoditized LLMs.
For those reading fast, this isn't a reference to xAI's Grok; this is Groq.com - with its custom inference chip, and offerings like https://groq.com/blog/introducing-llama-3-groq-tool-use-mode... and https://console.groq.com/landing/llama-api
You are right though. Meta could have been in lockstep releasing ChatGPT features into some chat bot on Facebook.com but instead it seemed like their FAIR arm was hell bent on commoditising this stuff by publishing their research models before the Chinese companies took the lead in that.
It’s hard for me to be mad at FAIR, even though I generally disagree with the outcomes that Meta produces for its users.
Especially looking at these numbers after Claude Mythos, it feels like either Anthropic has some secret sauce, or everyone else is dumber compared to the talent Anthropic has.
I think it’s unrealistic to expect them to come back from that pit to the top in one year, but I wouldn’t rule them out getting there with more time. That’s a possible future. They have the money and Zuckerberg’s drive at the helm. It can go a long way.
If they actually matched Opus 4.6 on such a short timeline, it would have been mighty impressive. (Keep in mind this is a new lab and they are prohibited from doing distills.)
Meta's performance process is essentially "show good numbers or you're out." So guess what people do when they don't have good numbers? They fudge them. Happens all across the company.
Their whole "training the LLM to be a person" technique probably contributes to its pleasant conversational behavior, and making its refusals less annoying (GPT 5.2+ got obnoxiously aligned), and also a bit to its greater autonomy.
Overall they don't have any real moat, but they are more focused than their competition (and their marketing team is slaying).
For example, Claude has a "turn evil in response to reinforced reward hacking" behavior which is a fairly uniquely Claude thing (as far as I've seen anyhow), and very likely the result of that attempt to imbue personhood.
Might as well not release anything.
Yup, it's called test-time compute. Mythos is described as plenty slower than Opus, enough to seriously annoy users trying to use it for quick-feedback-loop agentic work. It is most properly compared with GPT Pro, Gemini DeepThink or this latest model's "Contemplating" mode. Otherwise you're just not comparing like for like.
Why can't others easily replicate it?
Many labs aren't able to keep up with the frontier, xAI, Mistral
Do we have data to substantiate that claim?
It's not just about LLMs, it's about being able to model consumers and markets and psychology and so on. Meta is also big in the manipulation side of things, any sort of cynical technological exploitation of humans you can imagine but that is technically legal, they're doing it for profit.
I can think of at least two reasons. Price and customizability. If they train their own models on their own data, they potentially have a better model at a better price, and they're not at the mercy of Anthropic's decisions when they decide to raise prices. Additionally, if you use someone else's model, you use it the way they create it and permit you to use it. In a couple years, who has any idea how these models are used. Arguably, a company the size of Meta should be in control of their AI models.
1) Meta was doing this at scale before OpenAI
2) decent ML is critical to categorising content at scale; the more accurate and fast the categorisation, the finer the recommendations can be (i.e. instead of "woman, outside" as tags for a video: woman, age, hair colour, location, subjects in view, main subject of video, video style). Doing that as fast as possible with as little energy as possible is mission critical
3) The Llama leak basically evaporated the moat around OpenAI, who _could_ have become a competitor
4) for the AR stuff, all of these models (and visual models) are required to make the platform work. They also need complete ownership so that it can be distilled to make it run on tiny hardware
5) dick swinging
6) they genuinely want to become an industrial behemoth, so robots, hardware, etc. are now all in scope.
Secondly though, I think it has to do with the fact Meta is big enough to worry about vertical integration and full control of their business.
The whole reason they've been trying to make AR/VR happen for over a decade now is the assumption of a worst-case and a best-case scenario. The worst case is that Apple and Google want them gone. This isn't as far-fetched as it seems: Google has historically been Meta's biggest competitor and even tried to release its own social network back when Meta was threatening them. If either pulls Meta's apps from their respective stores, it'd be an immense blow to Meta; their whole trillion-dollar business depends on competitors' platforms.
Meta tried making inroads into the phone business but failed; it is a very crowded market after all. So they changed their strategy. Instead of playing catch-up, they'd invent "the next iPhone" and be first to a brand new market. This is the best-case scenario: they invent a new platform where they can be dominant from day 1 and stop depending on competitors' hardware, not only removing that risk factor but also unlocking a new market they can control.
AI ties into all this because it appears to be key for this next platform to happen. You will communicate with these smart glasses via voice, hand gestures, or subtle movements that a model will have to interpret. The features that could make them stand out as more than just a screen on your face are all AI-related: object detection, world understanding, context awareness, etc. If all this were done via a third party, Meta would effectively be back at square one: a provider could easily yank away its model access, or sell out to a competitor. Meta would again be at the mercy of others.
Compared to other big-tech players, I think it's easy to see how Meta is in a riskier position. There's little Google or Microsoft can do to kill the iPhone. There's little Apple or Google can do to kill Amazon's online store. There's little Amazon or Apple can do to kill Microsoft's business deals. Google and Meta are primarily in the business of capturing people's data, attention, and selling ads, and both Google and Apple could do quite some damage to Meta. Beyond expanding it, it's important for them to invest in ways to protect their money-printing machine.
Or any quality control (people missing posts)
Or banning the people who should be banned while leaving everyone else alone
This is Zuck: https://news.ycombinator.com/item?id=4151433 or https://news.ycombinator.com/item?id=10791198
But he has to do it anyway; otherwise Meta can be disrupted easily.
Google and Apple have hardware and distribution channels for their products
Amazon has the marketplace and cloud
Microsoft has enterprise and cloud
Meta is always looking for ways to stay afloat
If Spark beats Opus 4.6, why is Meta wasting money on Opus internally?
Just speculation; I have no real knowledge about it.
What could have been interesting has been reduced to simply another subpar LLM release.
"this is step one. bigger models are already in development with infrastructure scaling to match. private api preview open to select partners today, with plans to open-source future versions. incredibly proud of the MSL team. excited for what’s to come!"
The goal of public companies is generally to generate profit for their investors.
Love to see it. Cheers!
I Googled it and found absolutely nothing.
Well, to be honest, I got 100% of websites containing the French word "boîtier" (box) with a typo.
Even on Google Scholar, the closest match is "BioTiER (Biological Training in Education and Research) Scholars Program", which is at least 10 years old and has nothing to do with that.
Is that an AI-generated image with an AI-generated name that has no physical existence?
I suspect it is because they also refactored Meta AI entirely to use Next.js instead of the normal stack they use for literally everything else. Not sure why they would do this, but I guess it works (...or maybe not) for them.
If Meta wants to be seen as a cutting edge massive lab they need to come across as one instead of looking like a school project version of a frontier model.
(I'm not using it as I'm not agreeing to their ad terms).
I’m trying to decide if I find the doublespeak a bit offensive or not.
It nailed all the ChatGPT meme gotchas (walk to the carwash, Alice 50 brothers, upside down cup, R's in strawberry, which number is bigger, 9.11 or 9.9?)
I guess all that money poaching OpenAI / Anthropic talent went somewhere...
Now, would I use "Meta Muse Code" or "Muse CoWork" if I have to get a Facebook account for all of my developers? Maybe not.
Would I use it via an API key? I might, depends on the pricing!
We spent time yesterday arguing through an architecture decision. Today I ask the Agent to help implement it - it knows nothing about any of that. You’re effectively starting over.
Feels like the real problem isn’t intelligence, it’s continuity. And most benchmarks don’t even touch that.
I tried multiple riddles, graphs, and questions I know some LLMs fail at, but this one seems to do well. But I still don't have much trust in Meta after the scandal of them fiddling with their previous models to look good.
The same is true with any other model, unless otherwise stated.
In the next few days, we'll see who Meta has paid to promote this model on social media.
Also, I think people aren't used to the fact that using these models requires meta.ai or the Meta AI app.
The impressive part is multimodality, very plausible since there's less focus there by other labs (especially Anthropic)
> Think longer to solve harder problems
> Compress
> Think longer again
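A hypothetical sketch of that loop, just to show the control flow. Here `llm` and `compress` are stubs I made up, standing in for a real model call and a real "summarize your own reasoning" step:

```python
# Hypothetical control-flow sketch of a "think longer -> compress ->
# think longer again" loop. `llm` and `compress` are stubs; only the
# loop structure is the point.
def llm(prompt: str) -> str:
    # Stub: a real implementation would call a model endpoint here.
    return f"(thoughts on: {prompt[:30]})"

def compress(notes: str, budget: int) -> str:
    # Stub: a real implementation would ask the model to summarize its
    # own reasoning; here we just truncate to a character budget.
    return notes[:budget]

def think_compress_think(problem: str, rounds: int = 3, budget: int = 120) -> str:
    context = problem
    for _ in range(rounds):
        thoughts = llm(context)                                # think longer
        context = compress(context + "\n" + thoughts, budget)  # compress
    return context

final_context = think_compress_think("some hard problem")
print(final_context)
```

The point of the compress step is that the working context stays bounded no matter how many thinking rounds you run, at the cost of losing detail each cycle.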
I don't like that I need to log in to my FB/Instagram account to access this.
I doubt it's better than Opus, and even if it were, it's not worth the privacy concerns.
So does cooperation in any framework that values public good over pure obedience to an inherently-abusive late stage capitalism. I know that's passé in a world where the US government no longer believes in funding science, and yet.
Competition is also inherently wasteful. And if you're talking about wasting a few K or a few Mil here or there, fine, whatever. But here we're talking about waste on the order of trillions of dollars at the end of the day.
Not my loss; I'll keep using DeepSeek then. Wake me up when my country is no longer on the wrong/right side of history.
Edit: nvm I can't read, regular benchmarks against SOTA are there
Besides, I'm old enough to recall that Meta trained a version of Llama 4 specifically for LM Arena Elo benchmaxxing and PR purposes, and then proceeded to release a different version of Llama 4.
Maybe they need to mine more Libra coin first? Or is it Diem now? Is that even still part of Meta?
I'm sure this new AI is super intelligent and super awesome and will be writing all the code, making all the blog posts, and generating all our youtube shorts in 6 months.
yeah, the metaverse got abandoned. Also: Meta was the only one to try the concept for the past umpteen years, even though everyone in the industry goes ga-ga over virtual reality worlds and workplaces at every opportunity. It's literally Meta and Linden Lab (which has been on life support for 10+ years).
The alternative is: no one does it and nothing gets abandoned, which the industry has shown itself to be exceedingly good at w.r.t. VR for the past 40+ years.
To be clear: I have no faith in Meta as a company; my problem lies in kicking an entity because it attempted something different. I don't think that's productive, and it produces things like the past AI winters, because groups get afraid of ever touching experimental concepts again lest they incur the wrath of the shareholder.
We keep seeing things being overhyped, with not much thought behind them. Meta is particularly bad about it. They changed their name for the hype of their VR product when VR was still niche and had a long way to go, and still does. They couldn't even figure out legs for launch.
Now they have a 'superintelligence'? Yeah, that sounds like just the latest in a line of bullshit. Why would this be different?