A bug is a bug. A “potential vulnerability” is a bug. A vulnerability is verifiable as having security implications with a proof of concept or other substantial evidence.
Words matter. Bugs matter. It’s important to fix large amounts of bugs, just as it always has been, and has been done. Let that be impressive on its own, because it IS impressive.
Mythos didn’t write 271 PoC for vulnerabilities and demonstrate code path reachability with security implications. Mythos found 271 valid bugs. Let that be enough.
It's better because it actually lists a sample of Bugzilla reports that were made public. This topic was discussed previously (36 comments two weeks ago: https://news.ycombinator.com/item?id=47885042), but the part about bug reports being made public is brand new.
This isn’t sarcasm. Firefox deserves to be used more. Most people I know don’t use it because “Chrome does almost everything better”, and Firefox can’t compete with the other browsers’ roadmaps.
Firefox is written in several languages, only about 25% of it is in C++ but every single one of these issues seems to touch the C++.
I eventually left and wound up at Mozilla where there were a number of /* flawfinder ignore */ comments scattered throughout the code.
My guess is that Mythos just ignored the "flawfinder ignore" directives and reported the known vulnerabilities in the code.
I wonder if these models will get good + cheap enough so that people rarely reach for static analysis.
From what I understand, that is a recipe for getting quickly banned by commercial LLM providers?
Wired: Mozilla Used Anthropic's Mythos to Find and Fix 271 Bugs in Firefox (41 points, 18 comments) https://news.ycombinator.com/item?id=47853649
Ars: Mozilla: Anthropic's Mythos found 271 security vulnerabilities in Firefox 150 (33 points, 8 comments)https://news.ycombinator.com/item?id=47855384
I don't understand much of this paragraph:
* "a crank they can pull that says: ‘Yep, this has the problem,’": as in, ring an alarm? Does the LLM ring th alarm?
* "you can iterate on the code and know clearly when you’ve fixed it": Isn't that true of most bugs, assuming you do the normal thing and generate a test case? And I thought the LLM output test cases itself: "It will craft test cases. We have our existing fuzzing systems and tools to be able to run those tests" And are they claiming the LLM facilitates iterating?
* "and eventually land the test case in the tree": Don't you create the test case before the fix? And just a few words earlier they seemed to be working on the fix, not the test case. And see the prior point about test cases.
* "such that you don’t regress it.”: How is the LLM helping here?
Maybe I'm missing some fundamental unwritten assumption?
The zero-days are numbered
New tools find new bugs, but the oligarchy newspapers report on Mythos and not on clang-22.0.
I'm not only talking about big things like
* Pocket,
* several major UI redesigns and
* the offline translations,
but even tiny useless things like
* browser.urlbar.trimURLs,
* putting the search query in the URL bar instead of the URL after searching from the URL bar,
* messing with the Edit and Resend feature for no reason (the good one that updates the content length is still available at devtools.netmonitor.features.newEditAndResend) and
* probably thousands of little shit like this that took a bunch of developer hours to implement.
All of the above should've been add-ons.
And of course, we know Mozilla spends a lot of money on things unrelated to Firefox at all. It's amazing Firefox is somewhat secure and stable compared to Chrome, which is backed by Google with their infinitely deep pockets.
This is a web browser, after all. Something most people use all the time. Something that accepts untrusted input from thousands of sources every day. People use it pretty much every aspect of their lives - banking, personal communication, porn, expressing political opinions. It's used for viewing PDFs, playing media files, for interacting with a whole bunch of APIs (that IMO shouldn't be part of the web, but they are). Security should be top priority.