But to bring all of those things together and translate the concepts into working Python code is astonishing. We have just forgotten that a year ago, this achievement would have blown our minds.
I recently had to write an email to my kid’s school so that he could get some more support for a learning disability. I fed Claude 3 Opus a copy of his 35 page psychometric testing report along with a couple of his recent report cards and asked it to draft the email for me, making reference to things in the three documents provided. I also suggested it pay special attention to one of the testing results.
The first email draft was ready to send. Sure, I tweaked a thing or two, but this saved me half an hour of digging through dense material written by a psychologist. After verifying that there were no factual errors, I hit “Send.” To me, it’s still magic.
1. You're 100% right, there are privacy concerns.
2. I don't know if they could possibly be worse than the majority of school districts (including my kids') running directly off of Google's Education system (Chromebooks, Google Docs, Gmail, etc.).
Could you enroll your child under a fake name? How messed up would they think that is :D
"We will not use your Inputs or Outputs to train our models, unless: (1) your conversations are flagged for Trust & Safety review (in which case we may use or analyze them to improve our ability to detect and enforce our Acceptable Use Policy, including training models for use by our Trust and Safety team, consistent with Anthropic’s safety mission), or (2) you’ve explicitly reported the materials to us (for example via our feedback mechanisms), or (3) by otherwise explicitly opting in to training."
As for defending against a data breach, Anthropic hired a former Google engineer, Jason Clinton, as CISO. I couldn't find much information about the relevant experience at Google that may have made him a good candidate for this role, but people with a key role in security at large organizations often don't advertise this fact on their LinkedIn profiles as it makes them a target. Once you're the CISO, the target appears, but that's what the big money is for.
They take whatever it spits out in the first attempt. And then they go on to extrapolate from this to draw all kinds of conclusions. They forget the output it generated is based on a random seed. A new attempt (with a new seed) is going to give a totally different answer.
If the author had retried that prompt, the new attempt might have generated better code or might have generated a lot worse code. You cannot draw conclusions from just one answer.
Once that oversight was pointed out, it did write a decent fuzzer that found the memory safety bugs in do_read and do_write. I also got it to fix those two bugs automatically (by providing it the ASAN output).
Totally different...I'd posit 5% different, and mostly in trivialities.
It's worth doing an experiment and prompting an LLM with a coding question twice, then seeing how different it is.
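A rough way to put a number on that experiment, as a minimal sketch: compare the two responses token by token with Python's difflib. The two response strings below are made-up placeholders, not real model output, and whitespace tokenization is just one assumption about how to measure "different".

```python
import difflib


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two responses.

    Tokens are whitespace-separated words; this is one crude choice
    among many ways to measure how much two answers share.
    """
    matcher = difflib.SequenceMatcher(None, a.split(), b.split(), autojunk=False)
    return matcher.ratio()


# Hypothetical placeholders: paste two real responses to the same prompt here.
response_a = """def add(a, b):
    return a + b
"""
response_b = """def add(x, y):
    return x + y
"""

score = similarity(response_a, response_b)
print(f"token-level similarity: {score:.2f}")
```

A ratio near 1.0 means the two attempts are nearly identical at the token level; it says nothing about whether one of them is actually more correct than the other.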
For, say, a K-Means clustering algorithm, you're absolutely correct. The initial state is _completely_ dependent on the choice of seed.
With LLMs, the initial state is your prompt + a seed. The prompt massively overwhelms the seed. Then, the nature of the model, predicting probabilities, then the nature of sampling, attempting to minimize surprise, means there's a powerful forcing function towards answers that share much in common. This is both in theory, and I think you'll see, in practice.
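To illustrate the "prompt overwhelms the seed" point for a single decoding step, here is a toy sketch. The logits are invented for illustration and do not come from any real model; the only claim is that when the distribution is sharply peaked, changing the seed rarely changes the sampled token.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities (numerically stable)."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(probs, seed):
    """Sample one token index from probs using a seed-specific RNG."""
    rng = random.Random(seed)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Made-up logits for one step: the prompt strongly favors token 0.
logits = [8.0, 2.0, 1.0, 0.5]
probs = softmax(logits)

# Same distribution, 1000 different seeds.
picks = [sample_token(probs, seed) for seed in range(1000)]
share = picks.count(0) / len(picks)
print(f"token 0 chosen for {share:.1%} of seeds")
```

Real generations chain many such steps, and flatter distributions or higher temperatures give the seed more room, so this is a sketch of the argument rather than a measurement.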
E.g. the LLM might have said, for some reason, that writing a fuzzer like this isn't possible, and then gone on to present some alternatives for the given task.
I only have experience with GPT-4 via the API, but I believe that at their core all these LLMs work the same way.
My pushback is limited to this: the theoretical maximal degenerate behavior described in either of your comments is highly improbable in practice, given reasonable parameters and a reasonable model.
I.e. it will not
- give totally different answers due to seed changing.
- say it is impossible X% of the time, where X > 5, and provide some solution the other (100 - X)% of the time.
I have integrated with GPT3.0/GPT3.5/GPT4 and revisions thereof via API, as well as Claude 2 and, this week, Claude 3. I wrote a native inference solution that runs, among others, StableLM Zephyr 3B, Mistral 7B, and Mixtral 8x7B, and I wrote code that does inference, step by excruciating step, in a loop, on web via WASM, and via C++, with tailored solutions for Android, iOS, macOS, and Windows.
Logically LLMs should be quite good at creating the fuzzing data.
To state the obvious reason why: it's too expensive to use LLMs directly, and this way works, since they found "4 memory safety bugs and one hang".
But the future we are heading to should be one where LLMs directly pentest/test the code. This is where it's interesting and new.
Then again, this whole approach to fuzzing comes across as kinda naive. At the very least you'd want to use the API of a coverage-guided fuzzer for generating the randomness (and then almost always fix up the CRC32 on top of that, like a human-written wrapper function would).
That way you get an efficient, robust coverage-driven fuzzing engine, rather than having an LLM try to reinvent the wheel on that part of the code poorly. Let the LLM help write the boilerplate code for you.
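For a sense of what such a wrapper does, here is a minimal sketch in Python. It assumes a hypothetical file format whose last four bytes are a little-endian CRC32 of everything before them; real formats differ in where the checksum lives and what it covers, and a real harness for a C parser would more likely be written in C or C++. The function names are made up for illustration.

```python
import struct
import zlib


def fix_up_crc32(payload: bytes) -> bytes:
    """Append a little-endian CRC32 of payload.

    Assumed layout (hypothetical format): <payload><4-byte CRC32>.
    """
    checksum = zlib.crc32(payload) & 0xFFFFFFFF
    return payload + struct.pack("<I", checksum)


def checksum_ok(blob: bytes) -> bool:
    """Mimic the parser-side check that would reject unfixed inputs."""
    if len(blob) < 4:
        return False
    payload, trailer = blob[:-4], blob[-4:]
    expected = zlib.crc32(payload) & 0xFFFFFFFF
    return struct.unpack("<I", trailer)[0] == expected


# Stand-in for bytes handed over by a coverage-guided fuzzing engine.
mutated = b"\x00\xffsome mutated bytes"
fixed = fix_up_crc32(mutated)
print(checksum_ok(fixed))
```

The fuzzing engine stays in charge of mutation and coverage tracking; the wrapper only repairs the checksum so that mutated inputs get past the integrity check and reach the deeper parsing code.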
These are still pretty small scale experiments on essentially toy programs, so it remains to be seen if LLMs remain useful on real world programs, but so far it looks pretty promising – and it's a lot less work than writing a new libfuzzer target, especially when the program is one that's not set up with nice in-memory APIs (e.g., that GIF decoder program just uses read() calls distributed all over the program; it would be fairly painful to refactor it to play nicely with libfuzzer).
This approach could likely also be combined with RL; the code coverage provides a decent reward signal.
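One simple way to read "coverage as a reward signal", sketched under assumptions: reward each generated input by the number of coverage points it reaches that no earlier input reached. The integer IDs below are hypothetical stand-ins for whatever a real coverage tool reports (lines, edges, basic blocks), and nothing here is tied to a particular RL setup.

```python
def coverage_reward(seen: set, covered_by_input: set) -> int:
    """Return how many coverage points this input reached for the first time.

    `seen` is updated in place so later inputs are only rewarded
    for coverage that is still new.
    """
    new_points = covered_by_input - seen
    seen |= new_points
    return len(new_points)


# Hypothetical coverage IDs for three generated inputs.
seen_so_far = set()
rewards = [
    coverage_reward(seen_so_far, {1, 2, 3}),
    coverage_reward(seen_so_far, {2, 3, 4}),
    coverage_reward(seen_so_far, {1, 4}),
]
print(rewards)
```

Rewarding only new coverage pushes the generator toward unexplored code paths instead of repeating inputs that already work.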
It's less academically pure, but as an engineer who wants to fix bugs, it seems OK.
Another test to check how much seeing the actual parser code helps is to have it generate a GIF fuzzer without giving it the code:
https://twitter.com/moyix/status/1766135426476064774
And finally, for fun, we can see how it does when we give it the RFC for GIF89a: