wrs 11mo ago (0 points)

A few weeks ago I gave an LLM (Gemini 2.5 something in Cursor) a bunch of examples of a new language, and asked it to write a recursive descent parser in Ruby. The language was nothing crazy, intentionally reminiscent of C/JS style, but certainly the exact definition was new. I didn’t want to use a parser generator because (a) I’d have to learn a new one for Ruby, and (b) I’ve always found it easier to generate useful error messages with a handwritten recursive descent parser.

IIRC, it went like this: I had it first write out the BNF based on the examples, and tweaked that a bit to match my intention. Then I had it write the lexer, and a bunch of tests for the lexer. I had it rewrite the lexer to use one big regex with named captures per token. Then I told it to write the parser. I told it to try again using a consistent style in the parser functions (when to do lookahead and how to do backtracking) and it rewrote it. I told it to write a bunch of parser tests, which I tweaked and refactored for readability (with LLM doing the grunt work). During this process it fixed most of its own bugs based on looking at failed tests.
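For anyone unfamiliar with the "one big regex with named captures per token" lexer style, here is a minimal sketch of the idea. It is in Python rather than the Ruby of the actual project, and the token names and patterns are assumptions made up for a generic C/JS-style language, not the real grammar.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # name of the capture group that matched
    text: str  # the matched source text
    pos: int   # character offset into the source


# Hypothetical token set. Order matters: earlier alternatives win.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"==|!=|<=|>=|[-+*/=<>]"),
    ("PUNCT", r"[(){};,]"),
    ("SKIP", r"\s+"),      # whitespace, dropped below
    ("MISMATCH", r"."),    # any other character is an error
]

# One big alternation, with one named group per token kind.
MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source: str) -> Iterator[Token]:
    for match in MASTER_RE.finditer(source):
        kind = match.lastgroup  # name of the group that matched
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        yield Token(kind, match.group(), match.start())
```

Keeping the character offset on each token is what makes useful error messages cheap later on in the parser.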

Throughout this process I had to monitor every step and fix the occasional stupidity and wrong turn, but it felt like using a power tool: you just have to keep it aimed the right way so it does what you want.

The end result worked just fine, the code is quite readable and maintainable, and I’ve continued with that codebase since. That was a day of work that would have taken me more like a week without the LLM. And there is no parser generator I’m aware of that starts with examples rather than a grammar.

Verdex 11mo ago

Thanks for giving details about your workflow. At least for me it helps a lot in these sorts of discussions.

Although, it is interesting to me that the original posting mentioned LLMs "one-shot"ing parsers and this description sounds like a much more in-depth process.

"And there is no parser generator [...] that starts with examples [...]"

People. People can generate parsers by starting with examples. Which, again, is more in line with the original "one-shot parsers" comment.

If people are finding LLMs useful as part of a process for parser generation then I'm glad. (And I mean, testing parsers is pretty painful to me, so I'm interested in the test case generation.) However, I'm much more interested in the existence or non-existence of one-shot parser generation.

steveklabnik 11mo ago

I recently did something similar, but different: gave Claude some code examples of a Rust-like language, it wrote a recursive descent parser for me. That was a one-shot, though it's a very simple language.

After more features were added, I decided I wanted BNF for it, so it went and wrote it all out correctly, after the fact, from the parser implementation.

Verdex 11mo ago

Can you give more info?

How big of a number is "some"?

Also what kind of prompts were you feeding it? Did you describe it as Rust like? Anything else you feel is relevant.

[Is there a GitHub link? I'm more than happy to do the detective work.]

steveklabnik 11mo ago

Like three or four. Very simple language: main function whose value is the error code, functions of one argument returning one value, only ints, basic control flow and math.

I just opened the repo, here's the commit that did what I'm talking about: https://github.com/steveklabnik/rue/commit/5742e7921f241368e...

Well, the second part anyway, with the grammar. The lexer it wrote starts at https://github.com/steveklabnik/rue/commit/a9bce389ea358365f...; it was basically this program.

If I wrote down the prompts, I'd share them, but I didn't.

Please ignore the large amount of LLM bullshit in here. Since it was private while I did this, I wasn't really worried about how annoying and slightly wrong the README etc. were. HEAD is better in that regard.

wrs (OP) 11mo ago

I guess I don't really understand the goal of "one-shot" parser generation, since I can't even do that as a human using a parser generator! There's always an iterative process, as I find out how the language I wanted isn't quite the language I defined. Having somebody or something else write tests actually helps with that problem, as it'll exercise grammar cases outside my mental happy path.

Verdex 11mo ago

The comment that started this whole thread off mentioned LLMs one-shotting parsers. I didn't think an LLM could one-shot a parser, and I am interested in parsers, which is why I asked for more info.

It's not a goal of mine, but because of my interest in parsing I wanted to know if this was something that was happening or if it was hyperbole.

hombre_fatal 10mo ago

By one-shot in my original post, I mean that it comes up with a decent, working implementation that I can then refine.

As opposed to getting there through incremental revisions where I must dictate the directions the LLM takes -- I don't necessarily know the directions it should take because I might be agnostic about it.

I find success in breaking the problem down into tokenize and parse just like I do when writing parsers myself.

    tokenize(string) -> Token[]
    parse(Token[]) -> Node[]

And definitely get it to generate tests every step of the way. The insane part is when the LLM can run tests itself and can iterate on code until all tests pass.

Or you extend it with a feature just by writing some failing tests and it does the rest.
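To make the tokenize/parse split concrete, here is a rough sketch in Python. The toy language (integer arithmetic with parentheses), the `Num`/`BinOp` node types and the `Parser` class are all assumptions invented for illustration, and `parse` returns a single root node rather than `Node[]` to keep it short.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, BinOp]


def tokenize(source: str) -> List[str]:
    # Integers, operators and parens. In this sketch, any other
    # character (including whitespace) is silently skipped.
    return re.findall(r"\d+|[-+*/()]", source)


class Parser:
    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self) -> Optional[str]:
        token = self.peek()
        self.pos += 1
        return token

    def expr(self) -> Node:
        # expr := term (("+" | "-") term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = BinOp(op, node, self.term())
        return node

    def term(self) -> Node:
        # term := atom (("*" | "/") atom)*
        node = self.atom()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = BinOp(op, node, self.atom())
        return node

    def atom(self) -> Node:
        # atom := NUMBER | "(" expr ")"
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isdigit():
            return Num(int(token))
        raise SyntaxError(f"unexpected token {token!r}")


def parse(tokens: List[str]) -> Node:
    parser = Parser(tokens)
    node = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return node
```

Because the nodes are dataclasses, a test per feature is just an equality check on the tree, e.g. asserting that `parse(tokenize("1 + 2 * 3"))` nests the multiplication under the addition. A failing assertion of that kind is the sort of thing you can hand over and let it iterate on.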

wrs (OP) 11mo ago

Well, I mean, it sort of did one-shot the parser in my case (with a few bugs, of course). It just didn't one-shot the parser I wanted, largely because my definition was unclear. It would be interesting to see how it did if I went to the trouble of giving it a truly rigorous prompt.
