can you show me a parser generator that produces this kind of visualization?
Whenever my "compiler" found a syntax error in the test suite, I could load the part of the source around the error and track down where my parser's bug or omission was by running the parser for smaller and smaller parts of the grammar on smaller and smaller parts of the input.
It was 12 years ago.
And yes, it is fun. ;)
However (and this is just me talking), I don't see the point in a javascript-based compiler. Surely any file format/DSL/programming language you write will be parsed server-side?
JavaScript is a full programming language. Why wouldn't it be a fine choice to write a compiler in? People have a funny idea that compilers are unusually complex software, or somehow low-level. In reality they're conceptually simple: as long as your language lets you write a function from one array of bytes to another array of bytes, you can write a compiler in it. For practicalities beyond that you just need basic records or objects or some other kind of structure, and you can have a pleasant experience writing a compiler.
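To make that "function from bytes to bytes" claim concrete, here's a minimal sketch (the toy source language and the made-up stack-machine target are my own invention, not from any real project): a complete "compiler" from an addition language to assembly-ish text, in a few lines of JavaScript.

```javascript
// A toy compiler: source text in, target text out.
// Source language: integer addition, e.g. "1 + 2 + 3".
// Target language: a made-up stack machine with PUSH and ADD.
function compile(source) {
  const operands = source.split("+").map((s) => s.trim());
  // Emit one PUSH per operand...
  const code = operands.map((n) => `PUSH ${n}`);
  // ...then enough ADDs to fold the stack down to one value.
  for (let i = 1; i < operands.length; i++) code.push("ADD");
  return code.join("\n");
}

console.log(compile("1 + 2 + 3"));
// PUSH 1
// PUSH 2
// PUSH 3
// ADD
// ADD
```

A real compiler adds a tokenizer, a tree, and error reporting, but the shape stays the same: a pure function from one string to another, which any language can express.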
> Surely any file format/DSL/programming language you write will be parsed server-side?
JavaScript can be used user-side, or anywhere else. It's just a regular programming language.
TypeScript, Sass, JSX... there are a lot of languages running on top of JS. Or you might want to do colorizing or autoformatting on input in the browser?
Along with all that, there are, as mentioned, Node.js and Deno for running server-side.
But at any rate - lots of front-end problems involve various kinds of parsing/validation and transformation (eg: processing.js).
JavaScript doesn't seem suited to compiler construction because it lacks many of the features that make compiler construction pleasant (e.g. strong, rich types, algebraic data types, etc.)
It might be "fine" but it's not "good".
“If I send someone an executable, they will never download it. If I send them a URL, they have no excuse.”
If someone interested in a compiler doesn't download it, that's not an excuse, it's a filter. Or a warning sign.
The choice of language often matters a lot less than how familiar you are with it (and its ecosystem(s)). I think it's totally reasonable to want to use JS for a compiler in, e.g., a Node project if for no other reason than to not have to learn too many extra things at once to be productive with the new tool.
I also don't think it's fair to assume everything will be parsed, tokenized, etc server-side. Even assuming that data originates server-side (since if it didn't you very well might have a compelling case for handling it client-side if for no other reason than latency), it's moderately popular nowadays to serve a basically static site describing a bunch of dynamic things for the frontend to do. Doing so can make it easier/cheaper to hit any given SLA at the cost of making your site unusable for underpowered clients and pushing those costs to your users, and that tradeoff isn't suitable everywhere, but it does exist.
It's interesting that you seem to implicitly assume the only reason somebody would choose JS is that they're writing frontend code. It's personally not my first choice for most things, but it's not too hard to imagine that some aspect of JS (e.g., npm) might make it a top contender for a particular project despite its other flaws and tradeoffs.
But I’m standing my ground because I’m not even writing a proper “compiler” - in my case, the output is JSON. So it just kinda feels like it makes sense to stick with JS.
(and you can always decide that you need more speed - if you have a grammar defined, it's almost trivial to feed it to some other parser-generator)
Well, JavaScript has been heavily used on the server side for over a decade, with Node, WASM and other projects.
And as far as raw speed goes, something like V8 smokes all scripting languages bar maybe LuaJIT.
So, there's that...
There is a world of difference in accessibility between a tool that requires installation and a tool that you can use by following a hyperlink.
My CC is Javascript based (well it was initially, then TypeScript, now a lot of it is written in itself).
99% of the time I use the actual languages I make in it server side (nodejs), but I am able to develop the languages in my browser using https://jtree.treenotation.org/designer/. It's super easy and fun (at least for me, UX sucks for most people at the moment). There's something somewhat magical about being able to tweak a language from my iPhone and then send the new lang to someone via text. (Warning: Designer is still hard to use and a big refresh is overdue).
It works great for our use-case though I have been eyeing tree-sitter[2] for its ability to do partial parses.
[1] USFM: https://ubsicap.github.io/usfm/ [2] https://tree-sitter.github.io/tree-sitter/
Don’t remember anything about an office suite. Related names I remember are Alan Kay, Dan Amelang, Alessandro Warth and Ian Piumarta.
https://en.m.wikipedia.org/wiki/Ometa (including reference section)
Or go to: http://www.vpri.org/writings.php
If I recall correctly you want: "STEPS Toward the Reinvention of Programming, 2012 Final Report Submitted to the National Science Foundation (NSF) October 2012" (and earlier reports)
Discussed on hn: https://news.ycombinator.com/item?id=11686325
And: https://news.ycombinator.com/item?id=585360
Notable for implementing TCP/IP by parsing the RFC.
"A Tiny TCP/IP Using Non-deterministic Parsing Principal Researcher: Ian Piumarta
For many reasons this has been on our list as a prime target for extreme reduction. (...) See Appendix E for a more complete explanation of how this “Tiny TCP” was realized in well under 200 lines of code, including the definitions of the languages for decoding header format and for controlling the flow of packets."
(...)
"Appendix E: Extended Example: A Tiny TCP/IP Done as a Parser (by Ian Piumarta) Elevating syntax to a 'first-class citizen' of the programmer's toolset suggests some unusually expressive alternatives to complex, repetitive, opaque and/or error-prone code. Network protocols are a perfect example of the clumsiness of traditional programming languages obfuscating the simplicity of the protocols and the internal structure of the packets they exchange. We thought it would be instructive to see just how transparent we could make a simple TCP/IP implementation. Our first task is to describe the format of network packets. Perfectly good descriptions already exist in the various IETF Requests For Comments (RFCs) in the form of "ASCII-art diagrams". This form was probably chosen because the structure of a packet is immediately obvious just from glancing at the pictogram. For example:
+-------------+-------------+-------------------------+----------+----------------------------------------+
| 00 01 02 03 | 04 05 06 07 | 08 09 10 11 12 13 14 15 | 16 17 18 | 19 20 21 22 23 24 25 26 27 28 29 30 31 |
+-------------+-------------+-------------------------+----------+----------------------------------------+
| version | headerSize | typeOfService | length |
+-------------+-------------+-------------------------+----------+----------------------------------------+
| identification | flags | offset |
+---------------------------+-------------------------+----------+----------------------------------------+
| timeToLive | protocol | checksum |
+---------------------------+-------------------------+---------------------------------------------------+
| sourceAddress |
+---------------------------------------------------------------------------------------------------------+
| destinationAddress |
+---------------------------------------------------------------------------------------------------------+
If we teach our programming language to recognize pictograms as definitions of accessors for bit fields within structures, our program is the clearest of its own meaning. The following expression creates an IS grammar that describes ASCII art diagrams."

I was disappointed with how they do operator precedence; they use the usual trick to make a PEG do operator precedence, which looks cool when you apply it to two levels of precedence, but if you tried to implement C or Python with it, it gets unwieldy. Most of your AST winds up being nodes that exist just to force precedence in your grammar, and working with that AST is a mess.
For all the horrors of the Bell C compilers, having an explicit numeric precedence for operators was a feature in yacc that newer parser generators often don't have.
I worked out the math, and it is totally possible to add a stage that adds the nodes to a PEG to make numeric precedence work and also deletes the fake nodes from the parsed AST. Unparsing I'm not so sure about, since if someone wrote
int a = (b + c);
how badly you want to keep the parens is up to you; a system like that MUST have an unparse-parse identity in terms of the 'value of the expression', but for software engineering automation you want to keep the text of the source code as stable as you can.

> You can use it to parse custom file formats or quickly build parsers, interpreters, and compilers for programming languages.
ident ::= name | name ("." name)+
Because with PEGs, the parser tries the first alternative, then the second, and since whenever the second alternative matches, the first one matches too, we will never parse the second alternative. That's kinda annoying. Of course, with PEG tools you could probably solve this by computing the first sets of both alternatives and noticing that they're the same. Hopefully that's what this tool does.
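A hand-rolled sketch of the trap (my own toy matcher, not this tool's API): the first alternative of `ident ::= name | name ("." name)+` always wins, so the dotted form is dead code.

```javascript
// Returns the length matched by `name` at the start of s, or -1.
const name = (s) => {
  const m = /^[a-zA-Z_]\w*/.exec(s);
  return m ? m[0].length : -1;
};

// ident ::= name | name ("." name)+
// PEG ordered choice: alternative 1 is tried first and commits.
function identAsWritten(s) {
  const n = name(s);
  if (n >= 0) return s.slice(0, n); // alt 1 always wins...
  return null; // ...so alt 2 (which needs a leading name too) is dead
}

// Reordering -- or just writing name ("." name)* -- fixes it.
function identFixed(s) {
  const n = name(s);
  if (n < 0) return null;
  let len = n;
  while (s[len] === ".") {
    const m = name(s.slice(len + 1));
    if (m < 0) break;
    len += 1 + m;
  }
  return s.slice(0, len);
}

console.log(identAsWritten("foo.bar")); // "foo"  -- dotted tail ignored
console.log(identFixed("foo.bar"));     // "foo.bar"
```

In a CFG-based tool the two alternatives would be reported as ambiguous or unified; in a PEG the longer one silently never fires, which is exactly why a first-set check would be a nice diagnostic.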
https://github.com/harc/ohm/commit/4611bf63c5ecb90d782112d68...
2014
Neat tool. I write parsers by hand though. More fun, and you can be a lot sleazier.
Now Ohm survives as an open-source project, Bret Victor continues his work with Dynamicland, and Vi Hart is currently employed at Microsoft Research.
We detached this subthread from https://news.ycombinator.com/item?id=26604134.
http://www.kylheku.com/cgit/txr/tree/share/txr/stdlib/optimi...
The type is fine whether or not the line is present. It's all about that invariant.
None of the hair-pulling I've experienced in compiler debugging had anything even remotely to do with types, which are something flushed out by testing.
Whenever I'm doing anything, like an optimization test case, I put in print statements during development to see that it's being called and what it's doing. You'd never add a new case to a compiler that you never tested; just from the sheer psychology of it, too much work goes into it to then not bother running it. Plus there's the curiosity of seeing how often the case fires over a corpus of code.
Help! That's what I did. I chose to write my compiler in OCaml, a language that's already ~30 years old by now. But I cannot find any type annotations! What should I do? I'm stuck!
Lisp is one of the best compiler implementation languages. Doing the same in C or C++ is about 3-20x more effort.
There's nothing magical about Lisp that makes it super fit for compiler development.