Furthermore, the "Core" language is, I believe, for many purposes, more complicated than necessary (e.g., it is typed). It would be nice if there were a simpler alternative, e.g., for small projects.
Of course, I could be wrong about this (perhaps I looked in the wrong places), but this is just what I noticed.
Besides this, of course, Haskell is a cool language, and GHC an awesome compiler! :)
The 1992 SPJ paper http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=70D... (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.3...)
https://www.google.fr/search?q=spj+ghc+stg -> http://lambda.jstolarek.com/2013/06/getting-friendly-with-st...
http://research.microsoft.com/en-us/um/people/simonpj/papers...
An untyped representation is "just code", and that way lies madness. Bugs at least.
We don't really emphasize or support people using the intermediate languages externally very well. They're parts of the compiler that are intimately and deeply tied to other internals elsewhere, and it's not really an explicit design goal that those components be reusable in a general way.
Now, people do use GHC for this (years ago I even helped write a compiler that transformed GHC's core language into a whole-program IR that was compiled to C), and we do have an API you can use to leverage the compiler, but overall the number of people using it for things like this, versus using it for things like dynamic loaders or typechecking utilities (such as ghc-mod), is very small.
In fact, just last year we removed 'External Core' from GHC, which was a way of serializing the Core representation to disk. Why? Because it was actually broken for close to two years, I think, and nobody ever really complained, fixed it, or wanted to support it! And after a discussion, we didn't want to support it either. It has been used, but when it bitrots that badly, I think it's clear this isn't one of the largest driving use cases for the compiler.
That said, we could improve the documentation a lot, and add a lot more examples. But I don't think there ever has been (or probably will be) a huge push to make the core IRs reusable in an easy way. It's simply not a high priority design goal.
> Furthermore, the "Core" language is, I believe, for many purposes, more complicated than necessary (e.g., it is typed). It would be nice if there were a simpler alternative, e.g., for small projects.
The Core language being typed is a good thing for GHC: it makes it easy to turn on an internal typechecker and determine whether the compiler, or your own optimization pass, has produced invalid IR. It adds a lot of safety to your produced programs when you can ensure they type-check in a sane way.
This component, along with the Core linting passes, has likely caught an _innumerable_ number of optimization and desugaring bugs over the years, while being extremely low cost to support and use. Overall, typed Core has been a huge win for GHC, and I don't ever see us adopting a different route. In fact, it's planned for our Core language to get an even fancier (dependent) type system not far off in the future. :)
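To make the "lint the IR after every pass" idea concrete, here is a minimal sketch in Haskell. It is a toy: `Ty`, `Expr`, `lint`, `checkedPass`, `foldIf` and `buggyPass` are all made-up names for illustration, and none of this is GHC's actual Core data type or its real lint pass. It only shows why a typed IR lets you cheaply detect a transformation that produced something invalid.

```haskell
-- A toy typed IR (hypothetical; NOT GHC's real Core).
data Ty = TInt | TBool | TFun Ty Ty
  deriving (Eq, Show)

data Expr
  = Var String
  | LitInt Integer
  | LitBool Bool
  | Lam String Ty Expr   -- the binder carries its type, as in a typed IR
  | App Expr Expr
  | If Expr Expr Expr
  deriving (Eq, Show)

type Env = [(String, Ty)]

-- "Lint": recompute the type of an expression, or say why it is ill-typed.
lint :: Env -> Expr -> Either String Ty
lint env (Var x) =
  maybe (Left ("unbound variable: " ++ x)) Right (lookup x env)
lint _ (LitInt _) = Right TInt
lint _ (LitBool _) = Right TBool
lint env (Lam x t body) = TFun t <$> lint ((x, t) : env) body
lint env (App f a) = do
  tf <- lint env f
  ta <- lint env a
  case tf of
    TFun arg res
      | arg == ta -> Right res
      | otherwise ->
          Left ("argument mismatch: expected " ++ show arg ++ ", got " ++ show ta)
    _ -> Left ("applying a non-function of type " ++ show tf)
lint env (If c t e) = do
  tc <- lint env c
  tt <- lint env t
  te <- lint env e
  if tc /= TBool
    then Left ("condition has type " ++ show tc)
    else if tt /= te
      then Left "branches have different types"
      else Right tt

-- Lint before and after a pass; reject the result if lint fails
-- or if the pass changed the type of the program.
checkedPass :: (Expr -> Expr) -> Expr -> Either String Expr
checkedPass pass e = do
  before <- lint [] e
  let e' = pass e
  after <- lint [] e'
  if before == after
    then Right e'
    else Left ("pass changed type from " ++ show before ++ " to " ++ show after)

-- A correct pass: fold `If` when the condition is a literal.
foldIf :: Expr -> Expr
foldIf (If (LitBool True) t _) = foldIf t
foldIf (If (LitBool False) _ e) = foldIf e
foldIf (If c t e) = If (foldIf c) (foldIf t) (foldIf e)
foldIf (Lam x t b) = Lam x t (foldIf b)
foldIf (App f a) = App (foldIf f) (foldIf a)
foldIf e = e

-- A deliberately buggy pass: replaces every `If` with its condition.
buggyPass :: Expr -> Expr
buggyPass (If c _ _) = c
buggyPass (Lam x t b) = Lam x t (buggyPass b)
buggyPass (App f a) = App (buggyPass f) (buggyPass a)
buggyPass e = e

-- (\x:Int -> if True then x else 0) 42
example :: Expr
example =
  App (Lam "x" TInt (If (LitBool True) (Var "x") (LitInt 0))) (LitInt 42)
```

Loaded in GHCi, `checkedPass foldIf example` should give `Right (App (Lam "x" TInt (Var "x")) (LitInt 42))`, while `checkedPass buggyPass example` should give `Left "pass changed type from TInt to TBool"`. With an untyped IR, the buggy pass would go through silently and only show up later as a miscompiled program.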
I don't think an external format (serialized Core) is even necessary. Just a well-documented API would be nice. I'm hoping that somebody could write a set of examples, and put those in a test-suite so that it keeps getting updated whenever something breaks it.
I also agree that typing is extremely useful. But for small research projects, it can get in the way of the actual goal.
While I think real understanding will come if/when I have to actually wrestle with the lower layers of GHC (or am contributing to the source code or something), this served as a great overview, and a quick explanation of how Haskell "really" works.
An introduction to GHC which I found helpful is its AOSA chapter: http://www.aosabook.org/en/ghc.html