Why Concatenative Programming Matters (2012) (opens in new tab)

(evincarofautumn.blogspot.mx)

149 pointsjasondenizac13y ago36 comments

36 comments

33 comments · 11 top-level

Scaevolus13y ago· 15 in thread

The elegant minimalism of Forth is inspiring-- how many other languages can fit their REPL and development environment into a few kilobytes?

However, I find implicit arity to be the largest barrier in reading concatenative programs.

To understand what a line of Forth code does, you need to understand both what each function does, and how it manipulates the stack to accomplish it. That's more memory pressure than an imperative language like C, where the flow of data and computations is obvious. It's exacerbated by the tendency to factor a single function into many simpler ones.

In a team, the increased memory burden also requires more communication for shared understanding of code.

Many concatenative languages eventually grow support for local variables, since it's _incredibly_ awkward to implement certain algorithms when you have to do stack juggling to use a variable in a computation.

evincarofautumn13y ago

This is true, but it’s also important to realise that Forth is not the only direction. ;)

It’s been a while since I wrote this article, and there are a number of things it doesn’t address, such as the practical concerns you mention. It’s much the same issue as comes up when dealing with large assembly projects, or dynamically typed ones for that matter. Knowing what’s going on can be difficult to discern without a deep understanding of the overall system, and bugs have a tendency to hide in obscure places.

And speaking of assembly, Forth is essentially assembly code for a virtual stack machine. Or an actual stack machine.

I’ve been idly working on a statically typed concatenative language for a long time now, partly because I intend it to be taken seriously sometime in the next decade, but mostly as a learning experience. It’s designed to address many of these concerns—particularly, enabling programmers to focus on data flow by framing it as a functional language. (E.g.: don’t shuffle stacks—use combinators and/or locals.)

However, I continually run into problems with the type system. Typing of concatenative languages is not well researched and is surprisingly difficult. I’m not a type theorist by any means, but at this point I’m probably the world expert on concatenative type systems, simply because there are so few people working on the problem. If anyone would like to help out, I’d love to chat by email (username@gmail).

chipsy13y ago

How you looked at J. Paul Morrison's "Flow Based Programming" book? I recommend it since it touches upon a lot of similar topics. As your article points out, the concatenative style can be visualized as a flow; in the Morrison book flow is expanded upon with a more detailed, explicit connectivity system and an emphasis on visual language.

Regardless of syntax this is a very interesting language design space, since it seems to hold some of the most promise for lowering average software complexity.

qznc13y ago

Cat is concatenative and statically-typed. http://www.cat-language.com/

1 more reply

abecedarius13y ago

Did you know Robin Popplestone designed a type system for Forth? It involved what I think he called 'regular types' (like regular expressions), IIRC, to handle (disciplined use of) words like ?DUP.

jmmcd13y ago

It's a very interesting article. I wonder if you've heard of Push (a stack language) and PushGP (an evolutionary automatic programming system for which Push is designed)?

http://faculty.hampshire.edu/lspector/push.html

bun-neh13y ago

You should look in to Factor, a higher level concatenative language.. http://factorcode.org/

There is a package for lexical variables for those instances where stack shuffling is absolutely not the way to implement your algorithm. This page http://docs.factorcode.org/content/article-locals-examples.h... compares the stack shuffling vs lexical variable approaches. Oh, and those lexical variable have essentially no computational overhead.

I disagree that the tendency to factor a single function into many simpler ones exacerbates the understandability of a function. If done properly you get fantastically readable functions, such as this gem from https://github.com/JohnEarnest/Mako/blob/master/games/Deep/D... :

    move-player fire waves bounce think draw-score spawn-crab storm

This snippet of a high level function is self describing. 1) Check to see if movement keyboard entry is pressed and move the player 2) check to see bomb keyboard entry is pressed and move an existing bomb 3) simulate the ocean waves etc...

Factoring gives you a chance to mitigate complexities in stack shuffling and allows you to build a semantically meaningful representation of your problem. Concatenative languages lend themselves to creating DSLs; every solution you write should be a small DSL for that particular problem or subproblem.

kenko13y ago

This snippet reads to me as if it says:

(1) move the player unconditionally (2) do something involve fire, or perhaps involving firing, unconditionally (3) maybe the player waves? maybe there are waves? (4) ??? (5) ??? (6) update the score (7) create a new crab unconditionally (8) storm the fortress? there is a storm?

Nothing is self-describing absent context. (I have sufficient context for "draw-score" and "spawn-crab" that I don't think they mean, say, "check scores to see if there's a draw" and "create a crab of the 'spawn' type".)

Someone13y ago

"This snippet of a high level function is self describing. 1) Check to see if movement keyboard entry is pressed and move the player"

I disagree. I do not see anything there that can even remotely be explained to imply the 'check to see if movement keyboard entry is pressed' part. Something like 'handle-player-movement' would be better.

Similarly, 'fire' could fire a gun, simulate fire in the way 'waves' (apparently) simulates waves, etc.

raganwald13y ago

It's exacerbated by the tendency to factor a single function into many simpler ones.

That's an interesting comment. I often wonder about where the "sweet spot" of expression lies. If you break things up into too many small things, it's confusing. But a huge monolithic entity is also confusing. What's the "right level of decomposition?"

jacquesm13y ago

I think this is closely related to why 'naming things' is hard. Small functions do one thing and are easy to name, larger things do more than one thing and are harder to name. But very small functions, smaller than useful leads to harder names as well!

This does not directly answer your question but I think a good sign that you're on the wrong track is when naming things gets harder, that's when you've decomposed too far.

2 more replies

fizx13y ago

I've always thought of a hyper-abstracted codebase being like a binary tree, lazy code being like a linked list, and optimal code being like a B-tree tuned for the page size and seek latency of your particular brain.

chongli13y ago

>If you break things up into too many small things, it's confusing.

Not if you're disciplined about it.

>What's the "right level of decomposition?"

The key is not to decompose your whole problem into a network of tightly coupled parts. It's far better to build up your language with layers of abstraction until you have a new language which makes your problem trivial to express.

protomyth13y ago

I think a decent IDE with some preprocessing of words might solve the Forth-family stack problem. A lot of Forth words are deterministic about how they use the stack (no looping). It would be interesting to have an IDE show a stack usage diagram for each word. If it cannot determine than take it from the comment or put up a ?.

Postscript has support for local variables and I have found it to be a nicer language than most Forths.

disclaimer: I can read Forth / Postscript easily, but for some reason Lisp always messes with me. I also don't see much of a difference between Forth words, API calls, and domain specific languages. They are all of a piece.

GhotiFish13y ago

>To understand what a line of Forth code does, you need to understand both what each function does, and how it manipulates the stack to accomplish it.

I don't understand this. I don't need to know how add works to know it pops two bytes off the stack and pushes them added together. Likewise for my own functions. I don't need to know how they do it, just what they do.

eurleif13y ago

I think the point was that you can't tell if `foo bar` is equivalent to `bar(foo())` or `bar(foo(value_from_stack))` or `bar(foo(val1, val2))` or `foo(); bar()` or what unless you know the stack effect of the words.

michaelfeathers13y ago· 2 in thread

The only thing that bothers me about the concatenative languages I've seen is the stack. It is a global variable. You don't have the bounding (this is your part of the stack) that you have in languages that have an enforced frame for procedure/method calls.

Sometimes I wonder what a concatenative language would look like if operations could only access things given them by the immediately previous operation and if those things were read-only.

abecedarius13y ago

One thing it could look like is Backus's FP. I made a concatenative variant of it back in the 80s, and just rewrote the essentials this afternoon: https://github.com/darius/peglet/blob/master/examples/fp.py

In my earlier dialect I added a form of local variables, just enough for better readability without really changing the semantics.

mwotton13y ago

Row polymorphism has something to say about this, doesn't it? Coming from Haskell, I'm still finding concatenative programming hard to handle.

saljam13y ago· 1 in thread

Forth is brilliant, but my favourite concatenative language is PostScript. Most view PostScript as nothing more than a dated binary format for vector graphics. However, writing plots and diagrams in pure PS is incredibly flexible and pretty easy once you build up or find enough libraries to do what you want. Postscriptbarcode[0] is an example of a good library. [1] is a good guide to writing PS by hand.

What is PS does lack, however, is a modern debugger (or interpreter with decent error reporting) and package management.

[0] http://code.google.com/p/postscriptbarcode/ [1] http://www.math.ubc.ca/~cass/graphics/manual/

colanderman13y ago

I went on a PostScript kick a few years back. It's amazing what you can tell printers to do: http://fstutoring.com/~chris/postscript/

antirez13y ago· 1 in thread

One thing I did in the past with concatenative programming that was very fun to work with, was an engine of genetic programming. Among the other features of concatenative programming languages there is that it is trivial to create a random program that is valid, and also to mix different programs together. Not that doing genetic programming with s-expressions is so hard, but it is definitely possible to create very simple and fast implementations with CP.

gnosis13y ago

You might be interested in taking a look at PushGP:

http://faculty.hampshire.edu/lspector/push.html

tmcw13y ago· 1 in thread

This would be much better reordered. The lead 'why concatenative programming matters' is only answered halfway through, after the reader is expected to trudge through lambda notation. Why it matters: it's the basis for interpreters of languages you use. Then explain why, then explain what it is.

mjn13y ago

For some audiences I'd agree, but I think the current presentation works well if the target audience is people who already believe that functional programming matters, but are skeptical that concatenative programming is an interesting or important concept. For that audience I think the presentation starting from the familiar type-signature notation is effective.

AlexanderDhoore13y ago· 1 in thread

I especially liked the comparison with unix pipes. Whereas Powershell is kind of an attempt to make command line scripting OOP. This could be applied to create the next generation of unix shells...

I'll have to think about this.

beatgammit13y ago

Hmm, interesting. I've been thinking about a Haskell-based shell, but the flow of concatenative programming might be a better fit. Maybe I'll have to play with this idea this summer as well...

elliotlai13y ago· 1 in thread

You meant Arrows http://www.haskell.org/arrows/

evincarofautumn13y ago

I might be wrong, but I think a concatenative language can’t be represented using only arrows, because arrows aren’t sufficient to express dynamic function application. You can think of monads as arrows with the “bind” function of a type like (a × (a → b) → b). That’s precisely the type of the common concatenative “apply” combinator that lets you do anything useful. The lifting of application is what allows you to treat terms (objects) as actions (morphisms). Without it or an equivalent combinator, you can’t write context-sensitive computations or manipulate concatenative quotations in any meaningful way.

snprbob8613y ago

Warning: Shameless self promotion

I implemented a concatenative DSL for Clojure called Factjor [1] inspired by Factor [2]. Clojure is to Lisp what Factor is to Forth. I used Factjor to implement DomScript [3], which is sorta like jQuery re-imagined as a concatenative language like PostScript. I also gave a talk on both Factjor and DomScript at Clojure/West. Slides are available now [4] and a video will be available on InfoQ soon.

[1] https://github.com/brandonbloom/factjor

[2] http://factorcode.org/

[3] https://github.com/brandonbloom/domscript

[4] https://github.com/strangeloop/clojurewest2013/raw/master/sl...

sparist13y ago

Very important article! Preach it.

I remember discovering the Joy language (and combinatory logic) and being blown away by the elegance of being able to express programs just by composing functions. It was a pretty big epiphany to learn that when functions pop arguments from and push results onto a stack, the "application of a function to a data value" could be treated equivalently to "composition of a function with a 'constant' function that pushes a data value onto a stack". That led to the subsequent discovery that the data stack is extra "scaffolding" that can be removed: using prefix notation instead of postfix allows the program to be represented in memory as an actual composition of partially applied functions, each of which takes the remainder of the program as an argument and returns an output program. This led to the creation of Om [1], which I believe is the "most concatenative" language for this reason.

[1] http://om-language.org

smosher13y ago

This is impossible to do with any other type of language. With concatenative programming, a parallel compiler is a plain old map-reduce!

This statement (in context) draws a defining line around what is and isn't concatenative, and the statement by Norman Ramsey should have been refuted succinctly with a derived definition: Composable expressions with associativity.

SagelyGuru13y ago

Good stuff. I never thought that the old POP-2 ideas would become fashionable again ;)

j / k navigate · click thread line to collapse

36 comments

33 comments · 11 top-level

Scaevolus13y ago· 15 in thread

The elegant minimalism of Forth is inspiring-- how many other languages can fit their REPL and development environment into a few kilobytes?

However, I find implicit arity to be the largest barrier in reading concatenative programs.

In a team, the increased memory burden also requires more communication for shared understanding of code.

evincarofautumn13y ago

This is true, but it’s also important to realise that Forth is not the only direction. ;)

And speaking of assembly, Forth is essentially assembly code for a virtual stack machine. Or an actual stack machine.

chipsy13y ago

Regardless of syntax this is a very interesting language design space, since it seems to hold some of the most promise for lowering average software complexity.

qznc13y ago

Cat is concatenative and statically-typed. http://www.cat-language.com/

1 more reply

abecedarius13y ago

Did you know Robin Popplestone designed a type system for Forth? It involved what I think he called 'regular types' (like regular expressions), IIRC, to handle (disciplined use of) words like ?DUP.

jmmcd13y ago

It's a very interesting article. I wonder if you've heard of Push (a stack language) and PushGP (an evolutionary automatic programming system for which Push is designed)?

http://faculty.hampshire.edu/lspector/push.html

bun-neh13y ago

You should look in to Factor, a higher level concatenative language.. http://factorcode.org/

    move-player fire waves bounce think draw-score spawn-crab storm

kenko13y ago

This snippet reads to me as if it says:

Someone13y ago

"This snippet of a high level function is self describing. 1) Check to see if movement keyboard entry is pressed and move the player"

Similarly, 'fire' could fire a gun, simulate fire in the way 'waves' (apparently) simulates waves, etc.

raganwald13y ago

It's exacerbated by the tendency to factor a single function into many simpler ones.

jacquesm13y ago

This does not directly answer your question but I think a good sign that you're on the wrong track is when naming things gets harder, that's when you've decomposed too far.

2 more replies

fizx13y ago

chongli13y ago

>If you break things up into too many small things, it's confusing.

Not if you're disciplined about it.

>What's the "right level of decomposition?"

protomyth13y ago

Postscript has support for local variables and I have found it to be a nicer language than most Forths.

GhotiFish13y ago

>To understand what a line of Forth code does, you need to understand both what each function does, and how it manipulates the stack to accomplish it.

eurleif13y ago

michaelfeathers13y ago· 2 in thread

Sometimes I wonder what a concatenative language would look like if operations could only access things given them by the immediately previous operation and if those things were read-only.

abecedarius13y ago

In my earlier dialect I added a form of local variables, just enough for better readability without really changing the semantics.

mwotton13y ago

Row polymorphism has something to say about this, doesn't it? Coming from Haskell, I'm still finding concatenative programming hard to handle.

saljam13y ago· 1 in thread

What is PS does lack, however, is a modern debugger (or interpreter with decent error reporting) and package management.

[0] http://code.google.com/p/postscriptbarcode/ [1] http://www.math.ubc.ca/~cass/graphics/manual/

colanderman13y ago

I went on a PostScript kick a few years back. It's amazing what you can tell printers to do: http://fstutoring.com/~chris/postscript/

antirez13y ago· 1 in thread

gnosis13y ago

You might be interested in taking a look at PushGP:

http://faculty.hampshire.edu/lspector/push.html

tmcw13y ago· 1 in thread

mjn13y ago

AlexanderDhoore13y ago· 1 in thread

I especially liked the comparison with unix pipes. Whereas Powershell is kind of an attempt to make command line scripting OOP. This could be applied to create the next generation of unix shells...

I'll have to think about this.

beatgammit13y ago

Hmm, interesting. I've been thinking about a Haskell-based shell, but the flow of concatenative programming might be a better fit. Maybe I'll have to play with this idea this summer as well...

elliotlai13y ago· 1 in thread

You meant Arrows http://www.haskell.org/arrows/

evincarofautumn13y ago

snprbob8613y ago

Warning: Shameless self promotion

[1] https://github.com/brandonbloom/factjor

[2] http://factorcode.org/

[3] https://github.com/brandonbloom/domscript

[4] https://github.com/strangeloop/clojurewest2013/raw/master/sl...

sparist13y ago

Very important article! Preach it.

[1] http://om-language.org

smosher13y ago

This is impossible to do with any other type of language. With concatenative programming, a parallel compiler is a plain old map-reduce!

SagelyGuru13y ago

Good stuff. I never thought that the old POP-2 ideas would become fashionable again ;)

j / k navigate · click thread line to collapse