What you described is flatMap in Java, or join() in other languages. Somebody above went into the CPS transform of a program. I'm so lost at how simple this is versus how complex all the descriptions are.
Let's say you need a function that maintains internal state. Now, in non-pure languages, you could that by simply referring to some global variable, but let's agree that's bad. So how can you model internal state purely?
Simple, your function takes the current state and produces a new state, along with its result. So, a pure function like:
f(String) => Int
turns into the stateful function: f(String) => (State => [Int, State])
I.e. a function that returns a function that takes in a state and produces a result and a new state.Cool, but now let's say we have another function g:
g(Int) => (State => [Double, State])
Let's now say we want to compose those two functions: let result = g(f("5"));
However, this won't work, becase f produces (State => [Int, State]) but g expects Int.So, here's bind for the State monad (implementation is not important):
bind(State => [a, State], a => (State => [b, State])) => (State => [b, State])
If we define a "type alias" (just a synonym, so we don't have to type that much) for a stateful function: type StatefulFunction<a> = State => [a, State]
and then replace the above bind definition with it: bind(StatefulFunction<a>, a => StatefulFunction<b>) => StatefulFunction<b>
you can see that is has the same structure as flatMap! flatMap(List<a>, a => List<b>) => List<b>
With the type alias, f and g would then look like this: f(String) => StatefulFunction<Int>
f(Int) => StatefulFunction<Double>
So now, here's how you compose f and g: bind(f("5"), a => g(a))
In Haskell, there's syntax sugar for bind, so the above turns into do
a <- f "5"
g a
which is very similar to a conventional language: {
a = f(5);
g(a);
}
EDIT: forgot result type in bind and flatMap.> flatMap(List<a>, a => List<b>) => List<b>
but that isn't the same as bind, since bind took two functions and flatmap takes a list and a function.
However, I do kind of understand. Compare this to the (currently) 3 responses below that quickly turn into text mush simultaneously treating me like an idiot who has never understand abstraction before or treating me like I'm been programming Haskell for a decade.
At some point I'm just going to cut my losses and ignore Haskell. It isn't the language, but the less that elucidating community support.
bind(StatefulFunction<a>, a => StatefulFunction<b>) => StatefulFunction<b>
flatMap(List<a>, a => List<b>) => List<b>
Look at the structure. Don't pay attention to what List or StatefulFunction actually are, that's not important here. The thing to focus on is that if you can provide this kind of general function for something, then that something is probably a monad.I.e. a function that takes in Something<a> and a function from a to Something<b> and produces Something<b>.
So why is this pattern so useful that everybody freaks out about it?
Well, that syntax sugar I showed you at the end, let's look at it again:
do
a <- f
b <- g a
h b
The general idea here is that "a <- f" is translated into bind(f, a => ...). The whole block would turn into bind(f, a => bind(g(a), b => h(b)));
Bear with me. Now, let's look at one more example: bind(Optional<a>, a => Optional<b>) => Optional<b>
Optional<a> is just a data type that maybe contains an <a>. Think of it as a safe replacement for null. So, bind for Optional looks at the first argumenta and passes it on to the function and returns its result, if the Optional is not empty. If it is, bind also returns an empty Optional.Now let' say we have a bunch of Optional producing functions:
getUser(userId) => Optional<User> // user may not exist
getAddress(User) => Optional<Address> // user may not have entered an address
getStreet(Address) => Optional<Street> // address may be weird
Now ok, what would we do in a language without monads? user = getUser(userId);
if (user.hasValue()) {
address = getAddress(user.getValue());
if (address.hasValue()) {
street = getStreet(address);
// do something
}
}
Well, that's not great. But because Optional is a monad and we have that nifty bind sugar, that's how that code would look like in Haskell: do
user <- getUser userId
address <- getAddress user
street <- getStreet address
Much better.More generally, it turns out monads are literally everywhere. The guy that talked about CPS transformed programs being monads is right - that's the continuation monad.
Let's look at that. If you know node.js, you know that its API is basically asynchronous:
fs.open(path, callback)
fs.read(fd, ..., callback)
fs.close(fs, callback)
Now, let's read some file in node: fs.open(path, function(fd) {
fs.read(fd, ..., function(result) {
// do something with result
fs.close(fs, function() {
// file closed
}
}
}
Well, that's ridiculous. But becase we have the continuation monad, we can write exactly the same thing in Haskell: do
fd <- open path
result <- read fd
// do something
close fd
So why are monads so useful? Because they are so general. They can be used to abstract over many many things that exhibit the basic pattern (S<a>, a => S<b>) => S<b>. Not to mention that when you write a function to generally work with monads, it will work for ANY monad. List, State, Maybe, Cont - doesn't matter.There's not much more to it than recognizing that pattern. That's it. Once you see that that's literally it you'll start seeing monads everywhere :)
Instead of Stone Age ideas like global variables, you could create a Lisp or Javascript closure, or use a C++ functino. Do monads offer a big enough advantage over those approaches to justify the endless trouble of learning them?
;; bad
(let ((s (make-state-thing)))
(defun function () ...)))
;; good
(defparameter *s* (make-state-thing))
(defun function () ...)
That defparameter might be defvar: it depends on whether you want to re-initialize the state variable when a new version of this module is loaded, or keep the old value.Speaking of which, with the let, you don't get that decision. For the defun to have its effect, the form has to be evaluated. Every evaluation of the let instantiates a new binding for s in a new lexical environment, end of story.
Even if we have the defparameter, we can redefine function while keeping the same state.
It's also easy to inspect the state. A the REPL, just evaluate its name: there it is. You don't have to whip out a closure environment inspector.
defun is a "stone age" tool anyway, which destructively overwrites a global function binding. Wrapping it with a captured lexical environment is a form of situational irony in programming (if not indeed verbal).
Regardless, you can't maintain state with closures in a pure language. You could close over some local "variable", but that variable would be immutable. In an impure language that would work, but then you're back to managing global state, even if that state is not defined at the top level (i.e. global in the sense that different closures could still have access to the same piece of state, concurrently).
[0] https://news.ycombinator.com/item?id=13965917
EDIT: just to clarify, monads manage state purely. I.e, instead of writing:
function statefulFunction(state) {
(a, state1) = f(state);
(b, state2) = g(a, state1);
(c, state3) = h(b, state2);
return (c, state3);
}
you can write: statefulFunction = do
a <- f
b <- g a
c <- h b
return c
But the resulting function is still pure, i.e. (state) => (c, state).Say you wanted to iterate over a list of image files and apply a transformation to them. In haskell, doing that work synchronously and doing it asynchronously can use the same function, just in a different monad, because the definition of bind determines the execution strategy.
The monad abstraction provides a way to encode a dependency between "operation 1" and "operation 2", or inject behaviour between "element 1" and "element 2" when processing a data structure.
The type signature of bind is "m a -> (a -> m b) -> m b". It might look complicated, but it states "given a monadic value with an inner type of a, and a function from a to a monadic value with inner type of b, you will get a monadic value with inner type of b"
This can be used to access data "inside" a type where you might not have direct access, in such a way that bind determines how the access happens so that it can enforce things like "The preceding IO must have been finished" or "The future must be completed" or "The list values must be concatenated" or "You must compose the continuations properly" or "If any operation results in Nothing, the result is Nothing"
Haskell IO is the usual example of this. Since the IO type is opaque, you cannot construct values of IO yourself directly. An IO String (such as getLine) represents an IO operation that results in a String value. Similarly, the "putStr" function (of type String -> IO ()) is a function that takes a string and results in an IO action that "does something and the value does not matter", or in this case, prints the string to the screen.
However, you can't just compose getLine and putStr directly, because the types don't match. You need to get the value "inside" getLine first.
This is what the bind function allows you to do. And it's defined for the IO type, in such a way that IO ordering is enforced. You can't not use bind, because you have no way of accessing the String value inside getLine without it.
I don't know if my explanation helps. The seeming "complexity" of monads arises from the fact that you can apply the same simple abstraction to so many different things, often in seemingly unintuitive ways.
There is a reason people say monads are simple.
The comment you are replying to is describing the concrete example of the List monad. To answer you direct questions, it is probably better to look at another example: the State monad.
The idea behind state is simple. Suppose you have a bunch of functions that you want to call in sequence. This is a simple enough problem; just do f(g(h(x))) or `h(x) |> g |> f` [0].
Now, suppose that, in addition to piping a value from one function to the next, we want to keep track of additional state. For example, maybe we want to use a psuedo-random number generator in some of the functions and have to keep track of its internal state. One way to do that would be to simply modify each function so that the returned value includes the state; but this quickly becomes unwieldy. The State monad provides a way to automatically thread the state through the functions. So we can do:
put initSeed >>= h >>= (get \currentSeed -> g currentSeed) >>= (get \nextSeed -> f nextSeed)
or (with a bit of syntactic sugar: put initSeed
h
currentSeed <- get
g currentSeed
nextSeed <- get
f nextSeed
in this example g and f would be expected to call put to update the internal state.Now, suppose you want to keep applying a function until some condition becomes true about the state. In this case, it makes sense to talk about a while loop:
while (get >>= \state -> state =/= 7) f
The above snippet will repeatadly bind f until the internal state is 7.It turns out, that we can implement while in terms of only the Monad interface:
while condition action = do
x <- condition
if x then (action >> (while condition action))
else return ()
It turns out that most control structures can be encoded into monads in this way.The IO system is a bit hacky. The idea is that the universe is a big state machine. So, when we do `println "Hello World"`, we are not causing a side effect, we are just updating our internal state to say that we printed "Hello World". Simmilarly, getLine just peaks into our internal state to see what the input at the current time has always been. Of course, for some reason, GHC does not give us the full get or put functions for the IO monad; but they could exist in theory, and that is enough to pretend that side effects do not exist.
[0] Not actual haskell because we can't agree on a name for the pipeline operator I am calling "|>" here.
So this is one example of a Monad. Go through the same process with different "datastructures" to see the power of it. First try with a "Maybe" monad, and then a "State" monad. The generalization of that idea is the monad.
The power comes not from the complexity, but from the realization that this simple idea applied in alternate contexts (ie on different datastructures) allows many powerfull solutions to problems.
The idea is fundamentally simple - applying that same simple idea in different contexts provides the value. It is a simple idea, it's just impressive how many different contexts it works in to solve problems. Problem domains that we normally would have felt have nothing in common.