Data is immutable, so we never have to worry about keeping data coherent between anything, whether that is two processes or two nodes. New data can be constructed with references to old data without fear that the old data will be modified. A "mutation" is really just new data holding a reference to the old, unchanged data, and that structural sharing greatly lowers the cost of creating new data. It also means everything can simply pass what it has, process to process or node to node, without fear it will be out of date.
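A minimal Elixir sketch of this idea (variable names are just for illustration): "updating" a map returns a new map, and the original is untouched.

```elixir
# A "mutation" is really new data referencing the old, unchanged data.
old = %{name: "alice", score: 1}

# Map.put returns a NEW map; `old` is untouched. Fields that didn't
# change are shared structurally, so the "copy" is cheap.
new = Map.put(old, :score, 2)

old.score  # => 1  (the original is still intact)
new.score  # => 2
```

Because `old` can never change out from under anyone, it is safe to hand it to another process, or another node, without coordination.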
Everything is defined in modules. Modules cover what we would think of in OOP as namespaces, structures, classes/types, and class methods. Importantly, they define only functionality: modules do not hold state. Functions accept a set of inputs, create new data from those inputs (no mutation), and return an output. This makes it very easy to reason about what the code is doing, as long as you keep modules well defined and reasonably sized. The code can also be shared around easily, since it carries no state and operates on immutable data.
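As a sketch, here is a stateless module (the `Stats` name and function are hypothetical, chosen for illustration): it defines behavior only, and its function is pure, taking inputs and returning new data.

```elixir
defmodule Stats do
  # A pure function: the same inputs always produce the same output.
  # The module holds no state; it only defines functionality.
  def mean([]), do: nil
  def mean(numbers), do: Enum.sum(numbers) / length(numbers)
end

Stats.mean([1, 2, 3])  # => 2.0
```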
Processes are an abstraction. You can think of them as threads, but they are really just a stack and a little bookkeeping. A BEAM VM normally runs a number of scheduler threads equal to the number of CPU cores in the machine. Each scheduler thread exclusively picks a process, loads its bookkeeping, points itself at the stack, and executes bytecode for a period of time. When done, it records any changes in the bookkeeping and moves on to the next process. This is very lightweight, so literally millions of processes can run on a single computer. Because they are self-contained, they are easy to clean up. Processes also expose a standard set of interfaces for communication: an asynchronous message-passing system. Again, the messages sent back and forth are immutable, so it doesn't matter whether the two processes live on the same node or not.
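The spawn/send/receive cycle can be sketched in a few lines. This is a minimal local example; the message shapes (`{:ping, from}` and `:pong`) are just illustrative conventions.

```elixir
parent = self()

# Spawn a lightweight process that waits for one message and replies.
pid =
  spawn(fn ->
    receive do
      {:ping, from} -> send(from, :pong)
    end
  end)

# Send it an immutable message containing our own pid as the reply address.
send(pid, {:ping, parent})

# Block until the reply arrives (or give up after one second).
receive do
  :pong -> :ok
after
  1_000 -> :timeout
end
```

Nothing in this code cares where `pid` actually lives; sending to a process on another node uses the exact same `send/2`.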
Finally, everything is abstracted to the notion of nodes within a cluster. By default, anything you execute runs on the local node, but you can specify otherwise: you can execute a module call on another machine or spawn a new process on another machine. It just means a little more information in the call; programmatically, it is the same exact concept. It is also possible to group processes into named services: you call the named service, and it knows which processes to contact. If you simply write your code that way, the barrier to entry for parallelizing it is very low.
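A sketch of how little the call shape changes between local and remote. The node name `:"worker@otherhost"` and service name `MyApp.Service` are hypothetical; the remote lines are commented out because they require a connected cluster to run.

```elixir
# Local spawn: run String.upcase/1 in a fresh process on this node.
pid = spawn(String, :upcase, ["hello"])

# The same spawn on another (connected) node -- only the node argument
# is added; the programming model is identical:
#   pid = Node.spawn(:"worker@otherhost", String, :upcase, ["hello"])

# A named service can be reached by name instead of pid, local or remote:
#   GenServer.call(MyApp.Service, :status)                        # local
#   GenServer.call({MyApp.Service, :"worker@otherhost"}, :status) # remote

is_pid(pid)  # => true
```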
When you start thinking in terms of how to structure your code for the BEAM, you inherently get easy access to scalability.