The main difference is that plan9 uses read and write for everything, whereas Linux and BSD use ioctls on file descriptors for everything.
And at that point, the whole "everything is a file" mantra turns into nonsense, because yes, everything is a pointer to something, so what.
The difference with plan9 is not the files, but the way plan9 uses text protocols with read/write on ctl files. To open a TCP connection - if memory serves me right - you first write to a ctl file, which creates a new file for the connection. Then you write the dial command to that connection's ctl file, after which you can open the connection file. On Linux, a syscall creates an anonymous file, and everything after that is operations on this anonymous file.
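The two models described above can be put side by side. The Linux half below is runnable; the Plan 9 half is shown only in comments, since those paths exist only on a Plan 9 system (and, like the paragraph above, are quoted from memory, so treat them as illustrative):

```python
import socket

# Linux/BSD model: one syscall yields an anonymous fd, and every
# subsequent operation is another syscall taking that fd.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
fd = s.fileno()                     # the anonymous file
# s.connect(("192.0.2.1", 80))      # further ops all go through fd
# s.sendall(b"...") ; s.recv(4096)
s.close()

# Plan 9 model, paraphrasing the sequence above (paths from memory,
# illustrative only) - everything is text written to files:
#   ctl  = open("/net/tcp/clone")         # creates /net/tcp/N for the conn
#   write(ctl, "connect 192.0.2.1!80")    # the dial command, as text
#   data = open("/net/tcp/N/data")        # then plain read()/write()
```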
There are some ideological benefits, but plan9 creates a mess of implicit text protocols, ugly string handling, syscall storms and memory inefficiencies. Their design is pretty much solely a limitation imposed by the idea that all filesystems should exist through the 9p protocol, which as a networked protocol cannot share data (fds, structs), only copy it (payloads to/from read and write). With the requirement that all functionality be replaceable and mountable from remote machines, the only possible API became read/write for everything.
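To make the share-vs-copy point concrete: locally, Unix domain sockets can pass the file descriptor itself between processes (SCM_RIGHTS), something a networked protocol like 9p has no equivalent for - it can only copy bytes. A minimal sketch (Python 3.9+, Unix only; done within one process for brevity):

```python
import os
import socket

# A local socketpair can transfer an fd itself, not a copy of its data.
parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
read_end, write_end = os.pipe()

# Ship the pipe's write end across the socketpair via SCM_RIGHTS
# ancillary data; the receiver gets a working duplicate of the fd.
socket.send_fds(parent, [b"x"], [write_end])
msg, fds, flags, addr = socket.recv_fds(child, 1024, 1)

os.write(fds[0], b"hello")               # write through the received fd...
echoed = os.read(read_end, 5)            # ...lands in the original pipe

# Over 9p there is no such operation: read/write can move payload bytes,
# never the descriptor/capability itself.
```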
I'd argue that fd-using syscalls and ioctls - basically per-file syscalls - are a superior approach to implementing everything-as-a-file.
If I've got a Plan9 system mounted over, say, NFS, would this all mean that (ignoring permissions) I could effectively open a TCP connection from that remote machine by writing appropriate information to a file on the NFS share? It would be pretty inefficient I suspect, tunnelling TCP over NFS, but it seems like there could be an incredible amount of cool hacks that might Just Work as a side-effect of them going all-in on "everything is a file".
There was perhaps a time when keeping everything in binary ioctls, bound to one specific device, was necessary to reach reasonable performance, but I don't believe that is the case anymore. Anecdotally, these days everything on Plan 9 feels snappier. We have some benchmarks that show 9front outperforming linux on naive pipe io and context switches. What Plan 9 misses in micro optimizations it makes up for by having an incredibly consistent and versatile base.
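I won't reproduce the 9front numbers here, but a naive pipe-I/O microbenchmark of the kind being compared is easy to sketch (sizes are arbitrary, and a thread stands in for the second process to keep it short):

```python
import os
import threading
import time

CHUNK = 4096
TOTAL = 1 << 20   # 1 MiB: small enough for a quick run

r, w = os.pipe()

def writer():
    # Push TOTAL bytes through the pipe in fixed-size chunks.
    buf = b"x" * CHUNK
    for _ in range(TOTAL // CHUNK):
        os.write(w, buf)
    os.close(w)       # EOF for the reader

t = threading.Thread(target=writer)
start = time.perf_counter()
t.start()

received = 0
while True:
    data = os.read(r, CHUNK)
    if not data:
        break
    received += len(data)
elapsed = time.perf_counter() - start
t.join()
os.close(r)

print(f"piped {received} bytes in {elapsed:.4f}s "
      f"({received / elapsed / 1e6:.1f} MB/s)")
```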
I want to reiterate the benefits of the network transparency by talking about how drawterm works. Drawterm can be thought of as the plan 9 equivalent of windows RDP. Internally, drawterm creates routines to expose a /dev/draw, /dev/mouse and /dev/keyboard through whichever native way there is on the target system (macos, windows, linux, etc). It then attaches to the remote system and overlays these files over a namespace. Programs like our window manager rio can then run completely transparently, forwarding not compressed images but individual draw RPC messages. There is no need for any special code on the plan 9 host side to accommodate drawterm; again, it is something that just falls out of the core design of the system.
It's not clear to me that 9p itself could not be extended to allow for shared memory. With low-level control over the operating system and rebuilding of existing binaries, distributed shared memory becomes a possibility. (I.e. the existing VM system ought to be enough to implement whatever cache coherence is needed for shared memory over the network.)
What's magical about the segment(3)[1] device? The '#' devices are kernel file servers. There's no magic.