The global write lock, while important, is probably the most over-written-about and misunderstood issue. The write lock is only held while the db is actually in the middle of a write, and it tends to be held only for a very brief period (10s to 100s of microseconds or less) at a time. For example, it is never held across multiple user actions, so it is closer to a latch than a lock in common RDBMS terminology. The main cases where this caused issues for users were when some data we needed wasn't already in memory, so we went to disk to fetch it (~10ms rather than ~10us). One of the major improvements in 2.0 was a framework for yielding the lock so other work can proceed while going to disk; it is used in the places where this was seen most frequently in production. 2.2 will plug this into more places, with the goal of never holding the lock when going to disk. This work is of course in parallel with work to move to DB, collection, and possibly extent-level locking, replacing the global lock in almost all cases.
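To make the yield idea concrete, here is a toy Python sketch of the pattern: check under the lock whether the data is resident, and if it is not, drop the lock, do the slow fetch, then retake the lock and retry. This is only an illustration under my own assumptions, not the server's actual code; `ToyStore`, `_fetch_from_disk`, and `increment` are hypothetical names.

```python
import threading


class ToyStore:
    """Toy model of 'yield the lock while going to disk' (hypothetical, not MongoDB code)."""

    def __init__(self, on_disk):
        self.write_lock = threading.Lock()  # stands in for the global write lock
        self.in_memory = {}                 # data already resident in RAM
        self.on_disk = dict(on_disk)        # data that needs a slow fetch

    def _fetch_from_disk(self, key):
        # Stand-in for a slow disk read; always called WITHOUT the lock held.
        return self.on_disk.get(key, 0)

    def increment(self, key):
        while True:
            with self.write_lock:
                if key in self.in_memory:
                    # Fast path: data is resident, so the lock is held only briefly.
                    self.in_memory[key] += 1
                    return self.in_memory[key]
            # Slow path: the lock was released above, so other writers can proceed
            # while this thread waits on the (simulated) disk.
            value = self._fetch_from_disk(key)
            with self.write_lock:
                # Another thread may have loaded the key meanwhile; keep its value.
                self.in_memory.setdefault(key, value)
            # Loop around and retry the write now that the data is resident.
```

The retry loop matters: after retaking the lock, the state has to be rechecked, because anything could have changed while the lock was yielded.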
As for safe-mode, the waiting that it does is always outside of the lock. It uses a cached data structure that is protected by a secondary mutex, so it never has to interact with the global dbMutex. If you choose to wait for replication or journaling of your writes, that will block the connection and therefore your client thread, so single-threaded tests will show much worse performance while the db sits mostly idle. If you use more client threads or asynchronous I/O connections, you should see roughly identical aggregate throughput (see mongostat), although with much higher latencies than when not waiting.
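A back-of-the-envelope model of that last point, in Python. It assumes a simple closed loop where each client issues one write, waits for the acknowledgement, then issues the next; the function name and all the numbers are made up for illustration and are not measurements.

```python
def aggregate_throughput(clients, server_s_per_write, ack_wait_s):
    """Writes/second under a toy closed-loop model (assumed, not measured).

    clients            -- number of client threads/connections
    server_s_per_write -- server time actually spent per write, in seconds
    ack_wait_s         -- extra time a client blocks waiting for
                          replication/journaling, outside the lock
    """
    # Each client can only issue a new write after the previous one is acknowledged.
    per_client = 1.0 / (server_s_per_write + ack_wait_s)
    # The server itself cannot go faster than one write per server_s_per_write.
    server_cap = 1.0 / server_s_per_write
    return min(clients * per_client, server_cap)


# Hypothetical numbers: 100us of server work per write, 10ms spent waiting.
single_waiting = aggregate_throughput(1, 100e-6, 10e-3)    # roughly 99 writes/s
single_no_wait = aggregate_throughput(1, 100e-6, 0.0)      # server-bound
many_waiting = aggregate_throughput(200, 100e-6, 10e-3)    # server-bound again
```

With one waiting client the db is idle about 99% of the time in this model; with enough clients the aggregate climbs back to the same cap as the no-wait case, while each individual write still takes about 10ms.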
If you have any questions about this feel free to shoot me an email. Replace the _ with an @ and add a .com to my user name.