Seriously, the whole question of whether threaded code really runs in parallel (i.e. whether adding more CPUs makes the code run faster) is misleading.
A context switch that the compiler is not explicitly aware of can cause the same kind of problems whether the switch is done in software or the memory accesses are interleaved because the code genuinely runs on multiple execution units.
The problem stems from the fact that both the compiler and the processor may perform memory accesses in a different order than you'd expect. An interesting read about it: http://ridiculousfish.com/blog/posts/barrier.html
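To illustrate the first point (a switch at an unlucky moment is enough, no second CPU needed), here is a minimal Python sketch. The names `unsafe_increment` and `run_threaded_demo` are made up for this example, and the `threading.Barrier` is only there to force the switch to happen between the read and the write so the outcome is reproducible. It does not demonstrate compiler or CPU reordering, only the interleaving problem.

```python
import threading

counter = 0
# Hypothetical helper: makes both threads pause after reading, before writing.
both_have_read = threading.Barrier(2)


def unsafe_increment():
    """Read-modify-write with a forced switch in the middle."""
    global counter
    snapshot = counter        # read the shared value
    both_have_read.wait()     # the other thread runs here and reads the same value
    counter = snapshot + 1    # write back a result based on a stale read


def run_threaded_demo():
    """Run two increments and return the final counter value."""
    global counter
    counter = 0
    threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    # Two increments were requested, but one update is lost.
    print(run_threaded_demo())
```

Both threads read 0 before either writes, so the result is 1 rather than 2, and it would be the same on one core or on many.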
Asynchronous programming lets you effectively process one event at a time: things happen exactly as defined by a simple programming model, and the compiler knows what it can safely do to produce the requested side effects.
If the events are fine-grained enough, you get the same effect as being concurrent from the point of view of the tasks being performed, while nothing is actually concurrent from the point of view of the code that is running.
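As a rough sketch of that idea, assuming Python's `asyncio` as the event loop (the names `worker`, `main` and `run_async_demo` are invented for the example): two tasks make progress "at the same time", yet each step between two `await` points runs to completion on its own, so the shared counter needs no lock.

```python
import asyncio


async def worker(state, steps):
    """Increment the shared counter, yielding to the loop after every step."""
    for _ in range(steps):
        state["counter"] += 1    # no await inside: nothing can interleave here
        await asyncio.sleep(0)   # the only place where another task may run


async def main(steps):
    state = {"counter": 0}
    # Both workers are interleaved by the event loop on a single thread.
    await asyncio.gather(worker(state, steps), worker(state, steps))
    return state["counter"]


def run_async_demo(steps=1000):
    """Run the two workers and return the final counter value."""
    return asyncio.run(main(steps))


if __name__ == "__main__":
    print(run_async_demo())
```

No update is lost here: with 1000 steps per worker the counter ends at 2000, because the switch points are explicit and the code between them is one event handled at a time.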
Thus, it's not about the definition of concurrency per se, but about what is concurrent in the system.