Some Forths compile to native machine instructions and no VM is involved. Sometimes they compile a list of threaded jump addresses with an "interpreter" that's basically just two instructions: JSR [nextaddress++]; LOOP. Sometimes they just compile literal sequences of JSR instructions and there's no interpreter at all.
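To make the difference concrete, here is a toy sketch in Python rather than real machine code. The word names (`lit`, `add`, `dup`, `mul`), the `run` loop, and the sample program are all made up for illustration and don't come from any particular Forth. The first form is a list of addresses walked by a tiny fetch-and-call loop; the second is the same words as a literal sequence of calls with no loop at all.

```python
# Toy model of threaded code. Every name here is an illustrative assumption,
# not taken from a real Forth implementation.
stack = []

def lit(value):
    """Build a primitive that pushes a constant onto the data stack."""
    def push():
        stack.append(value)
    return push

def add():
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

def dup():
    stack.append(stack[-1])

def mul():
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)

def run(thread):
    """The whole 'interpreter': fetch the next address, call it, repeat."""
    ip = 0
    while ip < len(thread):
        word = thread[ip]  # [nextaddress++]
        ip += 1
        word()             # JSR

# Threaded form of the Forth phrase "2 3 + DUP *": just a list of addresses.
thread = [lit(2), lit(3), add, dup, mul]

def run_threaded():
    stack.clear()
    run(thread)
    return stack[-1]

def run_subroutine_threaded():
    """Same words emitted as a literal sequence of calls; no interpreter loop."""
    stack.clear()
    lit(2)()
    lit(3)()
    add()
    dup()
    mul()
    return stack[-1]
```

Both forms run the same primitives in the same order. The only difference is whether something walks a list of addresses or the calls are laid out inline.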
The Wikipedia entry [0] contains some refs. The out-of-print book "Threaded Interpretive Languages" [1] was the definitive treatment of the topic, and it's what I mostly used back when I was writing Forth compilers.
There are 8 parts to the series. You can find all of them, along with the author's other Forth writings, on his website: http://www.bradrodriguez.com/papers/
For anyone else interested, I found this book, which is about hardware-based stack machines and Forth.
To quote the book:
"While some of the material is obviously dated and new work has been done in this area, I believe that the book remains the principal reference work on Forth-style stack computers."