p.s. Long-time fan of the TokBox team and their easy-to-implement platform.
TokBox is probably hosting the STUN/TURN servers for you. If you don't want to rely on theirs, Twilio offers hosted STUN/TURN as a cloud service, or you can deploy any open source implementation to a server you control.
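For the self-hosted route, here is a rough sketch of what the client side might look like. This is plain WebRTC, not anything OpenTok-specific, and the hostname, username, and credential are placeholders I made up. `IceServer` and `buildIceServers` are hypothetical names; the object shape is meant to mirror the browser's `RTCIceServer` dictionary, and 3478 is the standard STUN/TURN port.

```typescript
// Hypothetical local type mirroring the browser's RTCIceServer dictionary,
// declared here so the sketch compiles outside a browser too.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface TurnCredentials {
  username: string;
  credential: string;
}

// Build an iceServers list for a single self-hosted STUN/TURN host.
// Assumes the server listens on the standard port 3478.
function buildIceServers(host: string, creds: TurnCredentials): IceServer[] {
  return [
    // STUN: lets a peer discover its public address; no auth needed.
    { urls: `stun:${host}:3478` },
    // TURN: relays media when a direct path fails; needs credentials.
    {
      urls: [
        `turn:${host}:3478?transport=udp`,
        `turn:${host}:3478?transport=tcp`,
      ],
      username: creds.username,
      credential: creds.credential,
    },
  ];
}

// Placeholder values, not real servers or secrets.
const iceServers = buildIceServers("turn.example.com", {
  username: "demo-user",
  credential: "demo-secret",
});

// In a browser you would then do something like:
//   const pc = new RTCPeerConnection({ iceServers });
```

In practice you would probably hand out short-lived TURN credentials from your own backend rather than hard-coding them in the client.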
But there are definitely more parts: TURN/STUN for network traversal is just the beginning. The OpenTok platform behind this sample app adds dynamic optimizations (like audio fallback), archiving (recording) capabilities, RESTful APIs for eventing and control, etc. Just standing up a STUN/TURN server isn't going to solve the hard problems you have to get through before something can go to production.
EDIT: spacing, clarity