I'm not sure we disagree. In my experience you can get "pretty fast" but not "C-like fast" while keeping many language features. The more you chase C-like performance (or, better still, Fortran-like performance), the more you end up restructuring things in a C-like way. This isn't universal of course, and I don't think it has anything specific to do with the C language, just that C is a semi-reasonable proxy for the hardware architecture (ignoring SIMD).
Anyway, this isn't really Julia-specific, and I haven't tried with Julia recently, so I may be wrong :)