For background:
- I've seen the Python (2.x) way (separate `unicode` and `str` types, and no way to set your own conversion defaults for the whole process -_-') and found it unwieldy. Lotsa boilerplate to keep around I/O.
- I've seen the GNU way (a messy pile depending on LC_* environment vars and files in /usr/, with a hierarchy of precedence) and disliked it.
- I've seen the Plan 9 way (char * / char[] for UTF-8 and Rune * / Rune[] for UTF-32) and it fits me. I see and use (non-PHP) software built on that, and it works reliably. I'm happy :)
implode() and explode() work with UTF-8. I hope you know the technicalities of why -- UTF-8 was designed that way, and the functions take the complete string as the glue/separator. No idea about UTF-7, -16 or -32 (nor the April Fools' UTF-9 and UTF-18); guess some of those would break terribly. I don't care at this moment. My sites run in Polish, English, Russian and, in the near future, Czech, all on UTF-8.
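To make the explode()/implode() point concrete, here's a minimal sketch. The sample string and the bullet separator are my own picks for illustration. explode() matches the whole separator as one byte sequence, and in valid UTF-8 the bytes of one character never turn up inside or across other characters, so the pieces should stay on character boundaries:

```php
<?php
// Illustrative input: Polish text joined with a multi-byte separator
// (U+2022 BULLET, encoded in UTF-8 as the bytes E2 80 A2).
$text = "zażółć•gęślą•jaźń";

// explode() searches for the complete separator byte sequence.
$parts = explode("•", $text);

// Every piece should still be valid UTF-8; an empty pattern with the
// /u modifier makes preg_match() return 1 only for valid UTF-8 input.
foreach ($parts as $p) {
    var_dump(preg_match('//u', $p) === 1);
}

// implode() puts it back together byte for byte.
$joined = implode("•", $parts);
var_dump($joined === $text); // bool(true)
```

Nothing here needs mbstring; it's plain byte-wise string functions leaning on how UTF-8 is laid out.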
Your point is just strtok(). You are right. I lost. PHP lost. Have a nice day -- and kudos for bringing up that cool function :-)
EDIT: now that I think about it, it seems I've abused explode() for that job -- the output array, with empty strings removed, is equivalent to calling strtok() repeatedly till it returns FALSE. Seems a bit brute-force-ish.
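A quick sketch of that equivalence, assuming a single-byte delimiter. The helper names tokens_via_explode() and tokens_via_strtok() are made up for this example, they're not built-ins:

```php
<?php
// Hypothetical helper: split on the whole delimiter, then drop empty pieces.
function tokens_via_explode($s, $delim) {
    $pieces = explode($delim, $s);
    $nonEmpty = array_filter($pieces, function ($t) { return $t !== ''; });
    return array_values($nonEmpty); // reindex from 0
}

// Hypothetical helper: call strtok() until it returns false.
function tokens_via_strtok($s, $delim) {
    $out = array();
    for ($t = strtok($s, $delim); $t !== false; $t = strtok($delim)) {
        $out[] = $t;
    }
    return $out;
}

$s = ",,ala,ma,,kota,";
$a = tokens_via_explode($s, ",");
$b = tokens_via_strtok($s, ",");
var_dump($a === $b); // bool(true)
```

The catch: strtok() treats its second argument as a set of single bytes, not as one separator string, so the two only line up for a one-byte delimiter. Hand strtok() a multi-byte UTF-8 character as the delimiter and it will split on each of that character's bytes separately, which can cut through other characters that share those bytes.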
As for that warning and the undefined behavior, I'll assume you're referring to the following passage:
It is not recommended to use the function overloading option in the per-directory context, because it's not confirmed yet to be stable enough in a production environment and may lead to undefined behaviour.
Reading comprehension: don't re-configure it per directory (in this case, of your project, not the whole server). With that in mind, you get a reliable site.