Representing all languages is OK as a goal -- adding Klingon and BS emojis not so much (from a sanity perspective, at least if adding them meddles with having a logical and simple representation of characters).
So it comes down to "since some visible characters are made up of many graphemes, the number of single code points needed would be huge" and "for some languages it's feasible to normalize them to single code points, but for other languages it would not be".
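For concreteness, here's a small sketch (using Python's standard `unicodedata` module) of a visible character that exists both as a single precomposed code point and as a multi-code-point sequence:

```python
import unicodedata

# "é" as one precomposed code point vs. "e" + combining acute accent
composed = "\u00e9"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # "e" followed by U+0301 COMBINING ACUTE ACCENT

print(len(composed), len(decomposed))  # 1 2 -- same glyph, different lengths
print(composed == decomposed)          # False -- naive comparison fails
# NFC normalization collapses the sequence back to the precomposed form
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

This is exactly the normalization problem in miniature: the two spellings render identically but compare unequal until you normalize.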
Wouldn't 32 bits be enough for all possible valid combinations? I see e.g. that: "The largest corpus of modern Chinese words is as listed in the Chinese Hanyucidian (汉语辞典), with 370,000 words derived from 23,000 characters".
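For scale (plain arithmetic, nothing language-specific): 32 bits give about 4.3 billion values, while Unicode's current code space tops out at U+10FFFF:

```python
# 32 bits of addressable values vs. Unicode's assigned code space
print(2 ** 32)       # 4294967296 possible values in 32 bits
print(0x10FFFF + 1)  # 1114112 code points in Unicode's current range
```

So there would be plenty of headroom in 32 bits; the question is whether the combinatorics of composed characters stay under that.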
And how many combinations are there of stuff like Hangul? I see that's 11,172. Accents in languages like Russian, Hungarian, Greek should be even easier.
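The Hangul figure falls out of simple combinatorics: 19 leading consonants, 21 vowels, and 28 trailing-consonant choices (27 consonants plus "none"), which Unicode maps algorithmically onto the precomposed block starting at U+AC00. A quick check:

```python
# Modern Hangul syllables: 19 leads x 21 vowels x 28 trails (27 + "none")
leads, vowels, trails = 19, 21, 28
total = leads * vowels * trails
print(total)  # 11172

# They occupy a contiguous precomposed block, U+AC00 .. U+D7A3
first = 0xAC00
last = first + total - 1
print(hex(last))  # 0xd7a3
```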
Now, having each accented character as a separate code point might take some lookup tables -- but we already require tons of complicated lookup tables for string manipulation in UTF-8 implementations IIRC.
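On the "we already need tables" point: even basic string operations consult Unicode property tables under the hood. A few examples via Python's `unicodedata`:

```python
import unicodedata

# Case folding is table-driven: one character can map to two
print("\u00df".casefold())  # 'ss' -- German sharp s folds to two letters

# Per-code-point properties come from lookup tables too
print(unicodedata.category("\u00e9"))       # 'Ll' (lowercase letter)
print(unicodedata.decomposition("\u00e9"))  # '0065 0301' (e + combining acute)
```

So the cost of normalization tables comes on top of tables every correct Unicode implementation already carries for casing, categories, and decomposition.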