You don't need any fancy data structures. 95% of the performance goes into glyph rendering. And with Unicode the performance gain from monospace fonts goes out the window, as some Unicode characters are very wide, and on top of that they also take up many bytes. So one Unicode character can be up to 5 bytes long and take up the same canvas space as 3 characters. You also need to read ahead, as there are combining characters: for example, a smiley combined with the color brown becomes a brown smiley. I've blogged about implementing support for Unicode in an editor here: https://xn--zta-qla.com//en/blog/editor10.htm
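To make the read-ahead point concrete, here is a rough Python sketch of grouping code points into what the user sees as one character. It is not taken from the linked blog post. The names `is_extender` and `clusters` are made up for illustration, and the rules are a deliberate simplification (combining marks, skin-tone modifiers, variation selectors, ZWJ), not the full Unicode segmentation algorithm.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER

def is_extender(ch):
    """True if ch attaches to the previous character (simplified rules)."""
    cp = ord(ch)
    return (
        unicodedata.combining(ch) != 0   # combining marks such as U+0301
        or 0x1F3FB <= cp <= 0x1F3FF      # emoji skin-tone modifiers
        or 0xFE00 <= cp <= 0xFE0F        # variation selectors
        or ch == ZWJ
    )

def clusters(text):
    """Group code points into approximate user-perceived characters."""
    out = []
    for ch in text:
        # Join ch to the previous cluster if it is an extender, or if the
        # previous cluster ends in a ZWJ that is waiting for its partner.
        if out and (is_extender(ch) or out[-1].endswith(ZWJ)):
            out[-1] += ch
        else:
            out.append(ch)
    return out

# Waving hand + medium-dark skin tone, surrounded by plain letters.
wave = "\U0001F44B\U0001F3FE"
print(clusters("a" + wave + "b"))
print(len(wave.encode("utf-8")))  # bytes used by the single visible emoji
```

An editor would probably run something like this (or a proper segmentation library) before measuring column widths, since the byte count, the code point count, and the on-screen width can all differ.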
Some "single character" emoji easily exceed 5 bytes in all encodings. You may think ZWJ sequences are cheating, but emoji isn't the only language encoded in Unicode with complex ZWJ sequences.
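As a quick check of that claim, this small Python sketch measures one common ZWJ sequence (the man, woman, girl family emoji) in the three usual encodings. The helper name `encoded_sizes` is just for illustration, and the little-endian codecs are used only to avoid counting a byte order mark.

```python
# Family emoji: MAN + ZWJ + WOMAN + ZWJ + GIRL, rendered as one glyph
# by fonts that support the sequence.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"

def encoded_sizes(text):
    """Return the byte length of text in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(text.encode(enc)) for enc in encodings}

sizes = encoded_sizes(family)
print(len(family))  # number of code points: 5
print(sizes)        # {'utf-8': 18, 'utf-16-le': 16, 'utf-32-le': 20}
```

So this one visible character is well past 5 bytes in every encoding, even though no single code point in it needs more than 4.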
My question is about the phrase "up to 5". What in Unicode is up to 5? Code points are up to 4 bytes in all the encodings I know. ZWJ sequences can be arbitrarily long. What is "up to 5"?
Yes, UTF-8 was up to 6 bytes per character early on. Some broken implementations like MySQL's limit it to up to 3 bytes per character. The actual number is 4.
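For anyone who wants to verify the 4-byte figure, a minimal Python sketch that encodes the code points at each UTF-8 length boundary, up to U+10FFFF, the highest code point Unicode defines. The helper `utf8_len` is a name made up here.

```python
def utf8_len(code_point):
    """Number of bytes UTF-8 uses for a single code point."""
    return len(chr(code_point).encode("utf-8"))

# Last code point of each length class and the first of the next one.
boundaries = [0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF]
lengths = {hex(cp): utf8_len(cp) for cp in boundaries}
print(lengths)
# {'0x7f': 1, '0x80': 2, '0x7ff': 2, '0x800': 3, '0xffff': 3,
#  '0x10000': 4, '0x10ffff': 4}
```

Anything above U+FFFF needs the fourth byte, which is why a 3-byte limit cannot store most emoji.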