Show HN: Fixing ChatGPT's losing-focus bug

(chatgpt-ui-fix.vercel.app)

2 points | oceanparkway | 2y ago | 2 comments

When you select a paragraph in ChatGPT's UI while it's streaming in, your selection is lost every time a new character arrives, so if you try to select and copy something mid-stream, it's a moving target. I wrote up a simple, quick fix.

I imagine landing the actual fix would be more difficult for them, given that they embed other elements alongside text, e.g. code blocks, which is where I first noticed this issue (trying to select a short snippet of code from a long block as it streams in).

2 comments

hidelooktropic 2y ago

I love this. I was trying to recreate this behavior (it used to work on ChatGPT) using pure HTML and JS. I'd love to know what exactly React is doing that allows the text to stay selectable in this way. Does it append to a text node? What allows it to work if the text is streaming a few layers down, like an li inside a ul inside an li inside an ol?

oceanparkway (OP) 2y ago

The DOM allows an element to have multiple text nodes as children.

When you update the React tree with a different or longer paragraph node, it's actually removing the old node and replacing it with a new one, and the selection state is blown away. For non-text elements this is solved by a very simple diffing algorithm: as long as it's not a list rendered with .map (which is why .map triggers a linter rule warning you to use keys), an element with the same tag (<a ...> is the same as <a ...> but different from <img .../>) gets modified in place by DOM nudging, as opposed to being unmounted from the DOM and replaced.

Text nodes work differently in the DOM. If you open up devtools and edit a paragraph while it's selected in the browser, you'll see the same effect. Try doing the manipulation with JS and you get the same thing (document.querySelector(a_selector).innerText += 'sdfsdf').
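
To make the contrast concrete, here's a minimal sketch, not the code from the linked fix. The function names replaceText and appendText are made up for illustration; parent is assumed to be an element that only contains text, and doc is assumed to be the page's document (passed in so the snippet doesn't depend on a global).

```javascript
// Roughly what `el.innerText += chunk` does to a text-only element:
// the old text node is thrown away and a brand-new one takes its place,
// so a selection anchored in the old node has nothing left to point at.
function replaceText(parent, doc, chunk) {
  const combined = parent.textContent + chunk;
  while (parent.firstChild) {
    parent.removeChild(parent.firstChild);
  }
  parent.appendChild(doc.createTextNode(combined));
}

// Append-only alternative: existing text nodes are never touched,
// a new sibling text node is added after them.
function appendText(parent, doc, chunk) {
  return parent.appendChild(doc.createTextNode(chunk));
}
```

In a page you'd call something like appendText(document.querySelector(a_selector), document, 'sdfsdf') and compare it against the innerText version while holding a selection.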

Using Fragments instructs React to append multiple text nodes to the same parent, which means we're simply appending nodes instead of blowing them away, which solves the selection state issue.
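
For anyone trying this without React, here's a rough model of the effect in plain JS. It is not React's reconciler and not the code from the linked fix, only a sketch of "append, never replace". renderChunks and the state object are hypothetical names; chunks is assumed to be the full, append-only list of streamed pieces so far, and parent/doc are the target element and the page's document.

```javascript
// Append-only render: `chunks` holds every streamed piece so far and
// `state.rendered` counts how many of them already have a text node.
// Pieces that were rendered earlier are left alone; only the new tail
// gets fresh text nodes appended to the same parent.
function renderChunks(parent, doc, state, chunks) {
  for (let i = state.rendered; i < chunks.length; i++) {
    parent.appendChild(doc.createTextNode(chunks[i]));
  }
  state.rendered = Math.max(state.rendered, chunks.length);
  return parent;
}
```

Usage would look like: const state = { rendered: 0 }; then call renderChunks(el, document, state, chunks) each time the stream delivers another piece. Calling it twice with the same list is a no-op, which is the property that keeps earlier nodes (and a selection inside them) intact.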

All made possible by the fact that browsers just cleanly reconstitute the text fragments back together into display text.

Unfortunately this approach does seem to break Apple's built-in screen reader, but then I tried the screen reader on ChatGPT itself and saw that it was already failing completely. It surprises me that OpenAI hasn't addressed this accessibility issue. I guess there may be a whole extensive implementation issue with streaming text and screen readers in general.
