Rendered it in the right pane, instead of inline. Dark theme. 2% of Daily limit.
edit: Claude just confirmed the initial version has a bug and 104-117 are not visible
It responds with the statistically most probable text based on its training data, which happens to be different with the errors vs without. I suspect high-fidelity diagramming requires a different attention architecture from the common ones used in sentence-optimized models.
Gemini, ChatGPT or Grok would find this a lot easier as they could gen an image inline, although IP restrictions might bite you. Even Grok wants to lecture on IP these days, but at least it's fairly trivial to jailbreak.
It provides both syntax guides and syntax/semantic analysis as MCP Tools, so you can have an agent iteratively refine diagrams with good context for patterns like multi-line text and comments (LLMs love end-of-line comments, but Mermaid.js often doesn’t).
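To illustrate the end-of-line comment pitfall mentioned above, here is a minimal sketch of a pre-render cleanup pass an agent could apply before handing a diagram to Mermaid.js. This is illustrative only, not part of any MCP tool: the function name and regex are my own, and real Mermaid versions vary in what they accept.

```python
import re

def strip_inline_comments(mermaid_src: str) -> str:
    """Move Mermaid '%%' comments onto their own line.

    LLMs like to emit end-of-line comments such as
    'A --> B  %% main path', which Mermaid.js often rejects;
    a '%%' comment on its own line is generally safe.
    """
    fixed = []
    for line in mermaid_src.splitlines():
        # Match '<code>  %% <comment>' but leave full-line comments alone.
        m = re.match(r"^(\s*)(.+?)\s+%%\s*(.*)$", line)
        if m and not line.lstrip().startswith("%%"):
            indent, code, comment = m.groups()
            fixed.append(f"{indent}%% {comment}")
            fixed.append(f"{indent}{code}")
        else:
            fixed.append(line)
    return "\n".join(fixed)
```

An agent can run something like this (or a real syntax check from the MCP server) between generation attempts instead of asking the model to get comment placement right on the first shot.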
What instance of ChatGPT are you doing that with? (Reasoning?)
I've noticed the same thing when creating an agentic loop, if the model outputs a syntax error, just automatically feed it back to the LLM and give it a second chance. It dramatically increases the success rate.
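The loop described above is a few lines of glue. A minimal sketch, with `ask_llm` and `validate` as hypothetical placeholders for whatever model client and syntax checker you actually use:

```python
def generate_with_retry(ask_llm, validate, prompt, max_tries=3):
    """Agentic loop: if the model's output fails validation,
    feed the error back and give it another chance.

    ask_llm(prompt) -> str and validate(output) -> error message
    or None are stand-ins, not a real API.
    """
    output = ask_llm(prompt)
    for _ in range(max_tries - 1):
        error = validate(output)
        if error is None:
            return output
        # Show the model its own output plus the exact error.
        output = ask_llm(
            f"{prompt}\n\nYour previous attempt:\n{output}\n"
            f"failed with this error:\n{error}\nPlease fix it."
        )
    return output
```

Even a single retry with the real error message in context tends to beat several blind regenerations, since the model gets to condition on its own mistake.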
Like a much prettier version of Mermaid.
Kudos, Anthropic. Geez, this is so nice.
Now I'm going to ask it to draw a diagram of a pelican riding a bicycle, why not?
Great for summarizing a multi-step process and quick to render with simple tools.
If there are humans involved, "I took this data and made a really fancy interactive chart" means that you put a lot more work into it, and you can probably somewhat assume that this means some more effort was also put into the accuracy of the data.
But with the LLM it's not really very much more work to get the fancy chart. So the thing that was a signifier of effort is now misleading us into trusting data that got no extra effort.
(Humans have been exploiting this tendency to trust fancy graphics forever, of course.)
There has always been a bias towards form over function.
P.S. Credit to the poster, she posted a correction note when someone caught the issue: https://www.linkedin.com/posts/mariamartin1728_correction-on...
Honestly, people make them up just as much or generate equally incorrect graphs.
It's about time our trust in random visualizations was destroyed, at least when the actual formulas and data behind them aren't exposed.
People find them quite easy to check - easier than the raw document. My angle with teams is: use these to check your processes. If the flow is wrong, it's either because the LLM has screwed up or because the policy is wrong or badly written. It's usually the latter. It's a good way to fix SOPs.
I'm finding more and more often the limiting factor isn't the LLM, it's my intuition. This goes a way towards helping with that.
https://www.reddit.com/r/dataisugly/comments/1mk5wdb/this_ch...
I mean, is it really that shocking that you can have an LLM generate structured data and shove that into a visualizer? The concern is whether it's reliable, which we know it isn't.
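The "structured data into a visualizer" pattern is simple enough to sketch, and the reliability concern is exactly why you validate before rendering. A minimal example, assuming a made-up `labels`/`values` JSON shape (the schema and function names are hypothetical, not any product's actual format):

```python
import json

# Fields we expect from the model, with their required types.
REQUIRED = {"labels": list, "values": list}

def parse_chart_data(llm_output: str) -> dict:
    """Validate LLM-emitted chart JSON before it reaches a visualizer.

    Raises on malformed JSON, missing/mistyped fields, or
    mismatched series lengths - all failure modes models do hit.
    """
    data = json.loads(llm_output)  # raises ValueError on invalid JSON
    for key, typ in REQUIRED.items():
        if not isinstance(data.get(key), typ):
            raise ValueError(f"missing or wrong-typed field: {key}")
    if len(data["labels"]) != len(data["values"]):
        raise ValueError("labels and values have different lengths")
    return data
```

Validation like this catches the structural failures, but of course it can't tell you whether the numbers themselves are made up - which is the real concern.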
Passive questions generate passive responses.
I usually use a lot of other tools for data analysis or write code with Claude code or another LLM to do data analysis and visualization.
article about the ChatGPT charts and graphs https://www.zdnet.com/article/how-to-use-chatgpt-to-make-cha...
It's pretty bad (for me). I have to use extremely prescriptive language to tell ChatGPT what to create. Even down to the colours in the chart, because otherwise it puts black font on black background (for example). Then I have to specifically tell it to put it in a canvas, and make it interactive, and make it executable in the canvas. Then if I'm lucky I have to hit a "preview" button in the top right and hope it works (it doesn't). I could write several paragraphs telling it to do something like what Claude just demo'd and it wouldn't come close. I'm trying Claude now for financial insights and it's effortless with beautiful UX.
For posterity, Gemini is pretty good with these interactive canvases. Not nearly as good as Claude, but FAR better than ChatGPT.
They write 100% of their code with Claude. Some of their engineers apparently burn over 100k worth of tokens per month.
It's not at all surprising they ship fast when the product is actually falling apart at the seams and they just vibe-code everything.
"If brute force doesn't work, you aren't using enough of it." - Isaac Arthur
But you can Sign in with Google.
If you signed up with your Apple ID in the iOS Claude app, then to access your account on the computer you have to open the Passwords app, copy your random relay email address, and paste it into the Claude website login.
Also, if you try to copy-paste a prompt from Notes etc. into the Claude chat, it gets added as an attachment, so you can't edit the prompt. If you use the paste-as-plain-text shortcut instead, it mangles newlines etc.
Why are they so dumb about such basic UX for so long?
Apple forces developers to offer Sign in with Apple on iOS devices if any other sign in service is used. Apple can't force them to do it on non-Apple platforms.
Isn't this basically Apple's fault? When you signed up, Apple provided a fake email address in lieu of your real one. This is great for privacy but means the service has the wrong email.
I'm sure they didn't want to provide an Apple sign in option at all, but it's required by App Store rules.
https://claude.ai/public/artifacts/1bded4db-c4c2-4089-aa36-5...
Honestly, I initially thought everyone already did this; amazingly, it seems they don't yet - neither teachers nor students. The artefact was created with care and love over a very long conversation, so this is not one-shot slop, rather cared-for slop :D. Besides, I don't think it's easy to get this right on the first try, and the model usually expounds on irrelevant details if not properly guided by a human hand.
Next up: exporting or sharing selections from the chat as a document or interactive page. If they allow share with non-subscribers, subscriptions could hockey stick -- particularly if the document/page included prompts necessary to replicate (or modify and adapt).
(Literally nobody needs an image of a cake when asking for a cake recipe)
https://petergpt.github.io/bullshit-benchmark/viewer/index.v...