I failed to recreate the 1996 Space Jam website with Claude

(j0nah.com)

552 points · thecr0w · 6mo ago · 460 comments

242 comments · 88 top-level

thuttinger6mo ago· 21 in thread

Claude/LLMs in general are still pretty bad at the intricate details of layouts and visual things. There are a lot of problems that are easy to get right for a junior web dev but impossible for an LLM. On the other hand, I was able to write a C program that added gamma color profile support to linux compositors that don't support it (in my case Hyprland) within a few minutes! A - for me - seemingly hard task, which would have taken me at least a day or more if I didn't let Claude write the code. With one prompt Claude generated C code that compiled on first try that:

- Read an .icc file from disk

- parsed the file and extracted the VCGT (video card gamma table)

- wrote the VCGT to the video card for a specified display via amdgpu driver APIs

The only thing I had to fix was the ICC parsing, where it would parse header strings in the wrong byte-order (they are big-endian).
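For readers curious what the byte-order fix looks like, here is a minimal Python sketch (not the C program from the comment, which isn't shown). It assumes the standard ICC layout of a 128-byte header followed by a big-endian tag table, and a table-type `vcgt` tag laid out as signature, reserved word, gamma type, then channel count, entry count and entry size. `find_tag` and `parse_vcgt_table` are hypothetical names, and writing the table to the display is left out entirely.

```python
import struct

ICC_HEADER_SIZE = 128  # fixed-size ICC profile header


def find_tag(profile: bytes, signature: bytes):
    """Return the raw bytes of the tag with the given 4-byte signature, or None."""
    # The tag table starts right after the header: a big-endian uint32 count,
    # then 12-byte entries of (signature, offset, size).
    (tag_count,) = struct.unpack_from(">I", profile, ICC_HEADER_SIZE)
    for i in range(tag_count):
        entry = ICC_HEADER_SIZE + 4 + 12 * i
        sig, offset, size = struct.unpack_from(">4sII", profile, entry)
        if sig == signature:
            return profile[offset:offset + size]
    return None


def parse_vcgt_table(tag: bytes):
    """Decode a table-type vcgt tag into one list of values per channel.

    Assumed layout: 'vcgt', 4 reserved bytes, uint32 gamma type (0 = table),
    then uint16 channels, uint16 entries per channel, uint16 bytes per entry.
    """
    sig, _reserved, gamma_type = struct.unpack_from(">4sII", tag, 0)
    if sig != b"vcgt" or gamma_type != 0:
        raise ValueError("not a table-type vcgt tag")
    channels, entries, entry_size = struct.unpack_from(">HHH", tag, 12)
    fmt = {1: "B", 2: "H"}[entry_size]  # only 8-bit and 16-bit entries handled
    values = struct.unpack_from(f">{channels * entries}{fmt}", tag, 18)
    return [list(values[c * entries:(c + 1) * entries]) for c in range(channels)]
```

The `>` prefix in every format string is what forces big-endian decoding; leaving it off falls back to the machine's native byte order, which is the kind of mistake described above.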

jacquesm6mo ago

Claude didn't write that code. Someone else did and Claude took that code without credit to the original author(s), adapted it to your use case and then presented it as its own creation to you and you accepted this. If a human did this we probably would have a word for them.

mlinsey6mo ago

Certainly if a human wrote code that solved this problem, and a second human copied and tweaked it slightly for their use case, we would have a word for them.

Would we use the same word if two different humans wrote code that solved two different problems, but one part of each problem was somewhat analogous to a different aspect of a third human's problem, and the third human took inspiration from those parts of both solutions to create code that solved a third problem?

What if it were ten different humans writing ten different-but-related pieces of code, and an eleventh human piecing them together? What if it were 1,000 different humans?

I think "plagiarism", "inspiration", and just "learning from" fall on some continuous spectrum. There are clear differences when you zoom out, but they are in degree, and it's hard to set a hard boundary. The key is just to make sure we have laws and norms that provide sufficient incentive for new ideas to continue to be created.

bsaul6mo ago

That's an interesting hypothesis: that LLMs are fundamentally unable to produce original code.

Do you have papers to back this up? That was also my reaction when I saw some really crazy accurate comments on some vibe-coded piece of code, but I couldn't prove it, and thinking about it now I think my intuition was wrong (i.e., LLMs do produce original complex code).

ekropotin6mo ago

> If a human did this we probably would have a word for them.

What do you mean? A programmer's work is literally combining existing patterns into solutions for problems.

Mtinie6mo ago

> If a human did this we probably would have a word for them.

I don’t think it’s fair to call someone who used Stack Overflow to find a similar answer with samples of code to copy to their project an asshole.

Aeolun6mo ago

Software engineer? You think I cite all the code I’ve ever seen before when I reproduce it? That I even remember where it comes from?

ineedasername6mo ago

>we probably would have a word for them

Student? Good learner? Pretty much what everyone does can be boiled down to reading lots of other code that’s been written and adapting it to a use case. Sure, to some extent models are regurgitating memorized information, but for many tasks they’re regurgitating a learned method of doing something and backfilling the specifics as needed— the memorization has been generalized.

raincole6mo ago

This is why ragebait is chosen as the word of 2025.

> took that code without credit to the original author(s), adapted it to your use case

Aka software engineering.

fooker6mo ago

> If a human did this we probably would have a word for them.

Humans do this all the time.

FanaHOVA6mo ago

Are you saying that every piece of code you have ever written contains a full source list of every piece of code you previously read to learn specific languages, patterns, etc?

Or are you saying that every piece of code you ever wrote was 100% original and not adapted from any previous codebase you ever worked in or any book / reference you ever read?

martin-t6mo ago

Programmers are willingly blind to this, at least until it's their code being stolen or they lose their job.

_LLMs are lossily compressed archives of stolen code_.

Trying to achieve AI through compression is nothing new.[0] The key innovation[1] is that the model[2] does not output only the first order input data but also the higher order patterns from the input data.

That is certainly one component of intelligence but we need to recognize that the tech companies didn't build AI, they built a compression algorithm which, combined with the stolen input text, can reproduce the input data and its patterns in an intelligent-looking way.

[0]: http://prize.hutter1.net/

[1]: Oh, god, this phrase is already triggering my generated-by-LLM senses.

[2]: Model of what? Of the stolen text. If 99.9999% of the work to achieve AI wasn't done by people whose work was stolen, they wouldn't be called models.

FeepingCreature6mo ago

This is not how LLMs work.

giancarlostoro6mo ago

You mean like copying and pasting code from Stack Overflow?

nvllsvm6mo ago

> Someone else did

Who?

kevinsync6mo ago

I've been struggling with this throughout the entire LLM-generated-code arc we're currently living -- I agree that it is wack in theory to take existing code and adapt it to your use-case without proper accreditation, but I've also been writing code since Pulp Fiction was in theaters and a lot of it is taking existing code and adapting it to my use-case, sometimes without a fully-documented paper trail.

Not to mention the moral vagaries of "if you use a library, is the complete articulation of your thing actually 100% your code?"

Is there a difference between loading and using a function from ImageMagick, and a standalone copycat function that mimics a function from ImageMagick?

What if you need it transliterated from one language to another?

Is it really that different than those 1200 page books from the 90's that walk you through implementing a 3D engine from scratch (or whatever the topic might be)? If you make a game on top of that book's engine, is your game truly yours?

If you learn an algorithm in some university class and then just write it again later, is that code yours? What if your code is 1-for-1 a copy of the code you were taught?

It gets very murky very quick!

Obviously I would encourage proper citation, but I also recognize the reality of this stuff -- what if you're fully rewriting something you learned decades ago and don't know who to cite? What if you have some code snippet from a website long forgotten that you saved and used? What if you use a library that also uses a library that you're not aware of because you didn't bother to check, and you either cite the wrapper lib or cite nothing at all?

I don't have some grand theory or wise thoughts about this shit, and I enjoy the anthropological studies trying to ascertain provenance / assign moral authority to remarkable edge cases, but end of the day I also find it exhausting to litigate the use of a tool that exploited the fact that your code got hoovered up by a giant robot because it was public, and might get regurgitated elsewhere.

To me, this is the unfortunate and unfair story of Gregory Coleman [0] -- drummer for The Winstons, who recorded "Amen, Brother" in 1969 (which gave us the most-sampled drum break in the world, spawned multiple genres of music, and changed human history) -- the man never made a dime from it, never even knew, and died completely destitute, despite his monumental contribution to culture. It's hard to reconcile the unjustness of it all, yet not that hard to appreciate the countless positive things that came out of it.

I don't know. I guess at the end of the day, does the end justify the means? Feels pretty subjective!

[0] https://en.wikipedia.org/wiki/Amen_break

idiotsecant6mo ago

Yes, the word for that is software developer.

littlecranky676mo ago

> Claude/LLMs in general are still pretty bad at the intricate details of layouts and visual things

Because the rendered output (pixels, not HTML/CSS) is not fed in as training data. You will find tons of UI snippets and questions, but they rarely include screenshots. And if they do, they are not scraped.

Wowfunhappy6mo ago

Interesting thought. I wonder if Anthropic et al could include some sort of render-html-to-screenshot as part of the training routine, such that the rendered output would get included as training data.

ubercow136mo ago

Why wouldn't they be?

chongli6mo ago

Why is this something a Wayland compositor (a glorified window manager) needs to worry about? Apple figured this out back in the 1990s with ColorSync and they did it once for the Mac OS and any application that wanted colour management could use the ColorSync APIs.

hedgehog6mo ago

Color management infrastructure is intricate. To grossly simplify: somehow you need to connect together the profile and LUT for each display, upload the LUTs to the display controller, and provide appropriate profile data for each window to their respective processes. Then, during compositing, you convert any buffers that don't already match the output (unmanaged applications will probably be treated as sRGB; color-managed graphics apps will opt out of conversion and do whatever is correct for their purpose).

wilsmex6mo ago· 18 in thread

Well, this was interesting. As someone who was actually building similar websites in the late 90's, I threw this into Opus 4.5. Note that the original author is wrong about the original site, however:

"The Space Jam website is simple: a single HTML page, absolute positioning for every element, and a tiling starfield GIF background.".

This is not true, the site is built using tables, not positioning at all, CSS wasn't a thing back then...

Here was its one-shot attempt at building the same type of layout (table based) with a screenshot and assets as input: https://i.imgur.com/fhdOLwP.png

thecr0wOP6mo ago

Thanks, my friend. I added a strike through of the error, a correction, and credited you.

I'm keeping it in for now because people have made some good jokes about the mistake in the comments and I want to keep that context.

wilsmex6mo ago

You bet, Fun post and writeup, took me a bit down memory lane. I built several sites with nested table-based layouts, 1x1 transparent gif files set to strange widths to get layouts to force certain sizes. Little tricks with repeating gradient backgrounds for fancy 'beveled' effects. Under construction GIFs, page counters, GUESTBOOKS!, Photoshop drop-shadows on everything. All the things, fond-times. One or two I haven't touched in 20 years, but keep online for my own time-capsule memory :)

govping6mo ago

The failure mode here (Claude trying to satisfy rather than saying 'this is impossible with the constraints') shows up everywhere. We use it for security research - it'll keep trying to find exploits even when none exist rather than admit defeat. The key is building external validation (does the POC actually work?) rather than trusting the LLM's confidence.

yard20106mo ago

Ah! I see the problem now! AI can't see shit, it's a statistical model not some form of human. It uses words, so like humans, it can say every shit it wants and it's true until you find out.

The number one rule of the internet is don't believe anything you read. This rule was lost in history unfortunately.

manbash6mo ago

Ah, those days, where you would slice your designs and export them to tables.

chrisweekly6mo ago

I remember building really complex layouts w nested tables, and learning the hard way that going beyond 6 levels of nesting caused serious rendering performance problems in Netscape.

reconnecting6mo ago

Why not! We did this in 2024 for our website (1) to have zero CSS.

Still works, only Claude cannot understand what those tables mean.

1. https://www.tirreno.com

thecr0wOP6mo ago

I learned recently that this is still how a lot of email HTML gets generated.

ralferoo6mo ago

It was relatively OK to deal with when the pages were created by coders themselves.

But then DreamWeaver came out, where you basically drew the entire page in 2D and it spat out some HTML tables that stitched it all back together again, and the freedom it gave our artists in drawing in 2D and not worrying about the output meant they went completely overboard with it and you'd get lots of tiny little slices everywhere.

Definitely glad those days are well behind us now!

mmanfrin6mo ago

I yearn for those days. CSS was a mistake. Tables and DHTML is all one needs.

gregoryl6mo ago

Gosh, there was a website, where you submit a PSD + payment, and they spit out a sliced design. Initially tables, later, CSS. Life saver.

jweir6mo ago

And use a single px invisible gif to move things around.

But was Space Jam using multiple images or just one large image with an image map for links?

bigbuppo6mo ago

Oh man, Photoshop still has the slice feature and it makes the most horrendous table-based layout possible. It's beautiful.

alt2276mo ago

Off topic, but you have used imgur as your image hosting site, which cannot be viewed in the UK. If you want all readers to be able to see and understand your points, please could you use a more universally reachable host?

M4v3R6mo ago

Please reach out to your nearest government official to tell them what you think about Imgur not working in your country.

master-lincoln6mo ago

Which one could be used so everybody can read it? So many different autocratic systems to think about...

I think it's easier if you adapt and get a VPN or a new government.

Aeolos6mo ago

Why is that?

johnebgd6mo ago

I cut my teeth developing for the web using GoLive and will never forget how they used tables to layout a page from that tool…

9999000009996mo ago· 13 in thread

Space Jam website design as an LLM benchmark.

This article is a bit negative. Claude gets close; it just can't get the order right, which is something OP can manually fix.

I prefer GitHub Copilot because it's cheaper and integrates with GitHub directly. I'll have times where it'll get it right, and times when I have to try 3 or 4 times.

GeoAtreides6mo ago

>which is something OP can manually fix

what if the LLM gets something wrong that the operator (a junior dev perhaps) doesn't even know it's wrong? that's the main issue: if it fails here, it will fail with other things, in not such obvious ways.

alickz6mo ago

>what if the LLM gets something wrong that the operator (a junior dev perhaps) doesn't even know it's wrong?

the same thing that always happens if a dev gets something wrong without even knowing it's wrong - either code review/QA catches it, or the user does, and a ticket is created

>if it fails here, it will fail with other things, in not such obvious ways.

is infallibility a realistic expectation of a software tool or its operator?

godelski6mo ago

I think that's the main problem with them. It is hard to figure out when they're wrong.

As the post shows, you can't trust them when they think they solved something but you also can't trust them when they think they haven't[0]. The things are optimized for human preference, which ultimately results in them being optimized to hide mistakes. After all, we can't penalize mistakes in training when we don't know the mistakes are mistakes. The de facto bias is that we prefer mistakes that we don't know are mistakes over mistakes that we do[1].

Personally I think a well designed tool makes errors obvious. As a tool user that's what I want and makes tool use effective. But LLMs flip this on the head, making errors difficult to detect. Which is incredibly problematic.

[0] I frequently see this in a thing it thinks is a problem but actually isn't, which makes steering more difficult.

[1] Yes, conceptually unknown unknowns are worse. But you can't measure unknown unknowns, they are indistinguishable from knowns. So you always optimize deception (along with other things) when you don't have clear objective truths (most situations).

smallnix6mo ago

That's not the point of the article. It's about Claude/LLMs being overconfident about producing a pixel-perfect recreation.

jacquesm6mo ago

All AIs are overconfident. It's impressive what they can do, but it is at the same time extremely unimpressive what they can't do while passing it off as the best thing since sliced bread. 'Perfect! Now I see the problem.' 'Thank you for correcting that, here is a perfect recreation of problem 'x' that will work with your hardware.' (never mind the 10 glaring mistakes).

I've tried these tools a number of times and spent a good bit of effort on learning to maximize the return. By the time you know what prompt to write you've solved the problem yourself.

bigstrat20036mo ago

> it just can't get the order right which is something OP can manually fix.

If the tool needs you to check up on it and fix its work, it's a bad tool.

markbao6mo ago

“Bad” seems extreme. The only way to pass the litmus test you’ve described is for a tool to be 100% perfect, so then the graph looks like 99.99% “bad tool” until it reaches 100% perfection.

It’s not that binary imo. It can still be extremely useful and save a ton of time if it does 90% of the work and you fix the last 10%. Hardly a bad tool.

It’s only a bad tool if you spent more time fixing the results than building it yourself, which sometimes used to be the case for LLMs but is happening less and less as they get more capable.

godelski6mo ago

I wouldn't go that far, but I do believe good tool design tries to make its failure modes obvious. I like to think of it similar to encryption: hard to do, easy to verify.

All tools have failure modes and truthfully you always have to check the tool's work (which is your work). But being a master craftsman is knowing all the nuances behind your tools, where they work, and more importantly where they don't work.

That said, I think that also highlights the issue with LLMs and most AI. Their failure modes are inconsistent and difficult to verify. Even with agents and unit tests you still have to verify and it isn't easy. Most software bugs come from subtle things, which often compound. And those two things are the greatest weaknesses of LLMs: nuance and compounding effects.

So I still think they aren't great tools, but I do think they can be useful. But that also doesn't mean it isn't common for people to use them well outside the bounds of where they are generally useful. It'll be fine a lot of times, but the problem is that it is like an alcohol fire[0]; you don't know what's on fire because it is invisible. Which, after all, isn't that the hardest part of programming? Figuring out where the fire is?

[0] https://www.youtube.com/watch?v=5zpLOn-KJSE

mrweasel6mo ago

That's my thinking. If I need to check up on the work, then I'm equally capable of writing the code myself. It might go faster with an LLM assisting me, and that feels perfectly fine. My issue is when people use the AI tools to generate something far beyond their own capabilities. In those cases, who checks the result?

wvenable6mo ago

Perfection is the enemy of good.

thecr0wOP6mo ago

ya, this is true. Another commenter also pointed out that my intention was to one-shot. I didn't really go too deeply into trying multiple iterations.

This is also fairly contrived, you know? It's not a realistic limitation to rebuild HTML from a screenshot because of course if I have the website loaded I can just download the HTML.

swatcoder6mo ago

> rebuild HTML from a screenshot

???

This is precisely the workflow when a traditional graphic designer mocks up a web/app design, which still happens all the time.

They sketch a design in something like Photoshop or Illustrator, because they're fluent in these tools and many have been using them for decades, and somebody else is tasked with figuring out how to slice and encode that design in the target interactive tech (HTML+CSS, SwiftUI, Qt, etc).

Large companies, design agencies, and consultancies with tech-first design teams have a different workflow, because they intentionally staff graphic designers with a tighter specialization/preparedness, but that's a much smaller share of the web and software development space than you may think.

There's nothing contrived at all about this test and it's a really great demonstration of how tools like Claude don't take naturally to this important task yet.

Retric6mo ago

It’s not unrealistic to want to revert to an early version of something you only have a screenshot of.

sigseg1v6mo ago· 9 in thread

Curious if you've tested something such as:

- "First, calculate the orbital radius. To do this accurately, measure the average diameter of each planet, p, and the average distance from the center of the image to the outer edge of the planets, x, and calculate the orbital radius r = x - p"

- "Next, write a unit test script that we will run that reads the rendered page and confirms that each planet is on the orbital radius. If a planet is not, output the difference you must shift it by to make the test pass. Use this feedback until all planets are perfectly aligned."
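The second prompt could be backed by a check along these lines. This is a rough sketch, assuming the planet centers have already been measured from the rendered page (the hard part, and not shown here) and that the orbital radius is measured to each planet's center. `radial_corrections`, its parameters, and the coordinates in the usage note are illustrative only.

```python
import math


def radial_corrections(center, planets, radius, tolerance=1.0):
    """Return {name: shift} for every planet that is off the orbit.

    `planets` maps a name to the (x, y) of that planet's center. The shift is
    the signed distance to move the planet along the line from `center`
    through the planet: positive means outward, negative means inward.
    Planets within `tolerance` pixels of the orbit are omitted.
    """
    cx, cy = center
    corrections = {}
    for name, (x, y) in planets.items():
        distance = math.hypot(x - cx, y - cy)
        delta = radius - distance
        if abs(delta) > tolerance:
            corrections[name] = round(delta, 2)
    return corrections
```

An empty result means the test passes. For example, with `center=(0, 0)`, `radius=100`, and a planet at `(0, 90)`, the function reports that planet as needing a 10-pixel outward shift.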

Aurornis6mo ago

This is my experience with using LLMs for complex tasks: If you're lucky they'll figure it out from a simple description, but to get most things done the way you expect requires a lot of explicit direction, test creation, iteration, and tokens.

One of the keys to being productive with LLMs is learning how to recognize when it's going to take much more effort to babysit the LLM into getting the right result as opposed to simply doing the work yourself.

jazzyjackson6mo ago

Re: tokens, there is a point where you have to decide what's worth it to you. I'd been unimpressed with what I could get out of chat apps but when I wanted to do a rails app that would cost me thousands in developer time and some weeks between communication zoom meetings and iteration... I bit the bullet and kept topping up Claude API and spent about $500 on Opus over the course of a weekend, but the site is done and works great.

jacquesm6mo ago

It would not be the first time that an IT services provider makes more money the worse their products perform.

thecr0wOP6mo ago

Hm, I didn't try exactly this, but I probably should!

Wrt unit test script, let's take Claude out of the equation, how would you design the unit test? I kept running into either Claude or some library not being capable of consistently identifying planet vs non planet which was hindering Claude's ability to make decisions based on fine detail or "pixel coordinates" if that makes sense.

cfbradford6mo ago

Do you give Claude the screenshot as a file? If so I’d just ask it to write a tool to diff each asset to every possible location in the source image to find the most likely position of each asset. You don’t really need recognition if you can brute force the search. As a human this is roughly what I would do if you told me I needed to recreate something like that with pixel perfect precision.
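A brute-force search of that kind might look like the sketch below. It assumes the screenshot and the asset are already decoded into grayscale lists of rows, and it scores each candidate position with a plain sum of absolute differences. `find_asset` is a hypothetical name, and a real tool would likely lean on an image library for speed.

```python
def find_asset(image, asset):
    """Find where `asset` best matches inside `image`.

    Both arguments are grayscale images given as lists of rows of numbers.
    Returns ((row, col), score) for the top-left position with the lowest
    sum of absolute differences; a score of 0 is an exact match.
    """
    image_h, image_w = len(image), len(image[0])
    asset_h, asset_w = len(asset), len(asset[0])
    best_pos, best_score = None, None
    # Slide the asset over every position where it fits entirely.
    for r in range(image_h - asset_h + 1):
        for c in range(image_w - asset_w + 1):
            score = sum(
                abs(image[r + i][c + j] - asset[i][j])
                for i in range(asset_h)
                for j in range(asset_w)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score
```

Running this once per asset gives a table of coordinates that can be handed back to the model as text, which sidesteps asking it to read positions off the screenshot.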

yfontana6mo ago

If I were to do this (and I might give it a try, this is quite an interesting case), I would try to run a detection model on the image, to find bounding boxes for the planets and their associated text. Even a small model running on CPU should be able to do this relatively quickly.

bluedino6mo ago

Congratulations, we finally created 'plain English' programming languages. It only took 1/10th of the world's electricity and 40% of the semiconductor production.

turnsout6mo ago

Yes, this is a key step when working with an agent—if they're able to check their work, they can iterate pretty quickly. If you're in the loop, something is wrong.

That said, I love this project. haha

monsieurbanana6mo ago

I'm trying to understand why this comment got downvoted. My best guess is that "if you're in the loop, something is wrong" is interpreted as there should be no human involvement at all.

The loop here, imo, refers to the feedback loop. And it's true that ideally there should be no human involvement there. A tight feedback loop is as important for llms as it is for humans. The more automated you make it, the better.

manlymuppet6mo ago· 7 in thread

Couldn’t you just feed Claude all the raw, inspect element HTML from the website and have it “decrypt” that?

The entire website is fairly small so this seems feasible.

Usually there’s a big difference between a website’s final code and its source code because of post processing but that seems like a totally solvable Claude problem.

Sure LLMs aren’t great with images, but it’s not like the person who originally wrote the Space Jam website was meticulously messing around with positioning from a reference image to create a circular orbit — they just used the tools they had to create an acceptable result. Claude can do the same.

Perhaps the best method is to re-create, rather than replicate the design.

blks6mo ago

What do you mean? Raw html is the original website source code.

Modern web development completely poisoned the young generation.

manlymuppet6mo ago

I'm using source code like it's used when referring to source code vs executables. React doesn't simply spit out HTML, nor the JSX used to write said React code, it outputs a mixture of things that's the optimized HTML/CSS/JS version of the React you wrote. This is akin to source code and the optimized binaries we actually use.

Perhaps the wrong usage of "source code". I probably should've been more precise. Forgive my lack of vocabulary to describe the difference I was referring to.

personjerry6mo ago

If you have the raw HTML why would you need to do this at all?

manlymuppet6mo ago

I should've been more precise with my words.

What I meant is doing inspect element on the Space Jam website, and doing select all + copy.

manlymuppet6mo ago

https://pastebin.com/raw/F2jxZTeJ

The HTML I'm referring to, copied from the website.

Only about 7,000 characters or just 2,000 Claude tokens. This is feasible.

valleyer6mo ago

There's actually a language model developed by The Open Group that will transform that back into the original source code.

<https://pubs.opengroup.org/onlinepubs/9799919799/utilities/c...>

literalAardvark6mo ago

The space jam website used HTML tables for formatting and split images in each cell.

CSS didn't exist.

smoghat6mo ago· 6 in thread

Ok, so here is an interesting case where Claude was almost good enough, but not quite. But I’ve been amusing myself by taking abandoned Mac OS programs from 20 years ago that I find on GitHub and bringing them up to date to work on Apple silicon. For example, jpegview, which was a very fast and simple slideshow viewer. It took about three iterations with Claude code before I had it working. Then it was time to fix some problems, add some features like playing videos, a new layout, and so on. I may be the only person in the world left who wants this app, but well, that was fine for a day long project that cooked in a window with some prompts from me while I did other stuff. I’ll probably tackle scantailor advanced next to clean up some terrible book scans. Again, I have real things to do with my time, but each of these mini projects just requires me to have a browser window open to a Claude code instance while I work on more attention demanding tasks.

skrebbel6mo ago

> Ok, so here is an interesting case where Claude was almost good enough, but not quite.

You say that as if that’s uncommon.

jonplackett6mo ago

This should be the strap line for all AI (so far)

egeozcan6mo ago

Side note: As a person who has been using a Mac since March, I found Phoenix Slides really good.

smoghat6mo ago

It is! I was really just curious if I could update this old codebase without getting my hands dirty.

mabedan6mo ago

Interesting. I switched to the Mac in 2005, and what I missed the most was the fact that in windows you could double click an image and then tap the left and right keys to browse other photos in the same folder. I learned objective c and made an app for it back then, but never published. I guess the jpegview fulfilled a similar purpose.

pwython6mo ago

I switched to Mac in 2008. I forget if the feature existed back then, but today on macOS if you press spacebar on an image in Finder to preview, you can use the arrow keys to browse other photos.

sqircles6mo ago· 5 in thread

> The Space Jam website is simple: a single HTML page, absolute positioning for every element...

Absolute positioning wasn't available until CSS2 in 1998. This is just a table with crafty use of align, valign, colspan, and rowspan.

thecr0wOP6mo ago

Thanks, my friend. I added a strike through of the error, a correction, and credited you.

I'm keeping it in for now because people have made some good jokes about the mistake in the comments and I want to keep that context.

sqircles6mo ago

I thought for sure I was going to see an image map when I looked at the source. Pleasant surprise!

DocTomoe6mo ago

Which would also render differently on every machine, based on browser settings, screen sizes, and available fonts.

Like the web was meant to be. An interpreted hypertext format, not a pixel-perfect brochure for marketing execs.

masswerk6mo ago

Hum, table cells provide the max-width and images a min-width, heights are absolute (with table cells spilling over, as with CSS "overflow-y: visible"), aligns and maybe HSPACE and VSPACE attributes do the rest. As long as image heights exceed the effective line-height and there's no visible text, this should render pixel perfect on any browser then in use. In this case, there's also an absolute width set for the entire table, adding further constraints. Table layouts can be elastic, with constraints or without, but this one should be pretty stable.

(Fun fact, the most amazing layout foot-guns, then: Effective font sizes and line-heights are subject to platform and configuration (e.g., Win vs Mac); Netscape does paragraph spacing at 1.2em, IE at 1em (if this matters, prefer `<br>` over paragraphs); frame dimensions in Netscape are always calculated as integer percentages of window dimensions, even if you provide absolute dimensions in pixels, while IE does what it says on the tin (a rare example), so they will be the same only by chance and effective rounding errors. And, of course, screen gamma is different on Win and Mac, so your colors will always be messed up – aim for a happy medium.)

jeanlucas6mo ago

>Like the web was meant to be.

what?

Wowfunhappy6mo ago· 5 in thread

Claude is not very good at using screenshots. The model may technically be multi-modal, but its strength is clearly in reading text. I'm not surprised it failed here.

fnordpiglet6mo ago

Especially since it decomposes the image into a semantic vector space rather than the actual grid of pixels. Once the image is transformed into patch embeddings all sense of pixels is entirely destroyed. The author demonstrates a profound lack of understanding for how multimodal LLMs function that a simple query of one would elucidate immediately.

The right way to handle this is not to build it grids and whatnot, which all get blown away by the embedding encoding but to instruct it to build image processing tools of its own and to mandate their use in constructing the coordinates required and computing the eccentricity of the pattern etc in code and language space. Doing it this way you can even get it to write assertive tests comparing the original layout to the final among various image processing metrics. This would assuredly work better, take far less time, be more stable on iteration, and fits neatly into how a multimodal agentic programming tool actually functions.

mcbuilder6mo ago

Yeah, this is exactly what I was thinking. LLMs don't have precise geometrical reasoning from images. Having an intuition of how the models work is actually a defining skill in "prompt engineering".

thecr0wOP6mo ago

Great, thanks for that suggestion!

dcanelhas6mo ago

Even with text, parsing content in 2D seems to be a challenge for every LLM I have interacted with. Try getting a chatbot to make an ascii-art circle with a specific radius and you'll see what I mean.

Wowfunhappy6mo ago

I don't really consider ASCII art to be text. It requires a completely different type of reasoning. A blind person can understand text if it's read out loud. A blind person really can't understand ASCII art if it's read out loud.

ErrantX6mo ago· 4 in thread

I just feel this is a great example of someone falling into the common trap of treating an LLM like a human.

They are vastly less intelligent than a human and logical leaps that make sense to you make no sense to Claude. It has no concept of aesthetics or of course any vision.

All that said; it got pretty close even with those impediments! (It got worse because the writer tried to force it to act more like a human would)

I think a better approach would be to write a tool to compare screenshots, identify misplaced items and output that as a text finding/failure state. Claude will work much better because you're dodging the bits that are too interpretive (that humans rock at and LLMs don't).

sallveburrpi6mo ago

> vastly less intelligent than a human

I would rather phrase it as: they are a completely alien “intelligence” that can't really be compared to human intelligence.

oncallthrow6mo ago

Yes, currently they are both smarter and dumber than humans.

ErrantX6mo ago

I meant that frame very deliberately. Use of the word AI is misleading people that LLMs are intelligent.

They model what looks like intelligence but with very hard limits. The two advantages they have over human brains are perfect recall and data storage. They are also faster.

But the brain is vastly more intelligent:

- It can learn concepts (e.g. language) with an order of magnitude less information

- It responds in parallel to multiple formats of stimuli (e.g. sight/sound)

- LLMs lack the ability to generalise

- The brain interprets and understands what it experienced

That's just the tip of the iceberg. Don't get me wrong: I use AI, it is by far some of the most impressive tech we have built so far, and it has potential to advance society significantly.

But it is definitely, vastly, less intelligent than us.

naet6mo ago

The blog frequently refers to the LLM as "him" instead of "it" which somehow feels disturbing to me.

I love to anthropomorphize things like rocks or plants, but something about doing it to an AI that responds in human like language enters an uncanny valley or otherwise upsets me.

daemonologist6mo ago· 4 in thread

Interesting - these models are all trained to do pixel-level(ish) measurement now, for bounding boxes and such. I wonder if you could railroad it into being accurate with the right prompt.

Lerc6mo ago

What models are good at this? I have tried passing images to models and asking them for coordinates for specific features, then overlaid dots on those points and passed that image back to the model so it has a perception of how far out it was. It had a tendency to be consistently off by a fixed amount without getting closer.

I don't doubt that it is possible eventually, but I haven't had much luck.

Something that seemed to assist was drawing a multi coloured transparent chequerboard, if the AI knows the position of the grid colours it can pick out some relative information from the grid.

daemonologist6mo ago

I've found Qwen3-VL to be fairly accurate at detection (though it doesn't always catch every instance). Note that it gives answers as per-mille-ages, as if the image was 1000x1000 regardless of actual resolution or aspect ratio.
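
A minimal sketch of the conversion that remark implies, assuming the model returns boxes as (x1, y1, x2, y2) on a 0-1000 grid. The function name and the example numbers are hypothetical and are not part of any Qwen3-VL API:

```python
def permille_to_pixels(box, width, height):
    """Scale an (x1, y1, x2, y2) box from a 0-1000 grid to real pixel coordinates."""
    x1, y1, x2, y2 = box
    return (
        round(x1 * width / 1000),
        round(y1 * height / 1000),
        round(x2 * width / 1000),
        round(y2 * height / 1000),
    )


# Hypothetical box on a 1920x1080 screenshot.
print(permille_to_pixels((100, 200, 500, 800), 1920, 1080))
```

Because each axis is scaled separately, the same per-mille box covers a different pixel area on images with different aspect ratios.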

I have also not had luck with any kind of iterative/guess-and-check approach. I assume the models are all trained to one-shot this kind of thing and struggle to generalize to what are effectively relative measurements.

ryoshu6mo ago

I can't do that either without opening up an image editing tool. Give the model a tool and goal with "vision". Should work better.

sdenton46mo ago

Feels like the "right" approach would be to have it write some code to measure how far off the elements are in the original vs recreated image, and then iterate using the numerical output of the program...
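
One way that numerical feedback could look, as a rough sketch. It assumes the element positions have already been measured somehow (for example by template matching) and are available as name-to-(x, y) mappings; the element names and coordinates below are made up for illustration:

```python
def offset_report(original, recreated):
    """Return one text line per element saying how far it sits from its target.

    Both arguments map an element name to its (x, y) top-left corner in pixels.
    """
    lines = []
    for name in sorted(original):
        ox, oy = original[name]
        if name not in recreated:
            lines.append(f"{name}: missing from recreation")
            continue
        rx, ry = recreated[name]
        dx, dy = rx - ox, ry - oy
        if dx == 0 and dy == 0:
            lines.append(f"{name}: ok")
        else:
            lines.append(f"{name}: off by dx={dx:+d}, dy={dy:+d}")
    return lines


# Hypothetical measurements for two elements.
original = {"jamcentral": (300, 40), "pressbox": (87, 110)}
recreated = {"jamcentral": (300, 40), "pressbox": (90, 104)}
print("\n".join(offset_report(original, recreated)))
```

The output is plain text, so it can be fed straight back to the model as the next prompt instead of another screenshot.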

jacobsenscott6mo ago· 4 in thread

> here's no other way to do it besides getting Claude to recreate it from a screenshot

And

> I'm an engineering manager

I can't tell if this is an intentional or unintentional satire of the current state of AI mandates from management.

master_crab6mo ago

Honest question: does he know about F5? Or was it intentional to use screenshots when source is available?

chilmers6mo ago

You really can’t tell? Perhaps the bar for AGI is lower than I thought.

thecr0wOP6mo ago

lololol

dmd6mo ago

i can’t tell if your comment is satire or not

isoprophlex6mo ago· 4 in thread

I have a very weird tangential nit to pick: gendering LLMs. I swear I'm not pushing any sort of gender agenda/discussion that can be had anytime anywhere else in the current age, but to me there is something quintessentially a-gendered about the output of a computer program.

Calling Claude (or GPT-5 or Gemini or my bash terminal for that matter) a "he" seems absurd to the point of hilarity.

In my mind, they've always firmly been "it"s.

DocTomoe6mo ago

Hm, Claude is a common male given name, especially in Europe. That plays into it. Also many people - including me - have personalised their AI chats, have given it names, even something resembling a personality (it's easy with prefix prompts). Why others do it, who knows, I do it because I find it a lot less frustrating when ChatGPT fucks up when it pretends to be a young adult female klutz.

retsibsi6mo ago

Does it happen much with non-Claude models? If someone genders ChatGPT, it makes me worry that they're taking the character it's playing too literally. But if someone genders Claude, that seems pretty normal, given that it has a man's name.

bojan6mo ago

This felt quirky to me as well, possibly because my native language is strictly gendered.

fragmede6mo ago

Sounds like the setup for a sexist comedian's routine. "Y'know, ChatGPT is totally a woman because she reminds me of my wife. She thinks it knows everything and is convinced she's right, when she's totally full of shit! And what's the deal with airline food?" Swap the gender depending on your target audience.

In other languages, chairs have a gender, along with other everyday items like scissors, and it doesn't especially make logical sense, although you can squint and tell a story as to why something is the gender that's been assigned. Thus making the gender of AI simply a matter of "that's just how things are".

micromacrofoot6mo ago· 2 in thread

I wouldn't call it entirely defeated, it got maybe 90% of the way there. Before LLMs you couldn't get 50% of the way there in an automated way.

> What he produces

I feel like personifying LLMs more than they currently are is a mistake people make (though humans always do this), they're not entities, they don't know anything. If you treat them too human you might eventually fool yourself a little too much.

thecr0wOP6mo ago

As a couple other comments pointed out, it's also not fair to judge Claude based on a one shot like this. I sort of assume these limitations will remain even if we went back and forth but to be fair, I didn't try that more than a few times in this investigation. Maybe on try three it totally nails it.

micromacrofoot6mo ago

Very true, I would also caution this with test projects with real humans in the hiring process. Comparing one-shots from actual people is unfair too, and often the most valid assessment comes with giving them feedback and seeing how they respond to it.

Aside from that point: if you are reading this and making people do a project as part of the hiring process, you should absolutely be paying them for their time (even a token amount).

stared6mo ago· 2 in thread

Just use Playwright Skill (https://github.com/lackeyjb/playwright-skill). It is a game changer. Otherwise it is Claude the Blind, as OP mentioned.

pcwelder6mo ago

But that's cheating because it then has the source code containing the table and its styles.

I can confirm that this is what it does.

And if you ask it to not use tables, it cleverly uses div with the same layout as the table instead.

drbacon6mo ago

I think the idea is to let Claude see iterations of the reproduction with playwright, but still only allow access to screenshots of the original.

mxfh6mo ago· 2 in thread

Everything feels wrong with that approach to me, starting with calling a perfectly time-appropriate website anachronistic.

Anachronistic would be something like creating an apparent flash website for a fictional 90s internet related movie.

robomc6mo ago

He's using it correctly, in its secondary sense of "belonging or appropriate to an earlier period, especially so as to seem conspicuously old-fashioned or outdated."

mxfh6mo ago

Still not quite convinced that the adjective should be applied to the website itself in a quite loose use of the word.

"Warner Bros anachronistically keeps this website online" would be a simple fix; here used to reference and to point out that maintaining an untouched 1996 promotional site at its original location is not typical for the lifecycle of a website, usually the publisher would rather redirect clicks to some current offer.

Otherwise there is no anachronism here with the website itself, just its location under the original URL and not in some archive only.

The website itself fulfilled its purpose for promoting the movie when it was released and simply continues to exist.

You wouldn’t call posters, magazines, or other artifacts from the ’90s anachronistic just for still existing. Being retrievable doesn’t make something outdated by itself.

“Anachronistic” would apply only if a new promotional site were created today to look like this—though that would more likely be called “retro.”

Or if the movie industry insisted on using CSS-free table layouts for all its promotional websites, similar to other norms or laws that feel anachronistic because they no longer match current needs.

Sadly the whole piece reads like it was written 80%+ by an LLM too, seriously why all the emojis? But apparently this is where content is heading in general.

anorwell6mo ago· 2 in thread

The article does not say at any point which model was used. This is the most basic important information when talking about the capabilities of a model, and probably belongs in the title.

thecr0wOP6mo ago

Whoops, I'm very dumb. It's Opus 4.1. I updated the blog post and credited you for the correction. Thank you!

dweekly6mo ago

That model does not exist. Do you mean Opus 4.5?

960design6mo ago· 2 in thread

Claude argued with me about the quadratic equation the other day. It vehemently felt a -c was required whereas a c was the correct answer. I pointed this out showing step by step and it finally agreed. I tried Grok to see if it could get it right. Nope, the exact same response as Claude, but Grok never backed down; even after the step by step explanation of the maths.

kristofferc6mo ago

Can you link to the conversation log?

pbhjpbhj6mo ago

In the generalised solution there is a '-c' term with coefficient '4a'...?
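
For reference, the standard general solution of $ax^2 + bx + c = 0$ (with $a \neq 0$), where $c$ appears only inside the discriminant, multiplied by $-4a$:

```latex
x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
```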

I'm not well at the moment, perhaps your ML model has flu?!!

pluc6mo ago· 2 in thread

I like how the author calls a script on the internet "him".

thecr0wOP6mo ago

lol

NooneAtAll36mo ago

better than using a plural for a single entity

manmal6mo ago· 2 in thread

I would put Claude into a loop and let it make screenshots itself, diffing them against the original screenshot, until it has found the right arrangement at the planets’ starting position (pixel perfect match).

fluidcruft6mo ago

I would try giving it a tool to work with projections and process axis by axis to see if it works better in 1D than 2D. I dunno how clever claude is about signal processing though. There's no noise here so... I mean really it's just template matching without rotation and without noise so... But I doubt claude can do or reason about basic image processing.
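
A small sketch of what the axis-by-axis idea could look like, under a strong simplifying assumption: the image is a grid of numbers with a single element on an all-zero background, so its row and column sums are not contaminated by anything else. All names here are illustrative:

```python
def column_profile(grid):
    """Sum each column, collapsing the image onto the x axis."""
    return [sum(col) for col in zip(*grid)]


def row_profile(grid):
    """Sum each row, collapsing the image onto the y axis."""
    return [sum(row) for row in grid]


def find_1d(signal, pattern):
    """Return the offset where pattern fits signal best (least squared error)."""
    best_offset, best_cost = 0, None
    for offset in range(len(signal) - len(pattern) + 1):
        cost = sum((signal[offset + i] - p) ** 2 for i, p in enumerate(pattern))
        if best_cost is None or cost < best_cost:
            best_offset, best_cost = offset, cost
    return best_offset


def locate_by_projection(image, template):
    """Find the template's top-left (x, y) using two independent 1D searches."""
    x = find_1d(column_profile(image), column_profile(template))
    y = find_1d(row_profile(image), row_profile(template))
    return x, y
```

On a real page with several images the profiles overlap, so this would likely need per-element masking or a full 2D match as a fallback.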

epgui6mo ago

At that point you may as well just do the work yourself.

fortyseven6mo ago· 2 in thread

Look at that stupid dog. It's reading a book, but it's really trashy YA. It's not even Shakespeare. Dogs are stupid.

satisfice6mo ago

But the dog's owner keeps saying that it ONLY reads Shakespeare. The dog's owner wants millions of dollars for the dog on that basis.

I'm not mad at the dogs, I'm mad at the stupid investors and the lying owners.

thecr0wOP6mo ago

lol, that is fair criticism

stonecharioteer6mo ago· 2 in thread

I'm curious. Did you ask it to use tables and no CSS?

In 1996, we had only CSS1. Ask it to use tables to do this, perhaps.

lagniappe6mo ago

We actually had a myriad of competing specs. CSS proper wasn't released until December 1996.

thecr0wOP6mo ago

Yes yes great question!

I tried your suggestion and also tried giving it various more general versions of the limitations presented by earlier generations.

Claude's instinct initially was actually to limit itself to less modern web standards.

Unfortunately, nothing got those planets to be in the right place.

handedness6mo ago· 2 in thread

A site in '96 would have been built largely with tables, not CSS. CSS didn't become a thing until a couple of years later.

I know this because I'm still salty about the transition. For all of CSS's advantages, we lost something when we largely moved away from tables.

chrisco2556mo ago

We lost something for a time but flex box and CSS grid (which came more recently) are objectively better than tables.

yesitcan6mo ago

You can still use tables.

vrighter6mo ago· 2 in thread

"there is no other way to preserve it"

Bullshit. Right click -> view source

Or just press ctrl+s and the browser will also gather all the assets into a folder for you.

The arrogance of thinking that the only way you know how is the only way....

You literally forgot the save feature all browsers have just because you set out to "solve" this using "ai"

OkGoDoIt6mo ago

The statement is almost certainly made in jest, since it is obviously untrue. Sometimes adding silly artificial constraints can be a fun way to spark creativity.

Mashimo6mo ago

Please read the sentence right after the one you quoted. You removed the context to shit on him :(

throwaway3141556mo ago· 2 in thread

Somehow I suspect Claude Code (in an interactive session with trial, error, probing, critiquing, perusing, and all the other benefits you get) would do better. This example seems to assume Claude can do things in "one shot" (even the later attempts all seem to conceal information like it's a homework assignment).

That's not how to successfully use LLM's for coding in my experience. It is however perhaps a good demonstration of Claude's poor spatial reasoning skills. Another good demonstration of this is the twitch.tv/ClaudePlaysPokemon where Claude has been failing to beat pokemon for months now.

CharlesW6mo ago

Using https://github.com/anthropics/claude-code/tree/main/plugins/... with style-supporting instructions and context would've improved the outcome as well.

thecr0wOP6mo ago

Not a homework assignment, and no deliberate attempt to conceal information, just very long and repetitive logs. A lot of the same "insights" so I just didn't provide them here.

> That's not how to successfully use LLM's for coding in my experience.

Yeah agree. I think I was just a little surprised it couldn't one-shot given the simplicity.

docheinestages6mo ago· 2 in thread

> Note: please help, because I'd like to preserve this website forever and there's no other way to do it besides getting Claude to recreate it from a screenshot.

Why not use wget to mirror the website? Unless you're being sarcastic.

$ wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.org

Source: https://superuser.com/questions/970323/using-wget-to-copy-we...

thecr0wOP6mo ago

The stuff about not being able to download it is a bit of a joke and I don't think the tone landed with everybody haha. This was just an experiment to see if Claude could recreate a simple website from a screenshot, of course to your point you could download it if you wanted.

malfist6mo ago

Because that wasn't the goal of this exercise

yosito6mo ago· 1 in thread

This has been my experience with almost everything I've tried to create with generative AI, from apps and websites, to photos and videos, to text and even simple sentences. At first glance, it looks impressive, but as soon as you look closer, you start to notice that everything is actually just sloppy copy.

That being said, sloppy copy can make doing actual work a lot faster if you treat it with the right amount of skepticism and hand-holding.

Its first attempt at the Space Jam site was close enough that it probably could have been manually fixed by an experienced developer in less time than it takes to write the next prompt.

jama2116mo ago

but my experience has also been that with every model they require less hand holding and the code is less sloppy. If I’m careful with my prompts, gpt codex 5.1 recently has been making a lot of typescript for me that is basically production ready in a way it couldn’t even 2 months ago

charcircuit6mo ago· 1 in thread

>I'd like to preserve this website forever and there's no other way to do it besides getting Claude to recreate it from a screenshot.

There are other ways, such as downloading an archive and then preserving the file in one or more cloud storages.

https://archive.is/download/cXI46.zip

a21286mo ago

The sentence immediately after that would imply sarcasm

> Note: please help, because I'd like to preserve this website forever and there's no other way to do it besides getting Claude to recreate it from a screenshot. Believe me, I'm an engineering manager with a computer science degree. Please please please help (sad emoji)

jonas216mo ago· 1 in thread

There was a response to this post on the front page earlier this morning that was able to get Claude to succeed simply by giving it access to Playwright so it could see what it was doing and telling it in the prompt that it needed to be pixel perfect:

https://news.ycombinator.com/item?id=46193412

As of right now, it seems to have been flagged into oblivion by the anti-AI crowd. I found both posts to be interesting, and it's unfortunate that one of them is missing from the conversation.

thecr0wOP6mo ago

yeah that sucks. I also linked it to the top of my post so folks can take a look.

johnfn6mo ago· 1 in thread

Context is king. The problem is that you are the one currently telling Claude how close it is and what to do next. But if you give it the tools to do that itself, it will make a world of difference.

Give Claude a way to iteratively poke at what it created (such as a playwright harness), and screenshot of what you want, and maybe a way to take a screenshot in Playwright and I think you will get much closer. You might even be able to one shot it.

I’ve always wondered what would happen if I gave it a screenshot and told it to iterate until the Playwright screenshot matched the mock screenshot, pixel perfect. I imagine it would go nuts, but after a few hours I think it would likely get it. (Either that or minor font discrepancies and rounding errors would cause it to give up…)

alexandre_m6mo ago

The key is always feedback loop. If you give the AI the ability to verify itself, then it's able to iterate faster. Sure, it may take many iterations, but at least the iteration spans will be shorter than waiting for a human to validate each time.

I'd be curious to see how Antigravity compares for the same task with its automatic browser agentic validation logic.

pfix6mo ago· 1 in thread

I checked the source of the original (like maybe many of you) to check how they actually did it and it was... simpler than expected. I drilled myself so hard to forget tables as layout... And here it is. So simple it's a marvel.

COAGULOPATH6mo ago

And they do hacky things like space elements vertically using <br> tags.

buchwald6mo ago· 1 in thread

Claude is surprisingly bad at visual understanding. I did a similar thing to OP where I wanted Claude to visually iterate on Storybook components. I found outsourcing the visual check to Playwright in vision mode (as opposed to using the default a11y tree) and Codex for understanding worked best. But overall the idea of a visual inspection loop went nowhere. I blogged about it here: https://solbach.xyz/ai-agent-accessibility-browser-use/

MagMueller6mo ago

Interesting read. Agree that GUI is super hard for agents. Did you see "skills" from browser-use? We directly interact with network requests now.

torginus6mo ago· 1 in thread

Not sure how good Claude is nowadays, but I remember using Claude 3.5 to do some fiction writing and for a while I thought it was amazing at coming up with plots, setting ideas, writing witty dialogue - then after a short while I noticed it kept recycling the same ideas, phrases etc, quickly becoming derivative, and having 'tells', similar to the group of 3 quirk, with some otherwise decent writing patterns showing up with great frequency.

I've heard the same thing about it doing frontends - it produces gorgeous websites but it has similar 'tells', it does CSS and certain features the same way, and if you have a very concrete idea of what you want out of it, you'll end up fighting an uphill battle with it constantly trying to do things its own way.

Which is part of the 'LLM illusion' - I guess. To an unskilled individual, or when starting from scratch, it seems great, but the more complex the project gets, the harder it becomes to have it contribute meaningfully, leading to an ever-mounting frustration, and eventually me just giving up and doing it by hand.

tyre6mo ago

My boy loves a neon gradient.

To be fair, a lot of startup websites look very similar. And the number of Stripe blurples out there was a pre-LLM brand meme.

shortformblog6mo ago· 1 in thread

Claude can't properly count the number of characters in a sentence. It's asking a lot to assume it can get pixel perfect.

hu36mo ago

LLMs can, for example, write a script to calculate all permutations of the levenshtein distances between words in a sentence and deliver it before the average programmer even understands what this means. Trivially.

You're just holding it wrong if you're asking it to simply count the characters in a sentence.
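
For readers unsure what such a script involves, here is a short sketch. It reads "all permutations" as every unordered pair of words, which is an interpretation and not something the comment spells out; the function names are illustrative:

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute or keep
        prev = cur
    return prev[-1]


def pairwise_distances(sentence):
    """Map every unordered pair of words in the sentence to its edit distance."""
    words = sentence.split()
    return {(a, b): levenshtein(a, b) for a, b in combinations(words, 2)}


print(pairwise_distances("cat hat hats"))
```

The point stands either way: the character-level work happens in code, so it does not depend on how the model tokenizes text.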

liampulles6mo ago· 1 in thread

It seems to me that Claude's error here (which is not unique to it) is self-sycophancy. The model is too eager to convince itself it did a good job.

I'd be curious to hear from experienced agent users if there is some AGENTS.md stuff to make the LLM more clear speaking? I wonder if that would impact the quality of work.

aprilfoo6mo ago

> It seems to me that Claude's error here (which is not unique to it) is self-sycophancy. The model is too eager to convince itself it did a good job.

It seems this applies to the whole AI industry, not just LLMs.

p0w3n3d6mo ago· 1 in thread

LLM stands for large LANGUAGE model, so I guess you could succeed if you had a correct LANGUAGE. Maybe radial coordinates? Or turtle graphics? I myself tried to generate an SVG with twelve radial dots as in a clock in ChatGPT, and failed (a year ago). Now I think it would succeed, however still the question is does it succeed because people trained it to do so.

Also I have noticed that AI generates things close to what you want, and it sticks really hard to that "close" qualifier, not wanting to cross any borders to get too close, so I'd be happy with the effect you have shown, as it is what AI does
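
The twelve-dot clock face mentioned above is a good illustration of how much easier the task becomes once the geometry is computed rather than guessed. A minimal sketch, with the canvas size, radius and function name chosen arbitrarily:

```python
import math


def clock_dots_svg(size=200, radius=80, dot_radius=4):
    """Return an SVG string with twelve dots evenly spaced on a circle."""
    cx = cy = size / 2
    circles = []
    for hour in range(12):
        # 30 degrees per hour; subtract 90 so the first dot sits at 12 o'clock.
        angle = math.radians(hour * 30 - 90)
        x = cx + radius * math.cos(angle)
        y = cy + radius * math.sin(angle)
        circles.append(f'<circle cx="{x:.1f}" cy="{y:.1f}" r="{dot_radius}"/>')
    body = "".join(circles)
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{size}" height="{size}">{body}</svg>')


print(clock_dots_svg())
```

Polar coordinates are the "language" here: the model only has to state an angle step and a radius, and the trigonometry places the dots.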

vghaisas6mo ago

You might enjoy these AI generated clocks: https://clocks.brianmoore.com/

victorbuilds6mo ago· 1 in thread

Building something similar - using Claude API to generate mini games from text descriptions (https://codorex.com, still pretty rough).

Can confirm: Claude is weirdly good at generating functional game logic from vague prompts, but spatial precision is a constant battle. Anything involving exact pixel positions needs validation/correction layers on top.

The suggestion upthread about having it write its own measurement tools seems promising - haven't tried that approach yet.

victorbuilds6mo ago

Actually, here's a relevant example - had Claude generate an interactive solar system with circular orbits and moons: https://codorex.com/shared/yeABdJWvRHAKqHs2kxpRnZNZPWmqWSu9

Interesting that it handles this fine (functional orbital mechanics, animation) but would probably struggle to recreate the exact pixel positions of the Space Jam layout. Confirms the pattern: good at "make something like X" but bad at "recreate X exactly."

Aeolun6mo ago· 1 in thread

I think claude could have easily used a script to calculate the positions of the planets exactly here, instead of trying to use the frankly horrible image recognition.

fluidcruft6mo ago

Claude could probably have written a script to do it. I'm hunting assets to confirm.

Edit: yup very easy for claude. Most of the wall time was spent installing cv2 in a venv (uv dropped python 3.14 on claude, which seemed to get stuck compiling numpy or something so I interrupted and told claude to just use python 3.13 instead).

ME: In this directory there are two images: @p-pressbox.gif and @screenshot.png. @p-pressbox.gif is an image that exists somewhere inside @screenshot.png Write a script that finds the precise location

CLAUDE: Looking at the images, I can see that p-pressbox.gif (the spaceship with "PRESS BOX SHUTTLE" text) appears in the upper-left area of screenshot.png. I'll write a Python script using OpenCV to find the precise location.

[ some venv stuff/telling claude to use uv and python 3.13 ]

CLAUDE: The script found the precise location of p-pressbox.gif within screenshot.png:

  | Measurement         | Value           |
  |---------------------|-----------------|
  | Top-left corner     | (87, 110)       |
  | Bottom-right corner | (218, 166)      |
  | Center              | (152, 138)      |
  | Template size       | 131 × 56 pixels |
  | Match confidence    | 97.2%           |

  The script is at /tmp/search/find_image.py and uses OpenCV's template matching with normalized cross-correlation.

ME: As part of the output create a visualization of the @p-pressbox.gif positioned on top of the original screenshot

CLAUDE: Done! The visualization shows a green rectangle highlighting the exact location where p-pressbox.gif was found within the screenshot. The match is at coordinates (87, 110) to (218, 166).

I confirmed the result was correct.
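
The script from that session was not posted, so the following is not it. It is a plain-Python sketch of the idea behind normalized cross-correlation template matching, without OpenCV. `match_template`, `ncc` and the grid-of-numbers image format are assumptions for illustration, and the score used is the simple non-mean-subtracted form:

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equal-length pixel lists (max 1.0)."""
    num = sum(p * t for p, t in zip(patch, template))
    den = math.sqrt(sum(p * p for p in patch) * sum(t * t for t in template))
    return num / den if den else 0.0


def match_template(image, template):
    """Slide template over image; return ((x, y), score) of the best match."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    flat_t = [v for row in template for v in row]
    best_score, best_pos = -1.0, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = [image[y + dy][x + dx] for dy in range(th) for dx in range(tw)]
            score = ncc(patch, flat_t)
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_pos, best_score
```

This brute-force scan is far slower than a library implementation, but it makes clear why a score near 1.0 means the template was found almost unchanged.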

city176mo ago· 1 in thread

Lovable was able to one-shot it (except for the top margin): https://spacejam-pixel-perfect.lovable.app/

All I did was give Lovable a single prompt: "with pixel perfect accuracy recreate this website: https://www.spacejam.com/1996/"

Edit: Commenter below may be right.

pcwelder6mo ago

It just fetched the HTML and replicated it. The usage of table is a giveaway.

Any LLM with browser tool can do it (Kombai one shots it too for example), because it's just cheating.

simonw6mo ago· 1 in thread

I wonder if Gemini 3 Pro would do better at this particular test? They're very proud of its spatial awareness and vision abilities.

music4airports6mo ago

>They're very proud of its spatial awareness and vision abilities.

Suuuuuuuuure they are.

I haven't found a single multimodal model, vision LLM, or any model at all that can segment and extract music charts/infographics.

Can Gemini 3 Pro, in one shot, turn charts like these into lists of "artist - album" without choking on the visuals?

https://reddit.com/r/citypop/comments/10fu1t5/city_pop_album...

https://reddit.com/r/indieheads/comments/173o33z/the_new_ind...

rickcarlino6mo ago· 1 in thread

I look forward to an alternative reality where AI vendors race to have the model with the best Space Jam Bench scores.

jacquesm6mo ago

It beats pelicans by a considerable margin.

syassami6mo ago· 1 in thread

We've lost the capability to build such marvels.

https://knowyourmeme.com/memes/my-father-in-law-is-a-builder...

barfoure6mo ago

Lost it at wooden structures at the playground.

mr_windfrog6mo ago· 1 in thread

Maybe we could try asking Claude to generate code using <table>, <tr>, <td> for layout instead of relying on div + CSS. Feels like it could simplify things a lot.

Would this actually work, or am I missing something?

thecr0wOP6mo ago

I think it probably gets you 80% but the last 20% of pixel perfection seems to evade Claude. But I'm pretty new to writing prompts so if you can nail it let me know and I'll link you in the post.

RagnarD6mo ago· 1 in thread

Why not just feed it the actual instructions that create the site - the page source code, the HTML, CSS, JS if any?

masswerk6mo ago

This may be an even greater challenge: analysing a table layout and recreating it in CSS with absolute positioning.

subleq6mo ago· 1 in thread

What if you gave it an image comparison tool that would xor two screenshots to check its work?
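
A rough sketch of such a tool, assuming both screenshots have been loaded as equal-sized grids of integer pixel values (for example packed RGB). The function name is hypothetical, and the result is a text-friendly bounding box instead of a diff image:

```python
def diff_bbox(a, b):
    """Return (left, top, right, bottom) of the area where two grids differ.

    Pixels are integers; a non-zero XOR means the pixel changed.
    Returns None when the grids are identical.
    """
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(a, b)):
        for x, (pa, pb) in enumerate(zip(row_a, row_b)):
            if pa ^ pb:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# Two hypothetical 5x4 "screenshots" that differ in two pixels.
reference = [[0] * 5 for _ in range(4)]
attempt = [row[:] for row in reference]
attempt[1][2] = 0xFFFFFF
attempt[2][4] = 1
print(diff_bbox(reference, attempt))
```

Reporting the box as numbers keeps the check in text space, which avoids asking the model to read yet another image.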

poly2it6mo ago

I've tried doing stuff similar to the author, and it generally does not get better after the first attempt. I even tried supplying Claude with a delta view, i.e., the difference in per-pixel output, along with the reference and current result, yet it was impossible for it to understand and remember the actual differences.

QuadrupleA6mo ago· 1 in thread

Wasn't it now (end of 2025) that Dario Amodei said Claude (or LLMs in general) would be doing almost all programming work?

This article is my typical experience with LLM coding. Endless correction and handholding, and manual cleanup of subtle mistakes. With no long-term learning from them.

Kinda makes me livid, the amount of false hype coming out of the mouths of the stewards of these investor-subsidized LLM companies.

But they're amazing Google replacements, and learning tools. And once in a blue moon they ace a coding assignment and delight me.

Edit: 90% of coding work by June to September 2025: https://www.businessinsider.com/anthropic-ceo-ai-90-percent-...

pwython6mo ago

I mean, Claude got 90% there. The author could contribute 10% of work to fix the alignment and have a finished page.

supern0va6mo ago· 1 in thread

Honestly, if you had shown this article to me even eighteen months ago, I would have been blown away at how good of a job Claude did.

It's remarkable how high our expectations have been steadily creeping.

WhyOhWhyQ6mo ago

This comment is missing the point. The real goal of all this is not to amaze. It's to create better software. Let's graduate past the amazement phase into the realism phase as soon as possible. What parts of my project is the LLM for? That is the real question worth asking.

1 more reply

thenumpaduser6mo ago· 1 in thread

We are actually spoiled at this point.

echelon6mo ago

[flagged]

1 more reply

a-dub6mo ago· 1 in thread

maybe ask it to use 1990s table based layout approaches?

al_borland6mo ago

Interesting. I just looked at the page source and it is in fact using a table layout. I always assumed it was an image map, which I assume would be even more obscure for the LLM.

1 more reply

Tokkemon6mo ago· 1 in thread

Why do I feel like the old man yelling at clouds that programmers refuse to use their brains anymore?

mrwaffle6mo ago

It's funny, the joke that good programmers are lazy has always held until now.

computersuck6mo ago· 1 in thread

this is just AI brainrot disease

Help, I can't recreate a website with AI! There's no other way, no way I could fix up some HTML code! Believe me, I'm an engineering manager with a computer science degree!

Absolutely disgusting.

derektank6mo ago

I’m pretty confident the author was being sarcastic

th0ma56mo ago· 1 in thread

I personally don't understand why asking these things to do things we know they can't do is supposed to be productive. Maybe for getting around restrictions or fuzzing... I don't see it as an effective benchmark unless it can link directly to the ways the models are being improved, but, to look at random results that sometimes are valid and think more iterations of randomness will eventually give way to control is a maddening perspective to me, but perhaps I need better language to describe this.

thecr0wOP6mo ago

I think this is a reasonable take. I think for me, I like to investigate limitations like this in order to understand where the boundaries are. Claude isn't impossibly bad at analyzing images. It's just pixel perfect corrections that seem to be a limitation. Maybe for some folks it's enough to just read that but for me, I like to feel like I have some good experiential knowledge about the limitations that I can keep in my brain and apply appropriately in the future.

1 more reply

dreadnip6mo ago· 1 in thread

Why involve an LLM in this? Just download the site?

ChrisArchitect6mo ago

Yeah, Internet Archive has lots of copies https://web.archive.org/web/20250000000000*/https://www.spac... also

What's with the panicked pleas and need to preserve the site, assuming locally...?

1 more reply

johncoatesdev6mo ago· 1 in thread

You last-minute cancelled coffee with your friends to work on this? I'm not sure how I would feel if a friend did that to me.

fishtoaster6mo ago

Based on the later life updates, I suspect this was being humorous.

> After these zoom attempts, I didn't have any new moves left. I was being evicted. The bank repo'd my car. So I wrapped it there.

soared6mo ago

I got quite close with Gemini 3 Pro in AI Studio. I uploaded a screenshot (no assets) and the results were similar to OP's. It failed to follow my fix initially, but I told it to follow my directions (lol) and it came quite close (though portrait mode distorted it, landscape was close to perfect).

“Reference the original uploaded image. Between each image in the clock face, create lines to each other image. Measure each line. Now follow that same process on the app we’ve created, and adjust the locations of each image until all measurements align exactly.”

https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...
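
The measuring step in that prompt can also be done outside the model. A rough sketch, assuming you already have the center coordinates of each image in both the reference screenshot and the rebuilt page (the element names and coordinates below are made up for illustration):

```python
from itertools import combinations
from math import dist


def pairwise_distances(centers):
    """Map each unordered pair of element names to the distance between their centers."""
    return {
        (a, b): dist(centers[a], centers[b])
        for a, b in combinations(sorted(centers), 2)
    }


def worst_mismatch(reference, attempt):
    """Return the pair whose distance differs most, and by how many pixels."""
    ref = pairwise_distances(reference)
    got = pairwise_distances(attempt)  # assumes both dicts have the same keys
    pair = max(ref, key=lambda p: abs(ref[p] - got[p]))
    return pair, abs(ref[pair] - got[pair])


if __name__ == "__main__":
    # Hypothetical centers, in pixels.
    reference = {"logo": (0, 0), "a": (3, 4), "b": (6, 8)}
    attempt = {"logo": (0, 0), "a": (3, 4), "b": (6, 0)}
    print(worst_mismatch(reference, attempt))
```

Note that pairwise distances only capture the relative layout: a page that is shifted or rotated as a whole would still pass this check.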

PaulHoule6mo ago

(1) Multi-modal is where a lot of these things go to die. You will hear people talk about the occasional striking success, but so often I show Copilot an easily identifiable flower image and it gets it wrong even though Google Lens will get it right.

(2) The kind of dialog he's having with Claude is a kind of communication pattern I've found never works with LLMs. Sure there is the kind of conversation that goes

   Do X

   ... that's pretty good except for Y

   Great!

but if it is

   Do X

and it comes back with something entirely wrong I'd assume the state of the thing is corrupted and it is never coming back and no matter how you interrogate it, encourage it, advise it, threaten it, whatever, you will go in circles.

kaijia6mo ago

I almost did it with the plain claude code on a $20 plan, without any skills, by leveraging my domain expertise with the following prompt:

  Your goal is to create an HTML5 website that is pixel-accurate as the screenshot @Space-Jam.png . You can use the image files in the `img` folder. To achieve pixel-level reproduction, first write auxiliary programs to measure the pixel locations of the objects on this page. Then use your measurements to create the page. Consider using feature detection/matching with per-object (x, y) offsets and a global scale factor shared by all objects as the free parameters. Consider using RANSAC for robust estimation. After obtaining an initial estimation, crop the image of each object and use local sparse optical flow for refinement. Use JPG format for visual inspection of intermediate results. Your webpage should be able to scale with the window size.

Note: the footer text size/locations are off. We can probably fix that by explicitly asking cc to write scripts for text bounding box detection.

More details here: https://gist.github.com/jia-kai/802de63816711d67f0a090fa267a...
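
For readers unfamiliar with the RANSAC part of that prompt, here is a simplified sketch. It is not the code Claude wrote for me (see the gist for that); it assumes feature matching has already produced point correspondences, and it fits one global scale plus one shared offset rather than per-object offsets. All names are illustrative.

```python
import random
from math import dist


def fit_scale_offset(p1, q1, p2, q2):
    """Solve q = s * p + t from two correspondences (no rotation assumed)."""
    span = dist(p1, p2)
    if span == 0:
        return None  # degenerate sample: both source points coincide
    s = dist(q1, q2) / span
    return s, (q1[0] - s * p1[0], q1[1] - s * p1[1])


def ransac_scale_offset(matches, tol=2.0, iters=200, seed=0):
    """matches: list of ((x, y) in layout, (x, y) in screenshot) pairs."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        # Minimal sample: two correspondences determine scale and offset.
        (p1, q1), (p2, q2) = rng.sample(matches, 2)
        model = fit_scale_offset(p1, q1, p2, q2)
        if model is None:
            continue
        s, (tx, ty) = model
        # Inliers are matches the candidate model predicts within tol pixels.
        inliers = [
            (p, q)
            for p, q in matches
            if dist((s * p[0] + tx, s * p[1] + ty), q) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

A fuller version would refit the model on all inliers (least squares) before the optical-flow refinement; this sketch stops at the best minimal-sample model.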

neuroelectron6mo ago

My web-dev friend saw the original Space Jam site. I asked him what it would cost to build something like that today. He paused and said:

We can’t. We don’t know how to do it.

stwsk6mo ago

>Look, I still need this Space Jam website recreated.

Now that's a novel sentence if I've ever read one.

jdironman6mo ago

I am going to give this a shot, but using a method I have been using lately with subagents. Basically, what I do is have it create an Architect, Executor, Adjudicator subagents. Architect breaks any ask down into atomic and testable subtasks that take 1-3 minutes 'dev' time. Executor (can spawn more than one) implements them. Then adjudicator reviews that they are to spec / requirements. This all happens in subagent files + a runbook.json in the .claude folder of a project. Its based on a paper* that was featured on here a while back actually [1].

[1] https://arxiv.org/abs/2511.09030

voodooEntity6mo ago

Thanks for sharing this. Partly because I forgot about this great website :D and also because I would never have thought of giving this to an LLM as a task, because it's so simple that I probably would have just hacked it together myself :D

I recently experimented a lot with agentic coding (mostly with the Gemini + IntelliJ plugin, the Copilot IntelliJ plugin, and IntelliJ's own Junie) and also considered feeding images to the AI, but all the tasks I tried so far were purely backend-ish, so it never came to that.

I'm really curious how Junie in particular will act, and I will give it a try with the very same task you gave it. We're going to see how it ends :D

DocTomoe6mo ago

I keep wondering ... is this a good benchmark? What is a practical use-case for the skills Claude is supposed to present here? And if the author needs that particular website re-created with pixel-perfect accuracy, wouldn't it be simpler to just do it yourself?

Sure, you can argue this is some sort of modern ACID-Test - but the ACID tests checked for real-world use-cases. This feels more like 'I have this one very specific request, the machine doesn't perfectly fulfill it, so the machine is at fault.' Complaining from a high pedestal.

I'm more surprised at how close Claude got in its reimagined SpaceJam-site.

dmitrygr6mo ago

Them: AI will take jobs.

The AI: https://chatgpt.com/share/6923df03-7304-8010-bd08-cd335f0ee9...

Madmallard6mo ago

I wrote a 20,000 line multiplayer battle-arena game in XNA back in 2015 with manually coded physics (so everything is there in the code) and have tried several times with Claude, Gemini, Grok, DeepSeek, and GPT to translate it to JavaScript.

They all fail massively 100% of the time. Even if I break it down into chunks once they get to the chunks that matter the most (i.e. physics, collision detection and resolution, event handling and game logic) they all break down horribly and no amount of prompting back and forth will fix it.

nickdothutton6mo ago

I have recently been working on something "fun" in the terminal that mingles plain ASCII, ANSI "graphics", actual bitmaps (Sixel), and Nerdfonts in a TUI framework (Charm etc). After a week of smashing Claude's head against a wall, which is better than smashing my own, I've had to significantly alter my hopes and expectations.

ajasmin6mo ago

I'm actually surprised Claude was able to do that much.

I hadn't even considered handing it a visual mockup to work from, even though that workflow is par for the course for any web design team.

I would assume there must be at least some prior work into locating individual assets in a larger canvas. It just needs to be integrated into the pipeline.

ettsvensktlogin6mo ago

This was very interesting. I've tried to create an "agent" Claude Code based system to generate design from screenshots, using Playwright and other tools to take screenshots for iterative improvements. So far I have failed despite weeks of struggles.

Thanks to this post I now have a deeper understanding as to why. Thank you.

vmg126mo ago

We don't know how to build it anymore

1970-01-016mo ago

This is a great under the radar test for AI. I would put money on it failing to recreate the majority of 90s movie websites as it wasn't trained on them. The old cowboy webmasters that built and ultimately abandoned them didn't write many books on the topic.

tehjoker6mo ago

Hmm you note that the problem is the LLM doesn’t have enough image context, but then zoom the image more?

Why not downscale the image and feed it as a second input, so that entire planets fit into a patch, and instruct it to use the downsampled image for coarse coordinate estimation?
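
Roughly what I have in mind, as a pure-Python sketch over a 2D list of grayscale values (a real pipeline would use an image library; the function names and the integer-factor restriction are my own assumptions):

```python
def downsample(gray, factor):
    """Box-filter a 2D list of grayscale values by an integer factor."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    rows = len(gray) // factor
    cols = len(gray[0]) // factor if gray else 0
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Average the factor x factor block that maps to this coarse pixel.
            block = [
                gray[r * factor + dy][c * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def to_full_res(coarse_xy, factor):
    """Map a coarse (x, y) estimate back to the center of its full-resolution block."""
    x, y = coarse_xy
    return (x * factor + factor // 2, y * factor + factor // 2)
```

The coarse estimate only narrows things down to a block of `factor` by `factor` pixels, so a second, full-resolution pass would still be needed for the final alignment.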

hodder6mo ago

This is spot on to my experience vibe coding. You can get pretty good scaffolding, but tinkering with details is a nightmare. That is when you need to step in and take over yourself, or you will burn all the time saved from the scaffolding speed-up.

badlogic6mo ago

Loved the fun write up. Now that we know that LLM-based vision is lossy, here's a different challenge:

Give the LLM access to the site's DOM and let it recreate the site with modern CSS. LLMs are much better with source code, aka text, right? :)

BiteCode_dev6mo ago

Tell Claude to put the screenshot as a centered image, with the body having the starry background on repeat. Then define the links as boxes over each icon with an old little tech trick called an image map.

Common at the time before flash took over.
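
To show what that markup looks like, here is a small sketch that generates it. The file name, map name, link target, and box coordinates are all placeholders, not measurements from the real site:

```python
from html import escape


def image_map(image_src, map_name, areas):
    """Build <img> + <map> markup; areas are (href, alt, (x1, y1, x2, y2)) tuples."""
    lines = [
        f'<img src="{escape(image_src)}" usemap="#{escape(map_name)}" alt="">',
        f'<map name="{escape(map_name)}">',
    ]
    for href, alt, (x1, y1, x2, y2) in areas:
        # Each rectangular area is a clickable box over one icon.
        lines.append(
            f'  <area shape="rect" coords="{x1},{y1},{x2},{y2}" '
            f'href="{escape(href)}" alt="{escape(alt)}">'
        )
    lines.append("</map>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical link box, in pixels relative to the image.
    print(image_map("screenshot.png", "nav", [("press.html", "Press Box", (10, 20, 110, 90))]))
```

The coordinates are relative to the image's top-left corner, so the boxes stay aligned wherever the image is centered, as long as it is not scaled.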

taf26mo ago

In my experience Codex has been better at details like this. But who knows; working with an LLM, much like working with another engineer, is all about how you ask and then how you iterate.

bdcravens6mo ago

A comparison with Codex would be good. I haven't done it with Codex, but when working through problems using ChatGPT, it does a great job when given screenshots.

iwontberude6mo ago

Apropos given Warner Brothers Discovery just sold to Netflix

succo6mo ago

Why not ask Claude to recreate the website by scraping it?

kvirani6mo ago

Sibling post here: https://news.ycombinator.com/item?id=46193412

theahura6mo ago

I was able to get Claude to do this, though it kinda sorta cheated. Blog post describing the output here: https://theahura.substack.com/p/i-successfully-recreated-the...

TLDR:

"The plan is designed to ‘autoformalize’ the problem by using Test Driven Development (TDD). TDD is incredibly important for getting good outputs from a coding agent, because it helps solve the context rot problem. Specifically, if you can write a good test when the model is most ‘lucid’, it will have an easier time later on because it is just solving the test instead of ‘building a feature’ or whatever high dimensional ask you originally gave it.

From here, Nori chugged away for the better part of half an hour in yolo mode while I went to do other things. And eventually I got a little pop up notification saying that it was done. It had written a playwright test that would open an html file, screenshot it, diff it with the original screenshot, and output the final result...

After trying a few ways to get the stars to line up perfectly, it just gave up and copied the screenshot in as the background image, then overlaid the rest of the HTML elements on top.

I’m tempted to give this a pass for a few reasons.

This obviously covers the original use case that tripped up Jonah.

It also is basically exactly what I asked the model to do — that is, give me a pixel perfect representation — so it’s kind of my fault that I was not clearer.

I’m not sure the model actually can get to pixel perfect any other way. The screengrab has artifacts. After all, I basically just used the default linux screenshot selection tool to get the original output, without even paying much attention to the width of the image.

If you ask the model to loosen the requirements for the exact screengrab, it does the right thing, but the pixel alignment is slightly off. The model included this as index_tiled.html in the repo, and you can see the pixel diff in one of the output images..."

grouchy6mo ago

Has anyone found techniques to improve Claude's spatial/layout understanding?

It seems Nano Banana Pro can understand the layout/spatial well.

geldedus6mo ago

AI-assisted coding is a learned skill. You'll have to do it right, with the right approach and the right tools.

sema4hacker6mo ago

> The total payload is under 200KB.

Just out of curiosity, how big was what you considered Claude's best attempt to be?

system26mo ago

Use Claude for Python. That's it. Don't push it for the frontend, it won't do well.

hestefisk6mo ago

Would be interesting to see whether Gemini could crack this problem.

zitterbewegung6mo ago

In actual workflows someone would accept a very close reproduction and fix the small issues. Generally I use systems to get close enough to a scaffolding and / or make small incremental improvements and direct its design

baq6mo ago

new LLM benchmark just dropped. 'draw an svg of a pelican riding a bicycle browsing spacejam 1996 on 640x480 ie6'.

yieldcrv6mo ago

Skill issue

computersuck6mo ago

Why not just host a copy from waybackmachine?

phplovesong6mo ago

This basically boils down to AI being unable to "center a div". I see this very often: AI-generated slop has LOTS of "off by one" kinds of bugs.

thuttinger6mo ago· 21 in thread

- Read an .icc file from disk

- parsed the file and extracted the VCGT (video card gamma table)

- wrote the VCGT to the video card for a specified display via amdgpu driver APIs

The only thing I had to fix was the ICC parsing, where it would parse header strings in the wrong byte-order (they are big-endian).
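
For anyone curious about the byte-order detail: the generated program was in C and I'm not reproducing it here, but a minimal Python sketch of just the tag lookup shows where big-endian matters. The layout used (128-byte header, then a tag count, then 12-byte tag table entries) is the standard ICC profile structure; the function name is mine.

```python
import struct


def find_tag(profile: bytes, signature: bytes):
    """Return the raw data of an ICC tag, or None if the tag is absent."""
    # The 128-byte header is followed by a big-endian uint32 tag count.
    (count,) = struct.unpack_from(">I", profile, 128)
    for i in range(count):
        # Each tag table entry: 4-byte signature, uint32 offset, uint32 size,
        # all big-endian, with the offset measured from the start of the profile.
        sig, offset, size = struct.unpack_from(">4sII", profile, 132 + 12 * i)
        if sig == signature:
            return profile[offset:offset + size]
    return None
```

Calling `find_tag(data, b"vcgt")` on the contents of an .icc file would return the raw gamma table tag; reading those fields as little-endian on x86 gives nonsense offsets, which is the class of bug I had to fix.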

jacquesm6mo ago

mlinsey6mo ago

Certainly if a human wrote code that solved this problem, and a second human copied and tweaked it slightly for their use case, we would have a word for them.

What if it were ten different humans writing ten different-but-related pieces of code, and an eleventh human piecing them together? What if it were 1,000 different humans?

6 more replies

bsaul6mo ago

That's an interesting hypothesis: that LLMs are fundamentally unable to produce original code.

7 more replies

ekropotin6mo ago

> If a human did this we probably would have a word for them.

What do you mean? The programmer's work is literally combining existing patterns into solutions for problems.

Mtinie6mo ago

> If a human did this we probably would have a word for them.

I don’t think it’s fair to call someone who used Stack Overflow to find a similar answer with samples of code to copy to their project an asshole.

3 more replies

Aeolun6mo ago

Software engineer? You think I cite all the code I’ve ever seen before when I reproduce it? That I even remember where it comes from?

1 more reply

ineedasername6mo ago

>we probably would have a word for them

raincole6mo ago

This is why ragebait is chosen as the word of 2025.

> took that code without credit to the original author(s), adapted it to your use case

Aka software engineering.

fooker6mo ago

> If a human did this we probably would have a word for them.

Humans do this all the time.

FanaHOVA6mo ago

Are you saying that every piece of code you have ever written contains a full source list of every piece of code you previously read to learn specific languages, patterns, etc?

Or are you saying that every piece of code you ever wrote was 100% original and not adapted from any previous codebase you ever worked in or any book / reference you ever read?

2 more replies

martin-t6mo ago

Programmers are willingly blind to this, at least until it's their code being stolen or they lose their job.

_LLMs are lossily compressed archives of stolen code_.

[0]: http://prize.hutter1.net/

[1]: Oh, god, this phrase is already triggering my generated-by-LLM senses.

[2]: Model of what? Of the stolen text. If 99.9999% of the work to achieve AI wasn't done by people whose work was stolen, they wouldn't be called models.

FeepingCreature6mo ago

This is not how LLMs work.

giancarlostoro6mo ago

You mean like copying and pasting code from Stack Overflow?

nvllsvm6mo ago

> Someone else did

Who?

kevinsync6mo ago

Not to mention the moral vagaries of "if you use a library, is the complete articulation of your thing actually 100% your code?"

Is there a difference between loading and using a function from ImageMagick, and a standalone copycat function that mimics a function from ImageMagick?

What if you need it transliterated from one language to another?

If you learn an algorithm in some university class and then just write it again later, is that code yours? What if your code is 1-for-1 a copy of the code you were taught?

It gets very murky very quick!

I don't know. I guess at the end of the day, does the end justify the means? Feels pretty subjective!

[0] https://en.wikipedia.org/wiki/Amen_break

1 more reply

idiotsecant6mo ago

Yes, the word for that is software developer.

littlecranky676mo ago

> Claude/LLMs in general are still pretty bad at the intricate details of layouts and visual things

Wowfunhappy6mo ago

2 more replies

ubercow136mo ago

Why wouldn't they be?

1 more reply

chongli6mo ago

hedgehog6mo ago

1 more reply

wilsmex6mo ago· 18 in thread

Well, this was interesting. As someone who was actually building similar websites in the late '90s, I threw this into Opus 4.5. Note, however, that the original author is wrong about the original site:

"The Space Jam website is simple: a single HTML page, absolute positioning for every element, and a tiling starfield GIF background.".

This is not true, the site is built using tables, not positioning at all, CSS wasn't a thing back then...

Here was its one-shot attempt at building the same type of layout (table based) with a screenshot and assets as input: https://i.imgur.com/fhdOLwP.png

thecr0wOP6mo ago

Thanks, my friend. I added a strike through of the error, a correction, and credited you.

I'm keeping it in for now because people have made some good jokes about the mistake in the comments and I want to keep that context.

wilsmex6mo ago

1 more reply

govping6mo ago

yard20106mo ago

Ah! I see the problem now! AI can't see shit, it's a statistical model not some form of human. It uses words, so like humans, it can say every shit it wants and it's true until you find out.

The number one rule of the internet is don't believe anything you read. This rule was lost in history unfortunately.

3 more replies

manbash6mo ago

Ah, those days, where you would slice your designs and export them to tables.

chrisweekly6mo ago

I remember building really complex layouts w nested tables, and learning the hard way that going beyond 6 levels of nesting caused serious rendering performance problems in Netscape.

2 more replies

reconnecting6mo ago

Why not! We did this in 2024 for our website (1) to have zero CSS.

Still works, only Claude cannot understand what those tables mean.

1. https://www.tirreno.com

3 more replies

thecr0wOP6mo ago

I learned recently that this is still how a lot of email HTML gets generated.

2 more replies

ralferoo6mo ago

It was relatively OK to deal with when the pages were created by coders themselves.

Definitely glad those days are well behind us now!

1 more reply

mmanfrin6mo ago

I yearn for those days. CSS was a mistake. Tables and DHTML is all one needs.

2 more replies

gregoryl6mo ago

Gosh, there was a website, where you submit a PSD + payment, and they spit out a sliced design. Initially tables, later, CSS. Life saver.

1 more reply

jweir6mo ago

And use a single px invisible gif to move things around.

But was Space Jam using multiple images or just one large image with an image map for links?

1 more reply

bigbuppo6mo ago

Oh man, Photoshop still has the slice feature and it makes the most horrendous table-based layout possible. It's beautiful.

alt2276mo ago

M4v3R6mo ago

Please reach out to your nearest government official to tell them what you think about Imgur not working in your country.

2 more replies

master-lincoln6mo ago

Which one could be used so everybody can read it? So many different autocratic systems to think about...

I think it's easier if you adapt and get a VPN or a new government.

1 more reply

Aeolos6mo ago

Why is that?

1 more reply

johnebgd6mo ago

I cut my teeth developing for the web using GoLive and will never forget how they used tables to layout a page from that tool…

9999000009996mo ago· 13 in thread

Space Jam website design as an LLM benchmark.

This article is a bit negative. Claude gets close; it just can't get the order right, which is something OP can manually fix.

I prefer GitHub Copilot because it's cheaper and integrates with GitHub directly. I'll have times where it'll get it right, and times when I have to try 3 or 4 times.

GeoAtreides6mo ago

>which is something OP can manually fix

alickz6mo ago

>what if the LLM gets something wrong that the operator (a junior dev perhaps) doesn't even know it's wrong?

the same thing that always happens if a dev gets something wrong without even knowing it's wrong - either code review/QA catches it, or the user does, and a ticket is created

>if it fails here, it will fail with other things, in not such obvious ways.

is infallibility a realistic expectation of a software tool or its operator?

1 more reply

godelski6mo ago

I think that's the main problem with them. It is hard to figure out when they're wrong.

[0] I frequently see this in a thing it thinks is a problem but actually isn't, which makes steering more difficult.

smallnix6mo ago

That's not the point of the article. It's about Claude/LLM being overconfident in recreating pixel perfect.

jacquesm6mo ago

I've tried these tools a number of times and spent a good bit of effort on learning to maximize the return. By the time you know what prompt to write you've solved the problem yourself.

bigstrat20036mo ago

> it just can't get the order right which is something OP can manually fix.

If the tool needs you to check up on it and fix its work, it's a bad tool.

markbao6mo ago

“Bad” seems extreme. The only way to pass the litmus test you’ve described is for a tool to be 100% perfect, so then the graph looks like 99.99% “bad tool” until it reaches 100% perfection.

It’s not that binary imo. It can still be extremely useful and save a ton of time if it does 90% of the work and you fix the last 10%. Hardly a bad tool.

It’s only a bad tool if you spent more time fixing the results than building it yourself, which sometimes used to be the case for LLMs but is happening less and less as they get more capable.

1 more reply

godelski6mo ago

I wouldn't go that far, but I do believe good tool design tries to make its failure modes obvious. I like to think of it similar to encryption: hard to do, easy to verify.

[0] https://www.youtube.com/watch?v=5zpLOn-KJSE

mrweasel6mo ago

wvenable6mo ago

Perfection is the enemy of good.

thecr0wOP6mo ago

ya, this is true. Another commenter also pointed out that my intention was to one-shot. I didn't really go too deeply into trying to try multiple iterations.

This is also fairly contrived, you know? It's not a realistic limitation to rebuild HTML from a screenshot because of course if I have the website loaded I can just download the HTML.

swatcoder6mo ago

> rebuild HTML from a screenshot

???

This is precisely the workflow when a traditional graphic designer mocks up a web/app design, which still happens all the time.

There's nothing contrived at all about this test and it's a really great demonstration of how tools like Claude don't take naturally to this important task yet.

1 more reply

Retric6mo ago

It’s not unrealistic to want to revert to an early version of something you only have a screenshot of.

sigseg1v6mo ago· 9 in thread

Curious if you've tested something such as:

Aurornis6mo ago

jazzyjackson6mo ago

jacquesm6mo ago

It would not be the first time that an IT services provider makes more money the worse their products perform.

1 more reply

thecr0wOP6mo ago

Hm, I didn't try exactly this, but I probably should!

cfbradford6mo ago

1 more reply

yfontana6mo ago

bluedino6mo ago

Congratulations, we finally created 'plain English' programming languages. It only took 1/10th of the world's electricity and 40% of the semiconductor production.

turnsout6mo ago

Yes, this is a key step when working with an agent—if they're able to check their work, they can iterate pretty quickly. If you're in the loop, something is wrong.

That said, I love this project. haha

monsieurbanana6mo ago

I'm trying to understand why this comment got downvoted. My best guess is that "if you're in the loop, something is wrong" is interpreted as there should be no human involvement at all.

1 more reply

manlymuppet6mo ago· 7 in thread

Couldn’t you just feed Claude all the raw, inspect element HTML from the website and have it “decrypt” that?

The entire website is fairly small so this seems feasible.

Usually there’s a big difference between a website’s final code and its source code because of post processing but that seems like a totally solvable Claude problem.

Perhaps the best method is to re-create, rather than replicate the design.

blks6mo ago

What do you mean? Raw html is the original website source code.

Modern web development completely poisoned the young generation.

manlymuppet6mo ago

Perhaps the wrong usage of "source code". I probably should've been more precise. Forgive my lack of vocabulary to describe the difference I was referring to.

2 more replies

personjerry6mo ago

If you have the raw HTML why would you need to do this at all?

manlymuppet6mo ago

I should've been more precise with my words.

What I meant is doing inspect element on the Space Jam website, and doing select all + copy.

1 more reply

manlymuppet6mo ago

https://pastebin.com/raw/F2jxZTeJ

The HTML I'm referring to, copied from the website.

Only about 7,000 characters or just 2,000 Claude tokens. This is feasible.

valleyer6mo ago

There's actually a language model developed by The Open Group that will transform that back into the original source code.

<https://pubs.opengroup.org/onlinepubs/9799919799/utilities/c...>

2 more replies

literalAardvark6mo ago

The space jam website used HTML tables for formatting and split images in each cell.

CSS didn't exist.

smoghat6mo ago· 6 in thread

skrebbel6mo ago

> Ok, so here is an interesting case where Claude was almost good enough, but not quite.

You say that as if that’s uncommon.

jonplackett6mo ago

This should be the strap line for all AI (so far)

1 more reply

egeozcan6mo ago

Side note: As a person who started using a Mac in March, I found Phoenix Slides really good.

smoghat6mo ago

It is! I was really just curious if I could update this old codebase without getting my hands dirty.

mabedan6mo ago

pwython6mo ago

I switched to Mac in 2008. I forget if the feature existed back then, but today on macOS if you press spacebar on an image in Finder to preview, you can use the arrow keys to browse other photos.

1 more reply

sqircles6mo ago· 5 in thread

> The Space Jam website is simple: a single HTML page, absolute positioning for every element...

Absolute positioning wasn't available until CSS2 in 1998. This is just a table with crafty use of align, valign, colspan, and rowspan.

thecr0wOP6mo ago

Thanks, my friend. I added a strike through of the error, a correction, and credited you.

I'm keeping it in for now because people have made some good jokes about the mistake in the comments and I want to keep that context.

sqircles6mo ago

I thought for sure I was going to see an image map when I looked at the source. Pleasant surprise!

DocTomoe6mo ago

Which would also render differently on every machine, based on browser settings, screen sizes, and available fonts.

Like the web was meant to be. An interpreted hypertext format, not a pixel-perfect brochure for marketing execs.

masswerk6mo ago

1 more reply

jeanlucas6mo ago

>Like the web was meant to be.

what?

2 more replies

Wowfunhappy6mo ago· 5 in thread

Claude is not very good at using screenshots. The model may technically be multi-modal, but its strength is clearly in reading text. I'm not surprised it failed here.

fnordpiglet6mo ago

mcbuilder6mo ago

Yeah, this is exactly what I was thinking. LLMs don't have precise geometrical reasoning from images. Having an intuition of how the models work is actually a defining skill in "prompt engineering".

1 more reply

thecr0wOP6mo ago

Great, thanks for that suggestion!

dcanelhas6mo ago

Wowfunhappy6mo ago

ErrantX6mo ago· 4 in thread

I just feel this is a great example of someone falling into the common trap of treating an LLM like a human.

They are vastly less intelligent than a human and logical leaps that make sense to you make no sense to Claude. It has no concept of aesthetics or of course any vision.

All that said; it got pretty close even with those impediments! (It got worse because the writer tried to force it to act more like a human would)

sallveburrpi6mo ago

> vastly less intelligent than a human

I would rather phrase it this way: they are a completely alien “intelligence” that can't really be compared to human intelligence.

oncallthrow6mo ago

Yes, currently they are both smarter and dumber than humans.

ErrantX6mo ago

I meant that frame very deliberately. Use of the word AI is misleading people that LLMs are intelligent.

They model what looks like intelligence but with very hard limits. The two advantages they have over human brains are perfect recall and data storage. They are also faster.

But the brain is vastly more intelligent:

- It can learn concepts (e.g. language) with an order of magnitude less information

- It responds in parallel to multiple formats of stimuli (e.g. sight/sound)

- LLMs lack the ability to generalise

- The brain interprets and understands what it experienced

That's just the tip of the iceberg. Don't get me wrong: I use AI, it is by far some of the most impressive tech we have built so far, and it has potential to advance society significantly.

But it is definitely, vastly, less intelligent than us.

1 more reply

naet6mo ago

The blog frequently refers to the LLM as "him" instead of "it" which somehow feels disturbing to me.

I love to anthropomorphize things like rocks or plants, but something about doing it to an AI that responds in human like language enters an uncanny valley or otherwise upsets me.

daemonologist6mo ago· 4 in thread

Interesting - these models are all trained to do pixel-level(ish) measurement now, for bounding boxes and such. I wonder if you could railroad it into being accurate with the right prompt.

Lerc6mo ago

I don't doubt that it is possible eventually, but I haven't had much luck.

Something that seemed to assist was drawing a multi-coloured transparent chequerboard over the image; if the AI knows the positions of the grid colours, it can pick out some relative position information from the grid.

daemonologist6mo ago

ryoshu6mo ago

I can't do that either without opening up an image editing tool. Give the model a tool and goal with "vision". Should work better.

sdenton46mo ago

jacobsenscott6mo ago· 4 in thread

> there's no other way to do it besides getting Claude to recreate it from a screenshot

And

> I'm an engineering manager

I can't tell if this is an intentional or unintentional satire of the current state of AI mandates from management.

master_crab6mo ago

Honest question: does he know about F5? Or was it intentional to use screenshots when source is available?

chilmers6mo ago

You really can’t tell? Perhaps the bar for AGI is lower than I thought.

thecr0wOP6mo ago

lololol

dmd6mo ago

I can’t tell if your comment is satire or not.

isoprophlex6mo ago· 4 in thread

Calling Claude (or GPT-5 or Gemini or my bash terminal for that matter) a "he" seems absurd to the point of hilarity.

In my mind, they've always firmly been "it"s.

DocTomoe6mo ago

retsibsi6mo ago

bojan6mo ago

This felt quirky to me as well, possibly because my native language is strictly gendered.

fragmede6mo ago

micromacrofoot6mo ago· 2 in thread

I wouldn't call it entirely defeated, it got maybe 90% of the way there. Before LLMs you couldn't get 50% of the way there in an automated way.

> What he produces

thecr0wOP6mo ago

micromacrofoot6mo ago

Aside from that point: if you are reading this and making people do a project as part of the hiring process, you should absolutely be paying them for their time (even a token amount).

stared6mo ago· 2 in thread

Just use Playwright Skill (https://github.com/lackeyjb/playwright-skill). It is a game changer. Otherwise it is Claude the Blind, as OP mentioned.

pcwelder6mo ago

But that's cheating because it then has the source code containing the table and its styles.

I can confirm that this is what it does.

And if you ask it to not use tables, it cleverly uses div with the same layout as the table instead.

drbacon6mo ago

I think the idea is to let Claude see iterations of the reproduction with playwright, but still only allow access to screenshots of the original.

mxfh6mo ago· 2 in thread

Everything feels wrong with that approach to me, starting with calling a perfectly time-appropriate website anachronistic.

Anachronistic would be something like creating an apparent Flash website for a fictional 90s internet-related movie.

robomc6mo ago

He's using it correctly, in its secondary sense of "belonging or appropriate to an earlier period, especially so as to seem conspicuously old-fashioned or outdated."

mxfh6mo ago

Still not quite convinced that the adjective should be applied to the website itself in a quite loose use of the word.

Otherwise there is no anachronism here with the website itself, just its location under the original URL and not only in some archive.

The website itself fulfilled its purpose for promoting the movie when it was released and simply continues to exist.

You wouldn’t call posters, magazines, or other artifacts from the ’90s anachronistic just for still existing. Being retrievable doesn’t make something outdated by itself.

“Anachronistic” would apply only if a new promotional site were created today to look like this—though that would more likely be called “retro.”

Or if the movie industry insisted on using CSS-free table layouts for all its promotional websites, similar to other norms or laws that feel anachronistic because they no longer match current needs.

Sadly the whole piece reads like it was written 80%+ by an LLM too, seriously why all the emojis? But apparently this is where content is heading in general.

anorwell6mo ago· 2 in thread

The article does not say at any point which model was used. This is the most basic important information when talking about the capabilities of a model, and probably belongs in the title.

thecr0wOP6mo ago

Whoops, I'm very dumb. It's Opus 4.1. I updated the blog post and credited you for the correction. Thank you!

dweekly6mo ago

That model does not exist. Do you mean Opus 4.5?

960design6mo ago· 2 in thread

kristofferc6mo ago

Can you link to the conversation log?

pbhjpbhj6mo ago

In the generalised solution there is a '-c' term with coefficient '4a'...?

I'm not well at the moment, perhaps your ML model has flu?!!

pluc6mo ago· 2 in thread

I like how the author calls a script on the internet "him".

thecr0wOP6mo ago

lol

NooneAtAll36mo ago

better than using a plural for a single entity

manmal6mo ago· 2 in thread

fluidcruft6mo ago

epgui6mo ago

At that point you may as well just do the work yourself.

fortyseven6mo ago· 2 in thread

Look at that stupid dog. It's reading a book, but it's really trashy YA. It's not even Shakespeare. Dogs are stupid.

satisfice6mo ago

But the dog's owner keeps saying that it ONLY reads Shakespeare. The dog's owner wants millions of dollars for the dog on that basis.

I'm not mad at the dogs, I'm mad at the stupid investors and the lying owners.

thecr0wOP6mo ago

lol, that is fair criticism

stonecharioteer6mo ago· 2 in thread

I'm curious. Did you ask it to use tables and no CSS?

In 1996, we had only CSS1. Ask it to use tables to do this, perhaps.

lagniappe6mo ago

We actually had a myriad of competing specs. CSS proper wasn't released until December 1996.

thecr0wOP6mo ago

Yes yes great question!

I tried your suggestion and also tried giving it various more general versions of the limitations presented by earlier generations.

Claude's instinct initially was actually to limit itself to less modern web standards.

Unfortunately, nothing got those planets to be in the right place.

handedness6mo ago· 2 in thread

A site in '96 would have been built largely with tables, not CSS. CSS didn't become a thing until a couple of years later.

I know this because I'm still salty about the transition. For all of CSS's advantages, we lost something when we largely moved away from tables.

chrisco2556mo ago

We lost something for a time, but flexbox and CSS grid (which came more recently) are objectively better than tables.

yesitcan6mo ago

You can still use tables.

vrighter6mo ago· 2 in thread

"there is no other way to preserve it"

Bullshit. Right click -> view source

Or just press ctrl+s and the browser will also gather all the assets into a folder for you.

The arrogance of thinking that the only way you know how is the only way....

You literally forgot the save feature all browsers have just because you set out to "solve" this using "ai"

OkGoDoIt6mo ago

The statement is almost certainly made in jest, since it is obviously untrue. Sometimes adding silly artificial constraints can be a fun way to spark creativity.

Mashimo6mo ago

Please read the sentence right after the one you quoted. You removed the context to shit on him :(

throwaway3141556mo ago· 2 in thread

CharlesW6mo ago

Using https://github.com/anthropics/claude-code/tree/main/plugins/... with style-supporting instructions and context would've improved the outcome as well.

thecr0wOP6mo ago

Not a homework assignment, and no deliberate attempt to conceal information, just very long and repetitive logs. A lot of the same "insights" so I just didn't provide them here.

> That's not how to successfully use LLM's for coding in my experience.

Yeah agree. I think I was just a little surprised it couldn't one-shot given the simplicity.

docheinestages6mo ago· 2 in thread

> Note: please help, because I'd like to preserve this website forever and there's no other way to do it besides getting Claude to recreate it from a screenshot.

Why not use wget to mirror the website? Unless you're being sarcastic.

$ wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.org

Source: https://superuser.com/questions/970323/using-wget-to-copy-we...

thecr0wOP6mo ago

malfist6mo ago

Because that wasn't the goal of this exercise

yosito6mo ago· 1 in thread

That being said, sloppy copy can make doing actual work a lot faster if you treat it with the right amount of skepticism and hand-holding.

Its first attempt at the Space Jam site was close enough that it probably could have been manually fixed by an experienced developer in less time than it takes to write the next prompt.

jama2116mo ago

charcircuit6mo ago· 1 in thread

>I'd like to preserve this website forever and there's no other way to do it besides getting Claude to recreate it from a screenshot.

There are other ways, such as downloading an archive and then preserving the file in one or more cloud storage services.

https://archive.is/download/cXI46.zip

a21286mo ago

The sentence immediately after that would imply sarcasm

jonas216mo ago· 1 in thread

https://news.ycombinator.com/item?id=46193412

As of right now, it seems to have been flagged into oblivion by the anti-AI crowd. I found both posts to be interesting, and it's unfortunate that one of them is missing from the conversation.

thecr0wOP6mo ago

yeah that sucks. I also linked it to the top of my post so folks can take a look.

johnfn6mo ago· 1 in thread

Context is king. The problem is that you are the one currently telling Claude how close it is and what to do next. But if you give it the tools to do that itself, it will make a world of difference.

alexandre_m6mo ago

I'd be curious to see how Antigravity compares for the same task with its automatic browser agentic validation logic.

pfix6mo ago· 1 in thread

COAGULOPATH6mo ago

And they do hacky things like space elements vertically using <br> tags.

buchwald6mo ago· 1 in thread

MagMueller6mo ago

Interesting read. Agree that GUI is super hard for agents. Did you see "skills" from browser-use? We directly interact with network requests now.

torginus6mo ago· 1 in thread

tyre6mo ago

My boy loves a neon gradient.

To be fair, a lot of startup websites look very similar. And the number of Stripe blurples out there was a pre-LLM brand meme.

shortformblog6mo ago· 1 in thread

Claude can't properly count the number of characters in a sentence. It's asking a lot to assume it can get pixel perfect.

hu36mo ago

You're just holding it wrong if you're asking it to simply count the characters in a sentence.

liampulles6mo ago· 1 in thread

It seems to me that Claude's error here (which is not unique to it) is self-sycophancy. The model is too eager to convince itself it did a good job.

I'd be curious to hear from experienced agent users whether there is some AGENTS.md stuff to make the LLM more plain-spoken. I wonder if that would impact the quality of work.

aprilfoo6mo ago

> It seems to me that Claude's error here (which is not unique to it) is self-sycophancy. The model is too eager to convince itself it did a good job.

It seems this applies to the whole AI industry, not just LLMs.

p0w3n3d6mo ago· 1 in thread

vghaisas6mo ago

You might enjoy these AI generated clocks: https://clocks.brianmoore.com/

victorbuilds6mo ago· 1 in thread

Building something similar - using Claude API to generate mini games from text descriptions (https://codorex.com, still pretty rough).

The suggestion upthread about having it write its own measurement tools seems promising - haven't tried that approach yet.

victorbuilds6mo ago

Actually, here's a relevant example - had Claude generate an interactive solar system with circular orbits and moons: https://codorex.com/shared/yeABdJWvRHAKqHs2kxpRnZNZPWmqWSu9

Aeolun6mo ago· 1 in thread

I think Claude could have easily used a script to calculate the positions of the planets exactly here, instead of trying to use the frankly horrible image recognition.

fluidcruft6mo ago

Claude could probably have written a script to do it. I'm hunting assets to confirm.

[ some venv stuff/telling claude to use uv and python 3.13 ]

CLAUDE: The script found the precise location of p-pressbox.gif within screenshot.png:

  | Measurement         | Value           |
  |---------------------|-----------------|
  | Top-left corner     | (87, 110)       |
  | Bottom-right corner | (218, 166)      |
  | Center              | (152, 138)      |
  | Template size       | 131 × 56 pixels |
  | Match confidence    | 97.2%           |

  The script is at /tmp/search/find_image.py and uses OpenCV's template matching with normalized cross-correlation.

ME> As part of the output create a visualization of the @p-pressbox.gif positioned on top of the original screenshot

CLAUDE: Done! The visualization shows a green rectangle highlighting the exact location where p-pressbox.gif was found within the screenshot. The match is at coordinates (87, 110) to (218, 166).

I confirmed the result was correct.
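
For readers who want to see the technique rather than the transcript, here is a minimal sketch of template matching by normalized cross-correlation. It is not the script Claude generated (that one used OpenCV and its source isn't shown here); the function name `find_template` and the plain-list grayscale input format are assumptions made for illustration, and the brute-force loop is only practical for small images.

```python
import math


def find_template(image, template):
    """Return ((x, y), score) for the best match of template inside image.

    Both arguments are 2D lists of grayscale values. The score is the
    zero-mean normalized cross-correlation, which lies in [-1, 1].
    (Hypothetical helper for illustration, not the thread's actual script.)
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    n = th * tw

    # Pre-compute the template's deviations from its mean and their norm.
    t_flat = [v for row in template for v in row]
    t_mean = sum(t_flat) / n
    t_dev = [v - t_mean for v in t_flat]
    t_norm = math.sqrt(sum(d * d for d in t_dev))

    best_pos, best_score = None, -2.0
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            # Flatten the window whose top-left corner is (x, y).
            w_flat = [image[y + dy][x + dx] for dy in range(th) for dx in range(tw)]
            w_mean = sum(w_flat) / n
            w_dev = [v - w_mean for v in w_flat]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            if t_norm == 0 or w_norm == 0:
                continue  # flat patch: correlation is undefined, skip it
            score = sum(a * b for a, b in zip(t_dev, w_dev)) / (t_norm * w_norm)
            if score > best_score:
                best_pos, best_score = (x, y), score
    return best_pos, best_score
```

The returned position is the top-left corner of the best window; adding the template's width and height gives the bottom-right corner, which is the kind of measurement reported in the table above.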

city176mo ago· 1 in thread

Lovable was able to one-shot it (except for the top margin): https://spacejam-pixel-perfect.lovable.app/

All I did was give Lovable a single prompt: "with pixel perfect accuracy recreate this website: https://www.spacejam.com/1996/"

Edit: Commenter below may be right.

pcwelder6mo ago

It just fetched the HTML and replicated it. The usage of table is a giveaway.

Any LLM with browser tool can do it (Kombai one shots it too for example), because it's just cheating.

simonw6mo ago· 1 in thread

I wonder if Gemini 3 Pro would do better at this particular test? They're very proud of its spatial awareness and vision abilities.

music4airports6mo ago

>They're very proud of its spatial awareness and vision abilities.

Suuuuuuuuure they are.

I haven't found a single multimodal model, vision LLM, or any model at all that can segment and extract music charts/infographics.

Can Gemini 3 Pro, in one shot, turn charts like these into lists of "artist - album" without choking on the visuals?

https://reddit.com/r/citypop/comments/10fu1t5/city_pop_album...

https://reddit.com/r/indieheads/comments/173o33z/the_new_ind...

rickcarlino6mo ago· 1 in thread

I look forward to an alternative reality where AI vendors race to have the model with the best Space Jam Bench scores.

jacquesm6mo ago

It beats pelicans by a considerable margin.

syassami6mo ago· 1 in thread

We've lost the capability to build such marvels.

https://knowyourmeme.com/memes/my-father-in-law-is-a-builder...

barfoure6mo ago

Lost it at wooden structures at the playground.

mr_windfrog6mo ago· 1 in thread

Maybe we could try asking Claude to generate code using <table>, <tr>, <td> for layout instead of relying on div + CSS. Feels like it could simplify things a lot.

Would this actually work, or am I missing something?

thecr0wOP6mo ago

I think it probably gets you 80% but the last 20% of pixel perfection seems to evade Claude. But I'm pretty new to writing prompts so if you can nail it let me know and I'll link you in the post.

RagnarD6mo ago· 1 in thread

Why not just feed it the actual instructions that create the site - the page source code, the HTML, CSS, JS if any?

masswerk6mo ago

This may be an even greater challenge: analysing a table layout and recreating it in CSS with absolute positioning.

subleq6mo ago· 1 in thread

What if you gave it an image comparison tool that would xor two screenshots to check its work?
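
A rough sketch of what such a tool could look like, assuming both screenshots have already been decoded into equal-sized grids of (R, G, B) integer tuples. The function name `xor_diff` and the return format are hypothetical; a real tool would first load the PNGs with an imaging library.

```python
def xor_diff(a, b):
    """XOR two equal-sized RGB images pixel by pixel.

    Returns (diff_image, changed_count, bbox), where bbox is the
    (min_x, min_y, max_x, max_y) box around all differing pixels,
    or None when the images are identical.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")

    diff, changed = [], 0
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(a, b)):
        row = []
        for x, (pa, pb) in enumerate(zip(row_a, row_b)):
            # XOR each channel; identical pixels become (0, 0, 0).
            px = tuple(ca ^ cb for ca, cb in zip(pa, pb))
            if any(px):
                changed += 1
                xs.append(x)
                ys.append(y)
            row.append(px)
        diff.append(row)

    bbox = (min(xs), min(ys), max(xs), max(ys)) if changed else None
    return diff, changed, bbox
```

The changed-pixel count and bounding box are plain numbers, so a model could read them directly instead of eyeballing a second image.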

poly2it6mo ago

QuadrupleA6mo ago· 1 in thread

Wasn't it now (end of 2025) that Dario Amodei said Claude (or LLMs in general) would be doing almost all programming work?

This article is my typical experience with LLM coding. Endless correction and handholding, and manual cleanup of subtle mistakes. With no long-term learning from them.

Kinda makes me livid, the amount of false hype coming out of the mouths of the stewards of these investor-subsidized LLM companies.

But they're amazing Google replacements, and learning tools. And once in a blue moon they ace a coding assignment and delight me.

Edit: 90% of coding work by June to September 2025: https://www.businessinsider.com/anthropic-ceo-ai-90-percent-...

pwython6mo ago

I mean, Claude got 90% there. The author could contribute 10% of work to fix the alignment and have a finished page.

supern0va6mo ago· 1 in thread

Honestly, if you had shown this article to me even eighteen months ago, I would have been blown away at how good of a job Claude did.

It's remarkable how high our expectations have been steadily creeping.

WhyOhWhyQ6mo ago

thenumpaduser6mo ago· 1 in thread

We are actually spoiled at this point.

echelon6mo ago

[flagged]

a-dub6mo ago· 1 in thread

maybe ask it to use 1990s table based layout approaches?

al_borland6mo ago

Interesting. I just looked at the page source and it is in fact using a table layout. I always assumed it was an image map, which I assume would be even more obscure for the LLM.

Tokkemon6mo ago· 1 in thread

Why do I feel like the old man yelling at clouds that programmers refuse to use their brains anymore?

mrwaffle6mo ago

It's funny: the joke that good programmers are lazy has always held until now.

computersuck6mo ago· 1 in thread

this is just AI brainrot disease

Help, I can't recreate a website with AI! There's no other way, no way I could fix up some HTML code! Believe me, I'm an engineering manager with a computer science degree!

Absolutely disgusting.

derektank6mo ago

I’m pretty confident the author was being sarcastic

th0ma56mo ago· 1 in thread

thecr0wOP6mo ago

dreadnip6mo ago· 1 in thread

Why involve an LLM in this? Just download the site?

ChrisArchitect6mo ago

Yeah, Internet Archive has lots of copies https://web.archive.org/web/20250000000000*/https://www.spac... also

What's with the panicked pleas and need to preserve the site, assuming locally...?

johncoatesdev6mo ago· 1 in thread

You last-minute cancelled coffee with your friends to work on this? I'm not sure how I would feel if a friend did that to me.

fishtoaster6mo ago

Based on the later life updates, I suspect this was being humorous.

> After these zoom attempts, I didn't have any new moves left. I was being evicted. The bank repo'd my car. So I wrapped it there.

soared6mo ago

https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...

PaulHoule6mo ago

(2) The kind of dialog he's having with Claude is a kind of communication pattern I've found never works with LLMs. Sure there is the kind of conversation that goes

   Do X

   ... that's pretty good except for Y

   Great!

but if it is

   Do X

kaijia6mo ago

I almost did it with the plain claude code on a $20 plan, without any skills, by leveraging my domain expertise with the following prompt:

  Your goal is to create an HTML5 website that is pixel-accurate as the screenshot @Space-Jam.png . You can use the image files in the `img` folder. To achieve pixel-level reproduction, first write auxiliary programs to measure the pixel locations of the objects on this page. Then use your measurements to create the page. Consider using feature detection/matching with per-object (x, y) offsets and a global scale factor shared by all objects as the free parameters. Consider using RANSAC for robust estimation. After obtaining an initial estimation, crop the image of each object and use local sparse optical flow for refinement. Use JPG format for visual inspection of intermediate results. Your webpage should be able to scale with the window size.

Note: the footer text size/locations are off. We can probably fix that by explicitly asking cc to write scripts for text bounding box detection.

More details here: https://gist.github.com/jia-kai/802de63816711d67f0a090fa267a...
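
To illustrate the "RANSAC for robust estimation" step the prompt asks for, here is a simplified sketch. It is not the code Claude produced (see the gist for that). It assumes feature matches are already available as pairs of (source point, screenshot point), and it fits a single global scale plus one translation rather than per-object offsets; the names `fit_scale_offset` and `ransac_scale_offset` and the default tolerance are hypothetical.

```python
import random


def fit_scale_offset(p, q):
    """Solve dst = s * src + (tx, ty) exactly from two correspondences."""
    (s1, d1), (s2, d2) = p, q
    vx, vy = s2[0] - s1[0], s2[1] - s1[1]
    denom = vx * vx + vy * vy
    if denom == 0:
        return None  # degenerate sample: the two source points coincide
    wx, wy = d2[0] - d1[0], d2[1] - d1[1]
    # Least-squares scale for a no-rotation model: project w onto v.
    s = (vx * wx + vy * wy) / denom
    return s, d1[0] - s * s1[0], d1[1] - s * s1[1]


def ransac_scale_offset(matches, tol=2.0, iters=200, seed=0):
    """matches: list of ((sx, sy), (dx, dy)). Returns (model, inliers).

    Repeatedly fits a model to two random matches and keeps the one
    that agrees with the most matches to within tol pixels.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        model = fit_scale_offset(*rng.sample(matches, 2))
        if model is None:
            continue
        s, tx, ty = model
        inliers = [
            (src, dst)
            for src, dst in matches
            if (s * src[0] + tx - dst[0]) ** 2 + (s * src[1] + ty - dst[1]) ** 2
            <= tol * tol
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

The point of the random sampling is that a handful of bad feature matches cannot drag the estimate off, which a plain least-squares fit over all matches would allow.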

neuroelectron6mo ago

My web-dev friend saw the original Space Jam site. I asked him what it would cost to build something like that today. He paused and said:

We can’t. We don’t know how to do it.

stwsk6mo ago

>Look, I still need this Space Jam website recreated.

Now that's a novel sentence if I've ever read one.

jdironman6mo ago

[1] https://arxiv.org/abs/2511.09030

voodooEntity6mo ago

I'm really curious how Junie especially will act, and I will give it a try with the very same task you gave it. We're gonna see how it ends :D

DocTomoe6mo ago

I'm more surprised at how close Claude got in its reimagined SpaceJam-site.

dmitrygr6mo ago

Them: AI will take jobs.

The AI: https://chatgpt.com/share/6923df03-7304-8010-bd08-cd335f0ee9...

Madmallard6mo ago

nickdothutton6mo ago

ajasmin6mo ago

I'm actually surprised Claude was able to do that much.

I hadn't even considered handing it a visual mockup to work from, even though that workflow is par for the course for any web design team.

I would assume there must be at least some prior work into locating individual assets in a larger canvas. It just needs to be integrated into the pipeline.

ettsvensktlogin6mo ago

Thanks to this post I now have a deeper understanding as to why. Thank you.

vmg126mo ago

We don't know how to build it anymore

1970-01-016mo ago

tehjoker6mo ago

Hmm you note that the problem is the LLM doesn’t have enough image context, but then zoom the image more?

Why not downscale the image and feed it as a second input so that entire planets fit into a patch, and instruct it to use the downsampled image for coarse coordinate estimation?

hodder6mo ago

badlogic6mo ago

Loved the fun write up. Now that we know that LLM-based vision is lossy, here's a different challenge:

Give the LLM access to the site's DOM and let it recreate the site with modern CSS. LLMs are much better with source code, aka text, right? :)

BiteCode_dev6mo ago

Common at the time before flash took over.

taf26mo ago

In my experience Codex has been better at details like this. But who knows; working with an LLM, much like working with another engineer, is all about how you ask and then how you iterate.

bdcravens6mo ago

A comparison with Codex would be good. I haven't done it with Codex, but when working through problems using ChatGPT, it does a great job when given screenshots.

iwontberude6mo ago

Apropos given Warner Brothers Discovery just sold to Netflix

succo6mo ago

Why not ask Claude to recreate the website by scraping it?

kvirani6mo ago

Sibling post here: https://news.ycombinator.com/item?id=46193412

theahura6mo ago

I was able to get Claude to do this, though it kinda sorta cheated. Blog post describing the output here: https://theahura.substack.com/p/i-successfully-recreated-the...

TLDR:

After trying a few ways to get the stars to line up perfectly, it just gave up and copied the screenshot in as the background image, then overlaid the rest of the HTML elements on top.

I’m tempted to give this a pass for a few reasons.

This obviously covers the original use case that tripped up Jonah.

It also is basically exactly what I asked the model to do — that is, give me a pixel perfect representation — so it’s kind of my fault that I was not clearer.

grouchy6mo ago

Has anyone found techniques to improve Claude's spatial/layout understanding?

It seems Nano Banana Pro can understand layout and spatial relationships well.

geldedus6mo ago

AI-assisted coding is a learned skill. You'll have to do it right, with the right approach and the right tools.

sema4hacker6mo ago

> The total payload is under 200KB.

Just out of curiosity, how big was what you considered Claude's best attempt to be?

system26mo ago

Use Claude for Python. That's it. Don't push it for the frontend, it won't do well.

hestefisk6mo ago

Would be interesting to see whether Gemini could crack this problem.

zitterbewegung6mo ago

baq6mo ago

new LLM benchmark just dropped. 'draw an svg of a pelican riding a bicycle browsing spacejam 1996 on 640x480 ie6'.

yieldcrv6mo ago

Skill issue

computersuck6mo ago

Why not just host a copy from waybackmachine?

phplovesong6mo ago

This basically boils down to AI being unable to "center a div". I see this very often: AI-generated slop has LOTS of "off by one" kind of bugs.
