Yes, AI can’t see; it only understands numbers. So tell it to use ImageMagick to compare the screenshot to the actual mockup, tell it to get less than 5% difference, and don’t use more than 20% blur. Thank me later.
I built a whole website in like 2 days with this technique.
Everyone seems to have trouble telling AI how to check its work, and that’s the real problem imho.
Truly if you took the best dev in the world and had them write 1000 lines of code without stopping to check the result they would also get it wrong. And the machine is only made in a likeness of our image.
PS. You think Christian god was also pissed at how much we lie? :)
Here is the line from my Claude Code config to get something like this. Keep in mind I didn't use the Playwright MCP with this particular implementation, but it is my preferred method currently.
CRITICAL - When implementing a feature based off of an image mockup, use Google Chrome from the Applications folder, set the browser dimensions to the width and height of the mockup, capture a screenshot, and compare that screenshot directly to the mockup with ImageMagick. If the image is less than 90% similar, go back and modify the code so that the website matches the mockup more closely. If a change you make makes the similarity go down, undo it and try something else. Be mindful that the fonts will never be laid out exactly like the mockup, so use blur at a max of 10% to see if the images are a closer match. If you spend more than 10 cycles screenshotting and comparing, stop and show the user how similar they are, mentioning any problems.
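If it helps to see it concretely, here is roughly what that comparison step boils down to as a script. This is only a minimal sketch, assuming ImageMagick 7's "magick" binary is installed and the screenshot was captured at the mockup's exact dimensions; the file paths and the blur sigma are placeholders, not something from my actual setup.

    import subprocess

    def mockup_similarity(mockup_png, screenshot_png, blur_sigma=2.0):
        """Rough 0..1 similarity between a mockup and a screenshot of the page."""
        # Blur both images a little first, since fonts never rasterize identically.
        blurred = []
        for src, dst in [(mockup_png, "/tmp/mockup_blur.png"),
                         (screenshot_png, "/tmp/shot_blur.png")]:
            subprocess.run(["magick", src, "-blur", f"0x{blur_sigma}", dst], check=True)
            blurred.append(dst)
        # compare prints "absolute (normalized)" RMSE to stderr and exits non-zero
        # when the images differ, so that exit code is not treated as an error here.
        result = subprocess.run(
            ["magick", "compare", "-metric", "RMSE", *blurred, "null:"],
            capture_output=True, text=True,
        )
        normalized = float(result.stderr.split("(")[-1].rstrip(")\n"))
        return 1.0 - normalized

The agent loop then just re-runs this after each code change, keeps an edit only if the score goes up, and stops once it clears the threshold or runs out of cycles.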
The more text there is, the harder it becomes, and that’s why we really need the blur, because fonts are almost always rendered differently.
There are some interesting issues that probably relate to your workflow, like the nav links being different sizes, and the icons too. And the resolution of some of the images/icons on a MacBook is poor. But I suspect that's because a simple ImageMagick raster diff will fuzz over those kinds of differences.
I wonder if you can make some tweaks or find a better representation than pure raster screenshots to fix this. Can't really deal in vector images because AI sucks at outputting those, and you can't print a web page to SVG.
There was a super niche website framework a while ago that only used SVG. Would be funny if that kind of thing takes off just so AI can do better.
I will grant you that this is more tasteful than most of the AI sites I see. It’s a good looking little site but nothing here screams, “AI really accelerated this.”
1. The main page asks for an email to be notified when the hoodie is available to buy, but I can add the hoodie to my shopping cart and proceed to check out.
2. The product page mentions a 6’ model but there is no model in the images.
3. The checkout page says “there are no payment options, please contact us”.
I built this frontend with Sonnet 4.5 last Fall and I’m about to “launch” it
I used only prompts, but those prompts included ChatGPT’s research on Memphis design ;)
Using codex for front end design is like asking the valedictorian mega nerd to paint your portrait. Gemini and Claude are both artists.
Very bad results—as expected from an AI.
Nothing to brag about here.
- Text isn't selectable on the page.
- The tooltip in the "day 1" to "day 14" cards gets cut off by the border (I see this mistake ALL the time with AI-generated frontends btw)
- It's sparse and very long. I think the information could be condensed in half the size, and it would improve the presentation. This is personal preference though.
- The playbooks' "mark complete" are not persisted on reload or navigation.
All in all, it's functional and quite decent. I agree with the other people saying it looks generic, but I disagree on it being necessarily a bad thing for this kind of product.
I know nothing about pools so I can't comment on the accuracy of the playbooks. It's nice that there's so many of them, but given the LLM vibe of the text I'm slightly suspicious.
What you have got as output is what I also get as output from LLMs: they suck the soul out of everything. Which is fine in the right context, but that’s not what we as a species should strive for in design, imo.
I am still trying to learn how to wrangle Claude properly, but I have this Claude.md[1] that I used to make the website. In particular, see one of the last rules about using ImageMagick for comparison.
I haven’t touched this website in a bit (waiting on the client), so now I use the Playwright MCP for the screenshots and the browser interactions; a rough non-MCP sketch of that step is below.
[1] https://github.com/panda01/hartwork-woocommerce-wp-theme/blo...
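For anyone who wants the screenshot step without the MCP, here is a minimal sketch of the same idea using Playwright's Python API, with the viewport pinned to the mockup's dimensions. The URL, output path, and dimensions are placeholders, not values from the repo above.

    from playwright.sync_api import sync_playwright

    # Placeholders: point this at your own dev server and the mockup's real size.
    MOCKUP_WIDTH, MOCKUP_HEIGHT = 1440, 900

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": MOCKUP_WIDTH, "height": MOCKUP_HEIGHT})
        page.goto("http://localhost:3000")
        # Viewport-sized shot (not full_page) so the dimensions match the mockup
        # for the ImageMagick comparison.
        page.screenshot(path="/tmp/screenshot.png")
        browser.close()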
I built https://bridge.ritza.co (demo@example.com username and password if you don't want to sign up) as a trello/linear replacement without looking at a single line of code and it's both good enough for me and doesn't have the obvious AI frontend 'look' as it was copying from the starter.
Check out the other reply and scroll down a bit…
Share it. I used Claude earlier to test out its design capabilities and what I got as output was flat and tasteless.
I don't mind working through a lot of the UI myself, but it's definitely a shortcoming IMO... that said, being able to scaffold boilerplate or testing harnesses for complex UI has been really nice overall. I came up with the following component as an image zoom component, where I can separately control the zoom in/out, in under a couple hours... it took longer to set up the CI/CD stuff than the primary component logic.
The reason for the post is that, even without the actual website, one should be able to envision the technique and how it may or may not work. Also, if you look above, I recently added links to the Claude.md for another thing I was working on for a friend that also had to solve this problem.
Just want to give people the tools to use AI well, based on my own findings.
Feeding the model images from my local computer sounds, given my experience with these tools, like a recipe for having it over-optimize for the wrong end device.
It goes completely out of the window if the browser window isn't the exact size of the mockup.
You might charitably say that "pixel perfect" means the implementation intersects with the design comp at some specific dimensions, but then where are the extra rules coming from?
It's an archaic term that conflates the artifact produced by an incomplete design process (an artist's rendering of what the web page might look like) with the actual inputs of the development process (values and constraints).
Here's an example that I personally encountered: what if you have a <h1>Text</h1> and it has a certain left margin. Then another heading except it has a nested button component (which internally comes with some padding). Then the "Text" in both aren't aligned from section to section and it is jarring.
Perhaps the results would be different if you had a specific novel design or interaction in mind, and you wanted the AI to implement that exactly as you wanted.
edit: My point is proven by the other examples in this thread. Same format, same "feature cards", etc. https://bridge.ritza.co/ https://poolometer.com/
The landing page looks like every other AI slopped product page out there.
> Your data,
> irectl in you
> spreadsheets
I'm guessing the third word is "directly"? The D is cut off. And the grammar is wrong, should be "in your spreadsheets" - maybe that is another letter cut off? Go back to human devs.
I've also used AI to build frontends that I'm more than satisfied with, and I think it can "see" perfectly fine. The frontier models are multi-modal and pretty good at vision. You can hook up your coding harness to your browser which will take screenshots of your rendered frontend and modify the code accordingly.
Was about to say the same thing
When I spend a lot of time in planning mode, I tend to get a lot more value out of the output and have to redirect far less. It also helps to establish your API interfaces, reference points, interactions, behaviors and even a lot of the test harnesses ahead of development cycles. You need to define a lot more ahead of letting it go.
I would say that I'm getting maybe 2.5x the value and 5-10x the output from AI... by value, I mean what the end user/customer cares about... by 5-10x I'm including the increased documentation, testing, etc.
I think it's the other way around. AI amplifies your software development skills. If you suck at software development, AI will follow your prompts and feedback and of course it will output an unmaintainable mess that barely works.
Here we are, listening to people who can barely put together a working website complaining that AI can barely put together a working website.
Honestly, I'm probably one of the biggest skeptics when it comes to GenAI - but at least for music, the recent models (as in the past year) do not suck. They are actually really, really good for what they are.
I have yet to hear anything truly original produced by those models. They seem to converge to the mean and end up sounding very commercial, very average - but average in the sense of "professional music". Suno can generate music that would have taken real people years of learning and thousands of dollars of equipment to make and produce, and it's pretty much ready for airplay - most listeners will not bat an eye.
Hell, these "AI artists" have been booked to festivals, since people can't hear the difference, and are enjoying the music.
I figure it will go the same way in other fields. The average consumer loses track of what's human made and what's AI made, and frankly won't care. The people "left behind" are the artists, craftspeople, etc. that are frustrated it came to this point.
Our idea of nostalgia was not that long ago. Also, it could be generated on open-weight, copyright-free local models that are super efficient in the future :P
I think that was the point being made; if you're looking at it from the perspective of being really good at something, its tendency towards an averaged result is substandard.
Copying something that exists isn’t particularly difficult. It may require immense skill and incredible dexterity in the case of some musical instruments, but it doesn’t really require much more than time, patience and the ability to follow instructions. The blueprint already exists. With LLMs we now have the ability to skip the time and patience parts of the equation, we can produce mediocrity more or less instantly.
I don’t see this as particularly different from what happened at the turn of the last century and beyond, with machines being able to sew faster, carve wood and metals at a higher pace and precision, moving folks and goods between geographical points faster than ever before, etc. etc. It’s not much different from the IKEAs of the world making mediocre copies of brilliant designs, making fortunes selling to the large masses that think good enough is just great. Because honestly man, most of the time it probably is.
I’m not surprised people go to concerts to hear a recording made by an LLM either. People have been going to see DJs sling records for decades. It’s not the music, or the artist, it’s the community. Beyoncé is an amazing singer, but people don’t necessarily come to her shows to see just her, they come to see everyone else. They might say they want to see her, but they already have a thousand times in tickelitock and myfacespacebookgrams. They come to feel connected to something, to experience community.
LLMs are incredibly good at churning out stuff. Good stuff, bad stuff, just a ton of stuff. Nothing original but that’s ok, most things pre-LLMs weren’t either. We just have more of it now, and fewer trees. The creatives that are able to harness these tools will be able to do more with less. (Ostensibly at least, until the VC subsidies… subside.) Because they are creative they might be able to form an original idea and string together enough mediocrity to realize it. They’ll probably get drowned out in a sea of mediocre copies in the end, but that’s just the same as it always was. It’s just faster now.
The platform owners and hardware manufacturers will remain king until the technology can run on my TI calculator, maybe we’ll get there before the VC money runs out. No wonder Nvidia’s been killing it. Creativity and originality will return once this bubble bursts I’m sure, the world has this amazing ability to correct itself, even if violently so at times. Or we all die perhaps. Either way, all we can do I suppose is ride this wave of mediocrity into the sunset. :o)
The people I work with who find "AI" makes every part of their life easier were just bad at everything to begin with. The people who find "AI" making specific tasks easier have specialized skills and were previously relying on less specialized people for some support.
Nah, just at that something :-)
Don't get me wrong, AI can definitely be used as a tool by someone who knows what they're doing to avoid boilerplate. But anyone using it in a domain they aren't already an expert in will unknowingly accept AI f-ups.
At the things I’m good at, AI is a huge boon. At the things I’m bad at, AI has little to offer.
For me, for Frontend, AI is great, because I know exactly what to do, so it’s very easy to talk it into doing it for me. I know what the problem is, I know what the solution is, and I have the language to communicate both. All that’s left is the trivialities of the implementation, I’ve already done all the hard work in my head.
The "consultants wrote sloppy code" is one of those memes that never die.
The only thing that differentiates consultants from you is the contract type. All broad strokes accusations are just a consequence of in-house employees feeling threatened by their presence and having a vested interest in portraying themselves as infinitely better than any prospective replacement. You also see the same attitude in junior devs who complain that everyone else's code is shit, but the mess they themselves created is always justifiable and understandable.
If you were moved off your project right now and someone else was placed in your spot under probation, I guarantee that your work would be extensively criticized for being an unmaintainable pile of hacks.
Your comment is one of those that feel intuitively right, because what you say makes sense... until faced with reality.
Most consultants that most permanent employees are likely to find are those that will do a crappy job, then be gone when shit hits the fan. Source: anyone who's ever worked with them, myself included. Actually, both sides of the desk. They tend to do crappy jobs because those are the incentives they have.
You can argue till you're blue in the face, but your theory cannot push aside the actual experience of many if not most of us.
Of course, the occasional scenarios where the consultants are solid and doing top-notch work exist, but what matters is the majority of what happens... and it's not good.
So the meme won't die, because it reflects reality.
It's not passable even slightly.
Everybody with experience knows that FE has always been "harder" than BE - but with BE the stakes are higher since it's the business. FE is often "just UI", and despite that being very important too, you can throw it away and start over a lot more easily with a UI than you can with a BE platform.
I digress, AI sucks fucking dick at UI.
My experience developing on a fairly standard SAAS is that AI (Figma, Claude) has produced UIs which look better, are more accessible, and get through review and QA and product approval, faster and more reliably than any of the FE developers I've worked with recently can.
A quick profile on Safari shows some layout recalc happening regularly, but surely that shouldn't cause this bad of perf...
The last time I found something like this, it was because of hundreds of box-shadows.
Edit: Sure enough, this cures Safari:
*, *::before, *::after { box-shadow: none !important; background: none !important }
It's a combination of box-shadows and gradients. Edit 2: Ah, they're using shadow DOM for the img reflection, so we can't affect it. Good gravy is the shadow DOM stuff overwrought, it's 87 elements all told, just for one img.
Meanwhile, elsewhere in this discussion, you can read examples of people who have used AI entirely to build their product. Previously they'd have had to hire a developer or become one themselves. This suggests to me it is indeed replacing developers - just not where you are choosing to look.
"Decent" is still way too high a compliment for such slop.
Anyway.
Do people get the impression that LLMs are worse at frontend than at other things? I'd think it's the same as with other LLM uses: you benefit from having a good understanding of what you're trying to do, and it's probably decent for making a prototype quickly.
To quote the article:
1. "It trained on ancient garbage" which is the by product of massive churn and this attitude leads to even more churn
2. "It doesn't know WHY we do things" because we don't either... even the paradigms used in frontend dev have needlessly churned
My fix? I switched from React/Next to Vue/Nuxt. The React ecosystem is by far the worst offender.
Good design is not always logical. Color theory, if followed, results in pretty bad experiences. And interestingly, good design can't always be explained in a natural language.
Main thing is, it's very hard to get AI to have taste, because taste is not always statistically explainable.
The best I've gotten to is to have it use something like ShadCN (or another well-documented package that's part of its training) and make sure that it does two things: it only runs the commands to create components, and it does not change any stock components or introduce any Tailwind classes for colors and such. Also make sure it maintains the global CSS.
This doesn't make the design look much better than what it is out of the box, but it doesn't turn it into something terrible. If left unprompted on these things, it ends up mixing fonts with absolutely no idea whether they look good or not, bringing serif fonts into body text, and mixing and matching colors that would have looked really, really good in 2005 but just don't work any more.
What are you using for the frontend? React component libraries?
Or is it that AI is not as creative?
Or do you mean something else?
If you are going to criticize LLMs for being out of date, at least make sure your understanding isn't out of date.
If I want good abstractions, sure, I can set up approvals and babysit it with reprompting, because it will do stupid things that an experienced engineer wouldn't. But the spaghetti also works in the sense that it takes the input types and largely correctly maps them to the output types.
That doesn't embarrass me with customers because they never see the internals. On the front end, obviously, they will see and experience whatever abomination it cooks up directly.
I don't give it 100% responsibility for front-end tasks, but working together with AI I feel like I'm really in control of CSS in a way I haven't been before. If I'm using something like MUI, it also tends to do a really good job of answering questions and making layouts.
Thing is, I don't treat AI as an army of 20 slaves that will get "shit" done while I sleep, but rather as a coding buddy. I very much anthropomorphize it with lots of "thank you" and "that's great!" and "does this make sense?", "do you have any questions for me?" and "how would you go about that?", and if it makes me a prototype of something I will ask pointed questions about how it works, ask it to change things, change the code manually a bit to make it my own, and frequently open up a library like MUI in another IDE window and ask Junie "how do I?" and "how does it work when I set prop B?"
It doesn't 10x my speed and I think the main dividend from using it for me is quality, not compressed schedule, because I will use the speed to do more experiments and get to the bottom of things. Another benefit is that it helps me manage my emotional energy, like in the morning it might be hard for me to get started and a few low-effort spikes are great to warm me up.
The main limitation, I think, is that they're blind as a bat and don't understand how things actually look and render in the end. Even the best VLMs are still complete trash and can't even tell if two lines intersect. Slapping an encoder on post-training doesn't do anything to help with visual understanding; it just adds some generic features the text model can react to.
I will say though that multimodal capability varies between models. Like, if I show Copilot a picture of a flower and ask for an ID, it is always wrong, often spectacularly so. If I show them to Google Lens the accuracy is good. Overall I wouldn't try anything multimodal with Copilot.
For that matter, I am finding these days that Google's AI mode outperforms Copilot and Junie at many coding questions. Like, faced with a Vite problem, Copilot will write a several-line Vite plugin that doesn't work, while Google says "use the vite-ignore attribute".
The design is still a problem though, precisely because I am not a designer. I don't know what's actually good, I only know what's good enough for me. I can't tell the difference between "this is actually good" and "this is vibe-designed slop" but I have enough experience to at least make sure the implementation is robust.
but people writing shitty node.js code might beg to differ.
Unrelated, but as a long time front-end dev, FUCK THOSE.