I use Copilot in VS Code as a typing assistant, but for the most part I find copy and paste into the chat interfaces is the most valuable way to use LLMs for coding.
Even more so with the Claude Artifacts feature, which lets me see an interactive prototype of frontend code instantly - and ChatGPT Code Interpreter, which can run, test, debug and rewrite Python snippets for me.
I found that it was pretty easy to get Cursor into a state where it had decimated the file and each tab-complete suggestion became more and more broken.
Claude was given the task wholesale and did a reasonable job but introduced a subtle bug by moving a tracking call outside of the Ajax promise and I could not convince it to put it right. It kept apologising and then offering up more incorrect code.
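To make that kind of bug concrete, here is a minimal sketch of what moving a tracking call outside of a promise does. It is not the code from that task: `saveOrder` stands in for the Ajax call and `track` for the analytics call, and both names are made up for illustration.

```typescript
// Hypothetical stand-ins: `saveOrder` plays the Ajax request, `track` the analytics call.
const events: string[] = [];

function track(event: string): void {
  events.push(event);
}

function saveOrder(shouldFail: boolean): Promise<string> {
  return shouldFail
    ? Promise.reject(new Error("save failed"))
    : Promise.resolve("order-1");
}

// Intended behaviour: track only once the request has succeeded.
function submitCorrect(shouldFail: boolean): Promise<void> {
  return saveOrder(shouldFail).then((id) => {
    track(`saved:${id}`);
  });
}

// The subtle bug: the tracking call has moved outside the promise callback,
// so it runs immediately, before the request settles and even if it later fails.
function submitBuggy(shouldFail: boolean): Promise<void> {
  const pending = saveOrder(shouldFail).then(() => undefined);
  track("saved");
  return pending;
}
```

Both versions look almost identical in a diff, which is probably why this sort of thing slips through review: the buggy one records a success event for requests that fail.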
I’d say that the original result was good enough that I could pretty much take it and fix it, but only because I knew all the code and libraries well enough. It was only about 150 lines of simple code and by the time I’d finished I was joking with the team that I could have spent all the time wrestling vim macros instead and come out about the same.
What’s your experience been with correctness?
I've spent so much time with them now that 90% of the time I get back exactly what I needed, because I can predict what prompt will get me the right result.
You can just select a block of code, and tell it to (e.g.) “make this function work as a decorator with graceful exception handling” and it will modify your selected code and provide you with a nice diff to apply in one keypress.
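For a sense of what a prompt like that tends to produce, here is a rough TypeScript analogue. It is a sketch, not output from any particular tool: it uses a plain higher-order function rather than `@decorator` syntax, and `withGracefulErrors`, `fallback` and `onError` are names invented for this example.

```typescript
// Wraps any function so that a thrown error is reported and a fallback value
// is returned instead of the exception propagating to the caller.
function withGracefulErrors<Args extends unknown[], R>(
  fn: (...args: Args) => R,
  fallback: R,
  onError: (error: unknown) => void = (error) => console.error(error),
): (...args: Args) => R {
  return (...args: Args): R => {
    try {
      return fn(...args);
    } catch (error) {
      onError(error);
      return fallback;
    }
  };
}

// Usage: JSON.parse throws on malformed input; the wrapped version returns null
// and records the error in `parseErrors` instead.
const parseErrors: unknown[] = [];
const safeParse = withGracefulErrors(
  (text: string): unknown => JSON.parse(text),
  null,
  (error) => {
    parseErrors.push(error);
  },
);
```

Whether swallowing the exception is actually "graceful" depends on the caller, so this is the part of the diff worth reading before you hit apply.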
Or you can chat with the LLM directly in VS Code, with every snippet easily applied with a click. It can even catalog your codebase in a vector DB for really easy RAG:
“Create a view that allows premium users to view their billing history”
“Okay, I’ve found a function called get_premium_status in auth/user_profile.py. I’ll use that to create api/user/billing_history.py”
(Which then shows the code it will add or change, separated by file, with the option to apply the change)
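The retrieval step behind an exchange like that can be sketched roughly as follows. This is a toy under loud assumptions: real tools embed the code and query a vector DB, whereas this ranks files by plain word overlap, and `SourceFile`, `tokenize` and `retrieve` are all hypothetical names.

```typescript
// Toy retrieval: rank files by how many distinct query words appear in their
// path or content. Word overlap stands in for embedding similarity here.
interface SourceFile {
  path: string;
  content: string;
}

// Lowercases, splits on anything that is not a letter or digit (so snake_case
// identifiers are broken into words), drops very short words and duplicates.
function tokenize(text: string): string[] {
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 2);
  return Array.from(new Set(words));
}

// Returns up to `topK` files that share at least one word with the query,
// best match first.
function retrieve(query: string, files: SourceFile[], topK = 2): SourceFile[] {
  const queryWords = tokenize(query);
  return files
    .map((file) => {
      const fileWords = tokenize(`${file.path} ${file.content}`);
      const score = queryWords.filter((word) => fileWords.includes(word)).length;
      return { file, score };
    })
    .filter((ranked) => ranked.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((ranked) => ranked.file);
}
```

With the prompt above, a file defining `get_premium_status` matches on the word "premium", and the files returned are what gets pasted into the model's context before it writes anything.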
Then when I tried plain Claude the delta felt way way too small for me. The eng in me kicked in and I started hacking away on continue.dev + claude3 and the delta was very very little. Plus I could bring in more functionality with extra visibility in my own hacks the way I wanted it (macros) which I couldn't with cursor.
If I need to do something more high level or that requires multiple files I still copy and paste to Claude / ChatGPT.
So I built a small tool to streamline this copy-paste process and manage source code context to include in the prompt: https://prompt.16x.engineer/
Would recommend using a tool to save your codebase as an upload-able file.
Parent comment (simonw) has written a tool. I use another called ai-digest. Pick any one. It solves the 'my model doesn't understand my codebase' problem.
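The core of such a tool is small. Here is a minimal sketch of the idea, not how ai-digest or simonw's tool actually work: the `===== path =====` header format, the default skip pattern and the `buildDigest` name are all assumptions, and reading the files from disk is left out.

```typescript
interface SourceFile {
  path: string;
  content: string;
}

// Concatenates files into one text blob, each preceded by a header naming its
// path, so a model can tell where one file ends and the next begins.
// Files whose path matches `skip` are dropped.
function buildDigest(
  files: SourceFile[],
  skip: RegExp = /(^|\/)(node_modules|\.git)\//,
): string {
  return files
    .filter((file) => !skip.test(file.path))
    .sort((a, b) => a.path.localeCompare(b.path))
    .map((file) => `===== ${file.path} =====\n${file.content.trimEnd()}\n`)
    .join("\n");
}
```

The resulting string can be saved as a single file and uploaded, or pasted straight into the chat.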
>Give it your openAI key if you want o1-preview.
o1 sounds slow and expensive for the usual conversational AI coding and autocomplete stuff. But it might be the right model for scaffolding new projects and doing non-trivial refactorings of existing projects. Does Cursor support setting different models for different tasks?
As always, the requirements planning and communication is the hardest part of coding.
Maybe 1 out of 10 chats was actually time-saving
We're using Copilot at work. When we were evaluating it, the question we asked our test group was: how much time does it save you per week? Most people arrived at estimates of 1-4 hours saved a week, especially when banging out new code in a code base with existing patterns and architecture. That was a good enough tradeoff to buy it.
Like, I recently got a Terraform provider going for one of our systems. Copilot was useful for generating the boilerplate code for resources, so I just had to fill in the blanks with the actual logic. Or you can just hand it sample JSON and it creates Go structs for it to use in an API client, and generates the bulk of the methods that access the APIs with them. It's also decent at generating test cases for edge cases.
It doesn't enable me to do things I could not do before, but it saves me a lot of typing.
Well maybe I wouldn't test my functions with all of these edge cases because that's a lot of stuff to do, heh.
Also worth noting: some languages are just a natural fit for Copilot. One of them is Go, with its constant `if err != nil` incantations.
i had cursor then ran out of free credit. i had github copilot then found it too expensive.
given i'm a software engineer i'm basically looking for free. a.i. is like the t-shirt of the digital world, as in, i don't care where it comes from, just give me a free one when i use your product.
It's very fast though.
when i want to manual code i use visual studio code.
You probably got the wrong impression from their pricing plan page (https://www.cursor.com/pricing), which only mentions "Enforce privacy mode org-wide" for their $40/user/month Business tier. The point there is that the Business tier enables employees to be forced to use privacy mode at the organizational level. It doesn't mean that you must purchase the Business tier to enable privacy mode yourself.
My only nit is they override common keyboard shortcuts in IntelliJ and VS Code. Like Alt + Enter in IntelliJ.
I use Neovim, and it’s unfortunate that the plugin isn’t quite as full-featured as their plugins for other editors (e.g. VSCode). It works great for completions, but the “chat” functionality opens in a browser.
Still, it’s well worth the license cost. The completions I get from it save a ton of time, and are often much longer than what I’d get from my normal LSP - and more importantly, they’re generally “correct”.
There are definitely instances where Claude can’t solve a problem and I have to hand write the code and explain it.
It definitely gets confused with designing multiple modules together.
But there are times when it’s simply brilliant. I needed a specific inheritance pattern and Claude introduced the curious recurring template pattern, which had escaped my career. It’s not something I’d use in business code, but in this use case it was perfect.
Claude also helped me build a bi-directional graph to host my game world.
And Claude is phenomenal at unit testing whole projects, allowing me to find design flaws very fast.
My overall experience is that if you know what you’re building, GenAI can be extremely powerful, but if you don’t, I could see both good and bad results coming out.
A less experienced developer using it won’t know when to steer the process in a certain direction; more senior developers will.
Only uses CLI, so you have two contexts you work in. One is you manually writing code just like you are used to. The other is a specific context of files that you inform the LLM you are working on.
By creating a separate context, you get much better results when you keep the tasks small.
Specifically, I use it with Claude 3.5 Sonnet.
If you’re looking for “type in some English text and get fifty lines of code written”, Cursor’s chat is the best I’ve tried. But I’m not a fan of that workflow, so take my opinion with a grain of salt on that.
It’s a straight-up pair programmer. Point it at your GitHub repository and then just converse with it. It drives, you look over its shoulder. Imagine OpenAI o1 connected to your GitHub repo, producing diffs or PRs on command, with a ChatGPT-style view that has tabs for switching between the conversation and the diff.
Also, for front end, there is v0.dev — which is great for whipping stuff together.
fs.promises.readFile(in_fi|
You probably want 'utf-8' as a second parameter here, cause from the context I infer it’s a text file.
Also, I see there’s a stream set up above. You can simply pipe a file stream to it, if that’s why you’re reading it: <link>
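For reference, the two suggestions in that imagined reply would look roughly like this in Node. The `readFile` and `createReadStream(...).pipe(...)` calls are standard Node APIs; `readText`, `pipeFileTo` and the parameter names are made up for the sketch.

```typescript
import { createReadStream, promises as fsp } from "node:fs";
import type { Writable } from "node:stream";

// Without an encoding, readFile resolves to a Buffer; with "utf-8" as the
// second parameter it resolves to a decoded string.
async function readText(inputPath: string): Promise<string> {
  return fsp.readFile(inputPath, "utf-8");
}

// If the only goal is to forward the file to an existing stream, pipe a read
// stream into it instead of buffering the whole file in memory first.
// pipe() returns the destination stream.
function pipeFileTo(inputPath: string, destination: Writable): Writable {
  return createReadStream(inputPath).pipe(destination);
}
```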
Is there something like this?
Kinda different from your specific use case, but should give some hints on which one would serve you best, and is an interesting watch:
I often submit my code to ChatGPT for review, but 99% of the issues it finds make very little sense and the other 1% are usually just wrong, so I'm kind of disappointed. Maybe I'm missing a good prompt or something.
llm of choice
1) Claude/ChatGPT (copy and pasting back and forth)
2) Cursor
- Copilot using Visual Studio and VS Code
- ChatGPT Plus / Claude, copy/pasting back and forth
- Cursor, free trial and w/ Claude api key
Copilot was like 30/70 good-to-bad. The autocomplete hijacks my mind, creating a mental block on my ability to write the next token of code. The suggestions were occasionally amazing, but multiple times it introduced a subtle bug that I missed and later spent hours debugging. Not really a time saver. I quit Copilot just as they were introducing the embedded chat feature, so maybe it's got better.
In Visual Studio, I thought Copilot was garbage. The results (compared to using in VS Code) were just awful. The VS extension felt unrefined and lacking.
ChatGPT / Claude - this is a decent way to get AI programming. Multiple times it fixed bugs for me in a way that simply blew me away with its ability to understand the code and fix it. Love its ability to scaffold large chunks of working code so I can then get busy enhancing it for the real stuff. Often it will suggest code using an older version of a framework or API, so it's necessary to prompt it with stuff like "For Next.js, use code from v14 and the app router". Some thought has to go into the prompt to increase the chances of getting it right the first time.
Cursor - ah, Cursor. Thus far, my favorite. I went through my free trial and opted into the free plan. The embedded sidebar is nice for AI chat - all of the benefits of using ChatGPT/Claude but keeping me directly in the "IDE". The cost is relatively cheap when hooked to my Claude api key. I like the ability to ask questions about specific lines of code (embedded in the current window), or add multiple files to the chat window to give it more context.
Cursor does a great job at keeping you in the code the entire time so there's less jumping from Cursor to browser and back.
Winner: Cursor
As a C#/Java backend developer, you might not like leaving IntelliJ or Visual Studio to use Cursor or VS Code. Very understandable. In that case, I'd probably stick to using ChatGPT Plus or paid Claude. I suggest the premium versions for more reliable access to the services and higher limits on their flagship models.
The free versions might get you by, but expect to be kicked out of them from time to time based on system demand.
After trying other options like Continue + deepseek-v2, I found that the expense of hosting a bigger local version of LLM is too high to match CodePilot's performance.
Played with Continue + Yi-Coder too - requires a lot of time to clarify requests to generate valid code.
I made the decision to stick with CodePilot.
Aider and Cursor are worth a try if you're interested in multi-file edits. I think both are interesting (ergonomics-wise), but the current SOTA models do not perform well.
For single file edits or chat, I would recommend Cody (generous free tier) or some self-hosted alternative.
It is meant to be an autocomplete-on-steroids-ish feature where you will have to read through all the code it generated, because at the end of the day it’s a black box you can’t trust.
But for low intelligence easy tasks it’s generally a fine product.
I feel like most AI coding assistants are though.
(Not affiliated with the company, it was called CodiumAI earlier)
ChatGPT/Claude for larger chunks
Aider for multi-file edits
It depends entirely on the subjective experience because everyone experiences things differently.
Don't expect any other offerings to change your mind. We are years away from AGI or anything generally useful in this area. It's only a matter of time until the rest of the world realizes this and stops the hype.
----
Like many here, I’ve been using both GitHub Copilot on VS Code and copy-pasting from a ChatGPT window.
Copilot is super hit or miss. Honestly, it rarely spits out anything useful unless I’m writing really repetitive code, where it might save me a few keystrokes. But even then, I could often just use some "Vim tricks" (like recording a macro or something) and get the same result. The built-in chat is a total waste of time... sigh.
ChatGPT has been way more helpful. But even with that, I often feel like it’s just a really fancy rubber duck or a glorified search engine. Still, it's way better than a Google/Bing search sometimes. I’ve been using a prompt someone here shared (maybe this one verbatim? [0] I need to shop for prompts again :-p) and that could be making a difference... I did not A/B test prompts but at least ChatGPT stopped apologizing so much lol.
I do want to try Cursor and Zed AI since I’ve heard good things. I also saw a recent post here about layouts.dev [1], and it looks really impressive. I’ve been asking ChatGPT for nice Tailwind CSS patterns, and the workflow in that tool seems really streamlined and nice for web design (only caveat is... I'm not really interested in NextJS right now #shrug). BTW, nobody ever seems to talk about Gemini? I personally almost never reach for it, for whatever reason...
----
Now for the part about scripting your LLM interaction yourself... I’ve been working on a passion project lately, a programming languages database. I stumbled across this cool pattern [2] where I write code that generates data, and that data can then be used to generate more code. (Code is data is code, right?). I used OpenAI's Structured Output [3] and after massaging TypeScript types and JSON Schemas for a while, it generated pretty easy to digest output.
The interesting part is that you can use this setup to feed prompts into ChatGPT in a much easier way. Imagine something like this:
const code = SelectThingfromCodeBase(); // Not necessarily SQL! Perhaps just concatenating your files as ppl mention here.
const answer = sendChatGPT(promptFrom(code));
const newCode = generateCodeFrom(answer);
profitFrom(newCode); // :-p
I think this pattern has a lot of potential, but I need to play around with it more. For now, I’ve got a super crude but working example of how I pulled this off for my little programming languages database (coming soon, hopefully :-p). I did this so that I or a contributor can run a script to generate the code for a pull-request to add more data to my project.
NOTE: my example isn’t very... "meta" since the data<->code thing doesn't really describe the project itself. To expand on this idea, we might need to dust off some of the old declarative tools like UML or 4GLs or come up with something inspired by those things. If this sounds vague, it’s because it is, but maybe it makes some sense to someone here :-p.
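For anyone who wants to try the select / prompt / generate loop, here is a minimal runnable sketch of it. It is not the code linked at [2] and it does not call OpenAI's Structured Output API: every name is hypothetical, the `{"path", "content"}` reply shape is an assumption, and `askModel` is injected so you can plug in whatever chat API (or a stub) you like.

```typescript
// `askModel` is whatever sends a prompt to an LLM and returns its text reply.
type AskModel = (prompt: string) => Promise<string>;

interface SourceFile {
  path: string;
  content: string;
}

// Step 1: select code. Here that is plain concatenation of files.
function selectCode(files: SourceFile[]): string {
  return files.map((file) => `// ${file.path}\n${file.content}`).join("\n\n");
}

// Step 2: turn the selection into a prompt that asks for a structured reply.
function promptFrom(code: string, task: string): string {
  return `${task}\n\nReply with JSON: {"path": string, "content": string}\n\n${code}`;
}

// Step 3: validate the model's answer and turn it back into a file to write.
function generateCodeFrom(answer: string): SourceFile {
  const parsed: unknown = JSON.parse(answer);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("answer is not a JSON object");
  }
  const { path, content } = parsed as Record<string, unknown>;
  if (typeof path !== "string" || typeof content !== "string") {
    throw new Error("answer is missing string fields 'path' and 'content'");
  }
  return { path, content };
}

// The whole loop: code in, prompt out, answer in, new code out.
async function runPipeline(
  files: SourceFile[],
  task: string,
  askModel: AskModel,
): Promise<SourceFile> {
  const answer = await askModel(promptFrom(selectCode(files), task));
  return generateCodeFrom(answer);
}
```

Keeping the model call behind a function parameter means the rest of the loop can be tested with a canned answer, and the validation in step 3 is where a schema-enforced response would save the most work.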
---
0: https://www.reddit.com/r/ChatGPTPro/comments/15ffpx3/comment...
1: https://news.ycombinator.com/item?id=41785751
2: https://github.com/EmmanuelOga/plangs2/blob/main/packages/ai...
3: https://platform.openai.com/docs/guides/structured-outputs