Tactical Stealth Alpha. Launching in 2025.
They aren’t getting a better or worse deal because they are in the autocomplete business or the avocado business
Marginal improvement over GitHub Copilot or Cursor?
My guess is it was the second reason:
> wanting to grab a press cycle ahead of competitor announcements to build the waitlist.
It's both an external and an internal hype play, so everybody knows "we are changing the world", they just don't know it yet.
It's whatever, doesn't mean anything. The only hook I found is the "we got money from Eric Schmidt and got some hyper-local celebs from big companies like Google" line.
All I see is a small constellation of "big" names and an uninspiring "fine-tuned “industry-leading” model". Not poo-pooing anything, I'll just wait for the actual product.
Thank me later.
I found it amazing for 1 thing: filling up verbose configuration like Terraform or Kubernetes manifests.
For code, it’s horrible, to the degree that I spend more time rejecting Copilot suggestions or having to extensively modify the ones I accept.
I stopped using it a couple of months ago.
It's also great for picking up new packages to do 'simple' tasks like reading in a specific file format.
Saves tons of time
It’s awful but in a helpful way. Like it will suggest “if $argument is empty, this function will return an error”, which is not true because $argument is an int, but it does do something wonky if you give it negative numbers so I remember to add that.
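A minimal sketch of that kind of edge case, assuming a made-up Python helper `first_n` (hypothetical, not the actual function in question). The argument is an int, so it can never be "empty", but a negative value silently does something odd:

```python
def first_n(items, count):
    """Return the first `count` items (hypothetical helper for illustration)."""
    # A negative count silently slices from the end instead of failing:
    # items[:-1] drops the last element rather than returning nothing.
    return items[:count]


def first_n_checked(items, count):
    """Same helper, with the negative-count edge case made explicit."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return items[:count]


print(first_n([1, 2, 3, 4], 2))   # [1, 2]
print(first_n([1, 2, 3, 4], -1))  # [1, 2, 3]
```

The wrong suggestion ("errors on empty input") is still what prompts adding the `count < 0` guard.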
So like an intern/new hire asking what they think are basic questions, which lead you to realize an edge case you missed?
Knowing how to trigger it properly is key though. Just like with all LLMs, the context it has determines the quality of its output.
But definitely our team would be hit pretty hard if we had to drop copilot now.
Is the source the stats that it collects, like "you have accepted 60% of the recommendations"? I've noticed those are rather inflated. I've yet to see one that decrements its count even if you delete the suggestion right after accepting it, much less if you have to go back later and delete a bunch of stuff it did subtly wrong.
The technique to using Copilot (as with any auto-completion) is to keep typing until you get a completion that is equal to what you were going to type. So yes you have to keep rejecting things but the rejection doesn’t take any more effort than normal auto completion.
For me, having to type 90% of the code I wanted in order to get the last 10% completed wasn’t worth the $20/month subscription.
Much of the value of Copilot is that it helps you navigate new things that you are not familiar with. For example, I use scipy/numpy a few times a month, so I am competent but not an expert by any means. Copilot is incredibly useful to suggest ways of doing things that are more idiomatic than what I was planning to write.
Another way that it is useful is that it massively reduces logical bugs such as off-by-one errors that aren’t caught by typechecking, since it’s unlikely that both me and copilot will make the same off-by-one error at the same time. If only copilot makes an off-by-one error then I will catch it because the suggestion will be different to what I expected. If I am about to make an off-by-one error then copilot will suggest something contrary to my expectation and that forces me to reconsider what I was about to write.
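A small Python sketch of the kind of off-by-one error meant here: both versions typecheck and run, and only the result differs. The function names and the sliding-window task are assumptions chosen for illustration.

```python
def window_sums_buggy(values, size):
    """Sum each sliding window of `size` items (off-by-one version)."""
    # Bug: the range stops one short, so the final window is never produced.
    return [sum(values[i:i + size]) for i in range(len(values) - size)]


def window_sums(values, size):
    """Sum each sliding window of `size` items."""
    # There are len(values) - size + 1 windows, hence the "+ 1".
    return [sum(values[i:i + size]) for i in range(len(values) - size + 1)]


print(window_sums_buggy([1, 2, 3, 4], 2))  # [3, 5]
print(window_sums([1, 2, 3, 4], 2))        # [3, 5, 7]
```

If the suggestion and what you were about to type disagree on that `+ 1`, the mismatch is the cue to stop and check which one is right.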
My favorite AI success story was a module of C# code that did text overlays on photos. It was written by hand. When we started encountering some issues with the graphics package it was using, I asked the AI (JetBrains in this case) to rewrite the module using a different package. Tedious but relatively straightforward, and the results were correct first try. That was what convinced me that (eventually) these types of tools will be great.
Meanwhile, I’m so sick of seeing “my apologies, I misunderstood your request…” after it generates suggestions with non-existent methods, bad logic, or missing parameters.
The thing is, I don't need to rewrite a function I've seen before. I'll use a library or framework that offers all of that anyway.
Wow. Does that mean that once the VC money dries up that will be the real cost of the products ($30-$90 a month)? That is much less accessible than $10.
I’ve used it constantly for a while now and assume all files in the project are basically training material to replicate what I’m doing. Annoying, but that’s what I get for not using local or a service from a privacy focused provider. Although I am using Claude in my scripts to distribute my footprints.
I’ve added my name to the wait list. This seems like a candidate for a company that intends to respect IP and that’s worth something.
Still, the value gets worse for these cloud models. All I have to do, if I'm using, say, VS Code, is install a different extension and configure a few settings, and Bob's your uncle, I'm using a different AI. What benefit exactly do companies get from blowing hundreds of dollars subsidising my usage again? Where's the moat? Where's the lock-in?
Even when the switching cost is lower than what you describe, people tend to keep using the product they had gotten used to.
Imagine this chip for JavaScript.
I'm really curious about this. I honestly don't see a case where I would use a coding assistant and everyone I have spoken to that does, hasn't been a strong coder (they would characterize themselves this way and I certainly agree with their assessment).
I'd love to hear from strong coders — people whom others normally go to when they have a problem with some code (whether debugging or designing something new) who now regularly use AI coding assistants. What do you find useful? In what way does it improve your workflow? Are there tasks where you explicitly avoid them? How often do you find bugs in the generated code (and a corollary, how frequently has a bug slipped in that was only caught later)?
For the most part it's not that I'm ceding the thinking to the machine; more often it suggests what I was going to type anyway, which if nothing else saves typing. It's especially helpful if I'm doing something repetitive.
Beyond that, it can save some cognitive load by auto-completing a block of code that wouldn't necessarily have been very difficult, but that I would've had to stop and think about. E.g. an API I'm not used to, or a nested loop or something.
The other big advantage that comes to mind is when I'm doing something I'm not familiar with, e.g. I recently started using Rust, and copilot has been a major help when I vaguely know what I _want_ to do but don't quite know how to get there on my own. I'm experienced enough to evaluate whether the suggested code does what I want. If there's anything in the output I don't understand, I can look it up.
> Are there tasks where you explicitly avoid them?
Not necessarily that I can think of, but after having copilot on for a little while it's gotten easier to tune it out when I'm not interested in its suggestions or when they're not helpful.
> How often do you find bugs in the generated code (and a corollary, how frequently has a bug slipped in that was only caught later)?
90% of the time I'm only accepting a single line of code at a time, so it's less a question of "is there a bug here" and more "does this do what I want or not?" Like, if I'm writing an email and gmail suggests an end to my sentence, I'm not worried about whether there's a bug in the sentence. If it's not what I want to say, I either don't take the suggestion, or I take the suggestion and modify it.
If I do accept larger chunks of suggested code, I review it thoroughly to the point where it's no longer "the AI's" code -- for all intents and purposes it's mine now. Like I said before, most of the time it's basically the code I was going to write anyway, I just got there faster.
At the current time it’s not that magic for me but more a small speed up with smarter auto complete.
In future iterations, when it knows your whole code base, everything you see on the screen, and your microservices and how they are connected, and can manipulate multiple files at the same time, that’s when it would become more interesting to me.
Sometimes it helps when I’m writing code in a domain I don’t know. It can pull in a library function I wasn’t aware of for example.
It isn’t always right and sometimes hallucinates, but usually static analysis notifies me when the library function doesn’t exist or the signature is wrong, and then I have to go back and do the work I was going to have to do anyways.
The key I think is that most software isn’t writing unique code. We might write little nuggets of unique code and then glue it together with a ton of boilerplate. And LLMs are great at boilerplate.
This combination has meant that I’ve done in about 3 days what I thought would take me 2 weeks. And I’ve enjoyed the shit out of it too.
It's a better auto complete. When I'm writing markup it has surprisingly good suggestions for labels/placeholder/whatever.
> Are there tasks where you explicitly avoid them?
I don't use it to fix bugs/errors. Occasionally I try and see what it comes up with. It has never once successfully fixed anything in my entire history of using it.
> How often do you find bugs in the generated code (and a corollary, how frequently has a bug slipped in that was only caught later)?
Since it's just typing what I'd expect to type myself, I probably have the same bug rate as before. I haven't seen it insert off-by-one errors (yet). That's probably the most likely one I can imagine missing.
There are only so many axes along which improvements can be made in this domain, aren't there? What are the bottlenecks that, if solved, will produce a true breakthrough, exactly?
Doesn't the current approach have an upper limit that's inherent in the whole architecture, nay, even the whole foundational theoretical aspect of it?
Would love to hear from anyone who has come across one AI coding assistant that's obviously head and shoulders above everything else. I've tried Copilot, CodeWhisperer, and Ghostwriter.
• It runs locally.
• It is extremely fast.
• It is not overly aggressive. It only shows up sometimes and when it does I nearly always accept its suggestion.
• It's a native part of the IDE so doesn't interfere with autocompletion.
I gotta say, JetBrains nailed it with this one. Single line mode is a much better way to think about AI driven autocomplete. That said, it's also obviously much more limited. It helps and is pleasant, but isn't revolutionary.
The other AI assistant I use sometimes is aider.chat, it's an open source tool. You type what you want and it generates git commits for the requested change. This is clearly the direction to go in long term and is much more revolutionary, but there are still problems with it.
> What are the bottlenecks that, if solved, will produce a true breakthrough, exactly?
There's a lot of "low hanging fruit" here (not that low), because the big AI labs aren't focusing on code gen AI right now. I can think of at least 4 or 5 paper-sized research directions to make improvements. The challenge is that right now the only models any good at coding are GPT-4/Opus level models, so doing anything with them is impossible unless you're working at a tiny number of labs. I don't work there so the "obvious" ideas I have aren't of much use. That leaves explorations of whether you can take a great open source model and boost its coding skills a lot. I think you probably can, especially if you have the resources to do a Mixtral or a Llama 3 yourself. Most smaller labs seem to be focussing on general performance optimizations rather than trying to reach GPT-4 level skills so the number of labs capable of doing these experiments should go up with time.
From talking to people there's also a general fear I think that there's not much point working on some sorts of ideas because what if OpenAI come out with GPT-5 and everyone gets taught another Bitter Lesson (http://www.incompleteideas.net/IncIdeas/BitterLesson.html)? Better to just wait things out and see where model capabilities stabilize. From all the noise about licensing deals it's starting to sound like we're data constrained even for the companies that are pushing the boundaries of what fair use means, so maybe we'll see appetite to do smarter things with coding models next year.
Maybe a new form of user interaction, more tailored to product managers or product engineers, will be the trend?
Anyway, I am building my own AI coding tool in that direction, with a new user experience focused on tasks instead of code: https://prompt.16x.engineer/
- tries to insert triple backquotes in my code for some weird reason
- does not do continuous code review, highlighting likely errors and typos
- does not edit or delete any code, only inserts it
I want AI to work along with me. I feel that Copilot has huge potential, but this limited UI keeps that potential untapped.
If it matters: I don't use the chat features of any of these plugins. It's just not muscle memory, I use Copilot as fancy auto-complete and I open ChatGPT for longer or more detailed or specific code snippets. I also use Claude 3 Opus but I'll probably drop it, I just like ChatGPT more, the UI and the results.
That said, I agree with many of your points as things I would also like to have.
I'm not sure if you write code, and I don't want to assume either direction, but if you don't and/or don't use these tools, let me share my experience.
I now have a "sense" for when Copilot will have a good suggestion and I pause while writing code waiting for it to spit something out. Copilot earned this "respect", for lack of a better word, because it has shown me time and time again that it "knows" what I want. With Cody, I would reach a point where I wanted to wait for a suggestion, I'd wait, and I'd wait, then my focus is broken because I'm wondering why I'm not getting a suggestion, I look in the bottom status bar, and Cody has some red symbol on it. When I did get something back from Cody, I just felt the results were equal to or worse than Copilot. I only paid for Cody to get Claude 3 Opus and that's all I tried. In addition to bugs that caused Cody to not work, I had multiple plugin crashes reported to me by my IDE. Here [0] is one of the bugs I saw, it was the last one I got before I uninstalled Cody and canceled my Pro subscription.
The last thing I'll say is that the purchase/activation process was not good. I signed up for an account, installed the plugin, logged in, realized if I wanted Claude 3 Opus I'd need a "Pro" account, bought a Pro account, went back to the plugin and had to log out and back in for it to realize I was a pro member. A "check membership status" button/option would be nice. I looked in the preferences and everything on the sidebar panel I could click on. I also tried opening/closing the panel and some other things I thought might trigger an account update call, no dice. It's whatever and I did eventually get it working (by logging out/in) but I wouldn't want that to be the user experience for someone who just gave me money so I thought I'd mention it.
I can even imagine setting up a CI/CD pipeline with regression testing, letting an AI try out ideas for a few days, and seeing what it comes up with. What we have now, glorified autocomplete, is some weak tea.
I’ve found it mostly helpful for saving time on obvious boilerplate code, but given the annoyances above plus the occasional inexplicable errors it introduces in said boilerplate code, I’ve just cancelled the entire thing.
Edit: Seems to be fixed now!
Care to clarify?
I just don’t see how it could think and write new code like a human developer would.
Any feedback from copilot users?
It’s the obvious application for their ridiculous speed. Quality of suggestions matters, sure, but waiting breaks your flow.
The writers of this article don't even bother doing any introspection into these claims or ideas. Is it important to know that programmers have been using AI for the last 50 years? I'm sure it isn't.
Elsewhere, there’s a torrent of coding assistant startups: Magic, Tabnine, Codegen, Refact, TabbyML, Sweep, Laredo and Cognition (which reportedly just raised $175 million), to name a few. Harness and JetBrains, which developed the Kotlin programming language, recently released their own. So did Sentry (albeit with more of a cybersecurity bent).
Can they all — plus Augment now — do business harmoniously together?"
Well, we don't know!
We do know however that there are a lot of AI coding assistants, and there will probably be many more in the years to come...
The excerpted text above makes for a reasonably good list of what's available as of the current date...
The other problem is they may have built a solution that was state of the art at the time but the SOTA has passed them by.
A lot of AI codegen/completion solutions were inspired by general transformer (BERT or GPT-2) approaches, without any kind of chat-tuned models... and then OpenAI dropped ChatGPT. So what was a revolutionary demo in 2022 is now just a couple of CoT prompts in the latest OpenAI API call.
I'll be surprised if this one is better.
Now he's trying to get into the business of reducing software developer headcount and professionalism, through copyright-theft-laundering LLMs.