First, Copilot supports a lot more languages (a big part of the utility of such tools is that they get you writing code in an unfamiliar language much more quickly).
Second, it fails more often with incorrect suggestions, and on non-trivial things it often tends to go line by line.
Today, we’re excited to announce the general availability of Amazon CodeWhisperer for Python, Java, JavaScript, TypeScript, and C#—plus ten new languages, including Go, Kotlin, Rust, PHP, and SQL. CodeWhisperer can be accessed from IDEs such as VS Code, IntelliJ IDEA, AWS Cloud9, and many more via the AWS Toolkit IDE extensions.
It was so unexpected for me that I had to pause for a second to process what happened.
Definitely disappointing compared to ChatGPT-based code creation. I love describing what I want very briefly and getting a nice block of code to start tweaking.
I wish there was an easy way to benchmark these tools and revisit them when they pass a threshold of competence.
That's not necessarily the case; it can generate whole functions and even multiple functions.
Today I made a class called "DynamoUtils" and it suggested 2 full methods.
With Copilot being embedded in all Office software in the near future, MS may as well make GitHub Copilot free. Interesting times!
While CodeWhisperer offers a free tier, which may help individuals or pressure Copilot to lower personal-account prices, AWS hasn't priced this very competitively for enterprise while their tool still performs worse.
Do these language-model bots help a little? Sure! But my worry about being replaced is currently sitting at about 3 out of 100. I expect to still have a job right up until we have AGIs, and quite probably for long after, as not everyone will be able to afford them. That is assuming we have any meaningful control over them.
For years we've been saying "computer time is cheaper than developer time."
Well, that's about to come back to bite us, in a big way.
But the real world violates this all the time. You want to buy a car. Some company you've never heard of in China makes the chips that detect whether or not your windshield wiper fluid reservoir has fluid. A shipment to the car manufacturer is ready to go out. But, there are no shipping containers. Until the windshield wiper sensor chips arrive, the car factory can't make any cars, and doesn't have room to unpack the shipping containers with unneeded parts that are piled up outside. So there is no container that can go back to China to bring the chips to the factory. While all that is worked out, SV venture capitalists print some money to give to a used car startup, making it super easy to get the best price on your used car. With no new cars available and flashy discounts to get the market kickstarted, the used car market shoots up, meaning that even though you want a new $60,000 electric car, all you can do is buy a used 1988 Yugo for $150,000. You walk to work, even though you have the money for the car you want.
If it's software, this is what we call a pageable event and the postmortem whines about "separation of concerns". But in the real world... well, we don't have those. We LOVE thinking we do, but when shit blows up, it's clear that we don't. So are we really surprised that software works the same way? It's how the Universe works, not bad architecture. The Universe has terrible architecture. Adjust some of those constants and try again!
and then you can just evaluate expressions within the function. The fancy way with editor support is: https://github.com/vvvvalvalval/scope-capture-nrepl
you make snapshots of the local variables at any point, and later evaluate code in the context of that snapshot. So you do some action in your program that results in that function being called, it'll save the input, you select that snapshot, and now you evaluate in the context of those function arguments as you edit and eval expressions in the function. And while clojure supports interactive development at a level beyond other mainstream languages, Smalltalk and Common Lisp have support for it on another level, for example: https://malisper.me/category/debugging-common-lisp/
There's some study where Smalltalk came out as the most productive language. I don't know whether it's really more productive, but that kind of interactive development, where you build up your program evaluating it the whole time without ever restarting, is a lot of fun. Why it went out of style, I don't know.
But if you're willing to do without the "as you edit" requirement, then what you're left with is a plain old breakpoint debugger. Certainly, there are many IDEs that have those built in.
> /*Create a lambda function that stores the body of the SQS message into a hash key of a DynamoDB table.
Now, obviously that is not valid Java syntax and javac will fail on that, but could/would it be possible to just build an intermediate tool that'll expand this into Java (or whatever other language) so that you don't need to even see the expanded code in your editor, like the same way you don't need to see bytecode?
I get that practically, right now, that would be ill-advised since the AI may not be reliable enough and there are probably more cases than not where you need to tweak or add some logic specific to your domain, etc. But still, theoretically is that where we are heading, i.e. a world in which even what are now considered high level langs get shoved down further below and are considered internal/low level details?
One step before this (AI as a pre processor that generates source code which is then validated by tests and committed without even review) I think is possible.
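A toy sketch of that test-gated loop in Python, where `exec()` stands in for a real compile-and-sandbox step and the checks play the role of the validating test suite (all names here are hypothetical):

```python
def accept_candidate(source: str, checks) -> bool:
    """Keep LLM-generated source only if it compiles and every check passes.

    `checks` is a list of (function_name, args, expected_result) tuples.
    A real tool would sandbox execution; exec() here is only a sketch.
    """
    namespace = {}
    try:
        exec(source, namespace)
        return all(namespace[name](*args) == expected
                   for name, args, expected in checks)
    except Exception:
        return False

good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
assert accept_candidate(good, [("add", (1, 2), 3)])     # would be committed
assert not accept_candidate(bad, [("add", (1, 2), 3)])  # rejected by the gate
```

The hard part, naturally, is that the tests themselves would come from the same unreliable generator, which is where the "intense supervision" below comes in.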
Cutting-edge LLM apps utilize multiple LLMs to perform validation, task decomposition, etc. It's not a stretch that a future application could take your pseudocode/spec, maybe ask you some clarifying questions, generate a bunch of code and test cases, and maybe even launch a beta stage and prompt you to validate it.
As others have mentioned, LLMs are nondeterministic and can do the wrong thing on a given run. This is in contrast to a traditional program, which is either buggy or bug-free. OTOH another LLM can be trained to validate, and to debug.
There’s a lot of work to do before LLM apps are considered reliable enough to do their job without intense supervision.
COBOL was designed for normal business people to use, remember?
You'll just have to program in an AI-understandable language; I'm sure there are going to be lots of quirks and tricks, similar to the languages of today.
These systems are non-deterministic by nature, so I doubt it unless something fundamentally changes. Moreover, you'd have to be so specific to capture the business logic that you're basically writing code in a high-level dynamic language anyway.
Yes.
But... it'll expand it based on the probability of what you want looking like other things it's been trained on. If you want the obvious use case then it'll be magical. Just describe the code and it'll work. But as soon as you want anything slightly less than typical you'll need to start 'prompt engineering' to refine in greater and greater detail, possibly until you've actually put in more effort than it'd take to just write the code.
For anything that's even further outside of the training data it won't work but it might look like it does. In the short term that's going to trip a lot of people up.
The worst part will be when non-developers start to use it though. "Make me a web form that takes a name, email address, and ZIP code and saves them to Airtable" will probably work eventually ... but with no validation, no error handling, no security, no styling, no cross-browser testing... because the author didn't know to ask for those things in their prompt. AI derived apps are going to suck.
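Even just the "validation" that hypothetical author never asked for is real work. A minimal sketch of what the prompt silently omitted (the regexes here are deliberately simplistic, and real email validation is far messier):

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # crude, illustrative only
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")              # US ZIP or ZIP+4

def validate_form(name: str, email: str, zip_code: str):
    """Return a list of problems; an empty list means the input is acceptable."""
    errors = []
    if not name.strip():
        errors.append("name is required")
    if not EMAIL_RE.match(email):
        errors.append("email looks invalid")
    if not ZIP_RE.match(zip_code):
        errors.append("ZIP code must be 5 digits, optionally ZIP+4")
    return errors
```

And that's before error handling, rate limiting, escaping, or any of the other things the prompt author didn't know to ask for.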
It's funny, I was actually just pondering what to do with it when I opened HN and came across your comment. I was thinking of improving it some more and then selling it for a low-ish price. One thing that'd really help though, is a more widely accessible GPT-4 API.
"To help you code responsibly, CodeWhisperer filters out code suggestions that might be considered biased or unfair, and it’s the only coding companion that can filter or flag code suggestions that may resemble particular open-source training data."
It would be interesting if AWS actually does their attribution - how do they know which open source code was published in any public repo?
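AWS doesn't say how they do it. One crude approach (purely my speculation, not AWS's actual method) would be fingerprinting suggestions against an index of token n-grams built from public repos:

```python
def shingles(code: str, n: int = 6):
    """Token n-grams ("shingles") of a code snippet."""
    toks = code.split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def resemblance(a: str, b: str) -> float:
    """Jaccard similarity over token n-grams: one crude way to flag
    suggestions that resemble known open-source code."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

Scaling that to every public repo is its own problem (MinHash/LSH territory), and it still says nothing about which license the matched code carries.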
// Send string via mqtt
// use async_std::task;
// use async_std::prelude::*;
// use async_std::net::TcpStream;
// use async_std::io::prelude::*;
// use async_std::io;
// use async_std::sync::Mutex;
// use async_std::sync::Arc;
I also haven’t used these tools at all so if CodeWhisperer is a little “dumber” than copilot, I doubt I will even notice.
I was just thinking about this before reading the announcement. Part of our work is in aerospace; hardware and software being a part of that. All of it goes through layers-upon-layers of design, testing, verification and qualification for flight.
In my mind I saw this scenario where something happens and it ends-up in the courts. And then, in the process of ripping the code apart during the lawsuit, we come to a comment that changes it all. Something like this:
// Used Amazon CodeWhisperer to generate the framework of this state machine.
// Modified as needed. See comments.
That's when the courtroom goes quiet and one side thinks "Oh, shit!". What does the jury think?
They are not experts. All they heard is you just used AI to write part of the code for this device that may have been responsible for a horrific accident. Are their minds, at that point, primed for the prosecution to grab onto that and build it up to such a level that the jury becomes convinced a guilty verdict is warranted?
Don't know.
Does this mean we have to be very careful about using these tools, even if the code works? Does this mean we have to ban the use of these tools out of concerns for legal liability?
Personal example:
A year or so ago I wrote a CRC calculation program in ARM assembler. It could calculate anything from CRC-8 to CRC-32. This was needed because we were dealing with critical high speed communications and there was a finite real-time window to compute the CRC checksum. The code was optimized using every trick in the books, from decades of doing such work. Fast, accurate, did exactly what it was supposed to do. In production. Working just fine.
I was curious. A couple of weeks ago I asked ChatGPT to write a CRC-32 calculation routine given some constraints (buffer size, polynomial, etc.). It took a few seconds for it to generate the code. I ran it through some tests. It seemed to work just fine.
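For context, the bitwise form of CRC-32 with the standard reflected IEEE polynomial is only a few lines in a high-level language. This is a sketch of the kind of routine in question, not the actual generated code, and nothing like the hand-optimized assembler version:

```python
def crc32(data: bytes, poly: int = 0xEDB88320) -> int:
    """Bitwise CRC-32 (reflected IEEE polynomial), processed one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; XOR in the polynomial when the low bit was set.
            crc = (crc >> 1) ^ (poly if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

The standard check value (`crc32(b"123456789") == 0xCBF43926`) makes this easy to verify, which is exactly what makes it a best-case task for an LLM: tiny, well-specified, and massively represented in training data.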
That's when the question first occurred to me: Would it expose us to liability if that code were to be used in our system? I don't know. I have a feeling it would be unwise to use any of it at all.
Wouldn't it be funny, interesting and perhaps even tragic if we had to have "100% organically-coded" disclaimers on our work in the future?
They've already alienated most grocery and other online retailers from AWS. It wouldn't make sense for them to do this to others.
Amazon doesn't have a lack-of-IP problem; like all large companies right now, they can't turn ideas into products.
The goal is A) to get me to use it enough in personal projects that I convince my manager to pay for a business license, and B) to encourage me to use more AWS API stuff (which CodeWhisperer is fine-tuned on), which is where AWS makes the bulk of their revenue.
I have no qualms with either motivation.