Security starts with deep understanding.
Some standards and practices can help avoid some types of problems, and some are even rather effective (like airgapping your systems), but there isn't any way to assure security in general other than truly understanding what you are doing.
**
I feel like Copilot is the wrong direction in which to optimize development. This is mostly going to help people who already have a poor understanding of what they are doing create even more crap.
For a good developer, those low-level, low-engagement activities are not a problem (except maybe at the learning stage, where you actually want people engaged rather than copy/pasting). What it does not help with are the important parts of development -- defining the domain of your problem, designing good APIs and abstractions, understanding how everything works and fits together, understanding what your client needs, etc.
Also, I feel this is going to help increase complexity by making more copies of the same structures throughout the codebase.
My working theory is that this is going to hinder new developers even more than Google and Stack* already do. Every time you give new developers an easier way to copy/paste code without understanding it, you rob them of an opportunity to gain a deeper understanding of what they are doing, and in effect prevent them from learning and growing.
It is a little bit like giving your kids the answers to their homework without giving them a chance to arrive at the answers themselves or explaining anything about them.
**
Another way I feel this is going to hurt developers is by fueling competition over who can produce the greatest volume of code.
I have already noticed this trend where developers (especially more junior ones aspiring to advance) try to outcompete others by producing more code, closing more tickets, etc. Right now that means skipping understanding what is going on in favor of getting easy answers from the Internet.
These guys can produce huge amounts of code with relatively little actual engagement.
To management (especially management with the wrong incentives) this seems like the perfect worker, because management usually doesn't understand the connection between a lack of engagement and planning at design/development time and the problems that come later (or they don't feel they are the ones who will pay the price).
Copilot is probably going to make it even more difficult for people who want to do things the right way, because the difference in false productivity measurements will be even starker.
I've seen so much boilerplate in the Java or classic .NET Framework world, it's incredible. So many layers of DTOs, Request/Response Models, and so on, that could simply be generated -- or, most of the time, removed completely (that would cost some "architects" their job, though).
This is also true for a lot of Redux or Angular/NgRx applications. So much boilerplate that you can't find the relevant code anymore.
Java is not the culprit here.
I think it is something that happened along the way, something to do with J2EE and the patterns craze we had a decade or two ago.
It doesn't help that frameworks like Spring and their documentation go out of their way to propagate these boilerplate-heavy patterns.
Copying these lazy patterns is the shortest, easiest way to get to a working solution for a person who doesn't want to put in any extra effort. And you can't get punished for doing it. Most developers don't even know there are any possibilities other than the mandatory controller calling a service calling a database layer, plus hordes of DTOs some people call a "model".
The evil is that someone trained an AI on random text, not even on an AST, so it's garbage in -- no surprise you get garbage out.
A true AI would understand that "the dev wants to find all lines of text in a file that have this property"; this AI just does "this code string is similar to this other code string according to this `black box metric`".
I see more and more juniors pasting code or shell commands from StackOverflow with careless ease, without even pretending anymore that they're interested in how it actually works.
A store in Vue 3 can basically be:
export default { state: readonly(state), ...setterFunctions }
It doesn't get easier to read and more streamlined than that. I still doubt that that's a result of DTOs.
I wonder if the way we are approaching it is wrong. We are basically putting text through a deep learning black box. The model might have learned some abstractions, but all in all it is just playing word games and trying to guess the most likely continuation of a string. Maybe we should go in the other direction and base such an AI on a really massive ontology. Instead of unstructured strings, put highly structured facts into the model.
For example, just like in Copilot you'd start with:
def login_user(username, password):
But the ontology would also know things like:
- This is a web application and this function is going to be called after submitting a form
- Security specialist Bob says you should always hash your passwords
- Specialist Anne says you should use bcrypt
- Tom says Anne is 95% trustworthy
... and thousands of facts more. And then it would take them all into consideration, build a representation of the problem you are trying to solve, find a strategy, and only in the end generate code.
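For concreteness, the kind of completion those facts point toward might look something like this sketch. It is purely illustrative: bcrypt (as "Anne" recommends) is a third-party package, so this version substitutes the standard library's PBKDF2, and the `USERS` store, `register_user`, and `_hash_password` are invented names.

```python
import hashlib
import hmac
import os

USERS = {}  # username -> (salt, derived key); in-memory stand-in for a real user store

def _hash_password(password, salt):
    # Derive a key from the password; a real app would prefer bcrypt/argon2.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def register_user(username, password):
    salt = os.urandom(16)
    USERS[username] = (salt, _hash_password(password, salt))

def login_user(username, password):
    record = USERS.get(username)
    if record is None:
        return False
    salt, key = record
    # Compare in constant time to avoid timing side channels.
    return hmac.compare_digest(key, _hash_password(password, salt))
```

The point is that the facts ("hash your passwords", "called after a form submit") shape the generated code, rather than string similarity alone.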
I have a feeling that there was a qualitative leap going from simple neural networks and multivariate methods to "deep learning" and modern machine learning, and that this was mainly driven by scale and available computing power. Now what if we try the same thing for ontologies, expert systems, and triple store databases? I think the difference would be between an AI parroting what it read on Wikipedia (direct speech), and a smarter AI being able to reason about what it read on Wikipedia (indirect speech).
There are already services that do this for you, and I actually find them useful. For example, I might be trying to use a function from some library and it fails. If I get pointed to some public repositories that use the same library in a function for a similar purpose, I may learn that I am missing some critical setup. I can also browse different uses of this function/library and see how it is, at the very least, used successfully by others.
https://en.wikipedia.org/wiki/Cyc
Supposedly an attempt to assemble a database of "common sense" facts and reasoning.
It has always been controversial and it's not clear what kind of success it's had.
https://en.wikipedia.org/wiki/Neats_and_scruffies
From the "Scruffy" side, there's Charles Rich's classic work on "Programmer's Apprentice".
https://dspace.mit.edu/handle/1721.1/6054
https://dspace.mit.edu/bitstream/handle/1721.1/6054/AIM-1004...
>The Programmer's Apprentice Project: A Research Overview
>MIT AI Lab Memo No. 1004, November 1987.
>Rich, Charles; Waters, Richard C.
>Abstract: The goal of the Programmer's Apprentice project is to develop a theory of how expert programmers analyze, synthesize, modify, explain, specify, verify, and document programs. This research goal overlaps both artificial intelligence and software engineering. From the viewpoint of artificial intelligence, we have chosen programming as a domain in which to study fundamental issues of knowledge representation and reasoning. From the viewpoint of software engineering, we seek to automate the programming process by applying techniques from artificial intelligence.
https://dspace.mit.edu/handle/1721.1/41967
https://dspace.mit.edu/bitstream/handle/1721.1/41967/AI_WP_1...
>Plan Recognition in a Programmer's Apprentice. Ph.D. Thesis proposal.
>MIT AI Lab Working Paper 147, May 1977.
>Rich, Charles
>Abstract: Brief Statement of the Problem: Stated most generally, the proposed research is concerned with understanding and representing the teleological structure of engineered devices. More specifically, I propose to study the teleological structure of computer programs written in LISP which perform a wide range of non-numerical computations. The major theoretical goal of the research is to further develop a formal representation for teleological structure, called plans, which will facilitate both the abstract description of particular programs, and the compilation of a library of programming expertise in the domain of non-numerical computation. Adequacy of the theory will be demonstrated by implementing a system (to eventually become part of a LISP Programmer's Apprentice) which will be able to recognize various plans in LISP programs written by human programmers and thereby generate cogent explanations of how the programs work, including the detection of some programming errors.
Copilot doesn't bypass peer review, code review, unit testing, and so on and so forth.
Amateur vs professional and novice vs expert are completely separate things.
You can be a professional novice just as you can be an expert amateur.
Now, the answer to your question is an obvious "NO". To be an expert you have to be a novice first.
The problem rather is "Are you making progress towards being an expert or are you just learning to more efficiently execute your novice workflow?"
> The way I see it, that happens because programming is still way more complex than it should be - and copilot will help with that.
No, it is just an illusion of help.
Just as your son may thank you for helping when you give him the answer to his homework. From his point of view you have helped him, true, but from another point of view the point of the task wasn't to deliver an answer to the teacher; it was to imprint something valuable on the mind of the child.
I love that software is an accessible discipline to hobbyists and that it empowers people. But it needs to be a discipline, top to bottom. We need deep understanding with security and robustness as fundamentals, good practices, and all of that baked into our tools.
Another parallel: language learning. You learn more by speaking and writing than by merely reading and listening, because the former actually requires you to actively associate grammar rules with your physical actions, whereas consumption has a lower bar of effort since you can infer things from context, gloss over things, etc.
Sometimes you don't need an expert to produce highly secure, highly optimized code.
Have you seen the crap that people buy at Walmart? The furniture is not heirloom furniture, the food is not a 3-star artisanal experience. Have you bought tools at Harbor Freight? They're not the lifetime companion of a tradesman, kept in wood boxes and wrapped in cosmoline after each use. But an awful lot of work gets done with them. Common homeowner wisdom is: if you need a tool, buy it at Harbor Freight; if you use it enough to wear it out, spend 10x to buy a really good one. But most tools you'll only use once or twice.
At workplaces across the country, right this minute, there are human beings doing rote transcription from one application to another, copy-pasting if they're lucky. That's a waste of effort and intellectual potential, and a hodgepodge of Excel equations or a crappy bit of Copilot glue code could be just the ticket. Yes, if those become the business' secret sauce and are sold to customers on the Internet, they ought to put some effort into doing it properly, but there's a ton of work that could be accomplished with low-quality code.
The difference is that your sofa isn't programmable and networked into every other appliance in your house, underpinned by a general-purpose computer ripe for abuse.
Virtually every piece of software you install is an access point to your machine or your sensitive data. One isolated thing in the analog world breaks down: not a problem. One misconfigured password in a VPN client, and whoops, part of your national oil infrastructure goes offline.
https://www.reuters.com/business/colonial-pipeline-ceo-tells...
This is one for the ages.
I can't wait until we start seeing Copilot-native devs, who had it enabled from the moment they first opened VSCode at their "become an engineer in 3 months" bootcamp.
> To management (especially management with the wrong incentives) this seems like the perfect worker, because management usually doesn't understand the connection between a lack of engagement and planning at design/development time and the problems that come later (or they don't feel they are the ones who will pay the price).
That's something I really want my competitors to do. Honestly it makes finding stocks to short much easier (or poaching talent...)
This is how management is in most places, I feel, especially when it comes to evaluating junior and early-senior engineers.
> This is mostly going to help people who already have a poor understanding of what they are doing create even more crap.
I can see how people who haven't used it at length might come to that conclusion, but my experience with it calls the "mostly" part into question. I'm sure there will be cases of that. But as someone who deeply understands my craft, I'm finding significant benefits.
> What it does not help with are the important parts of development -- defining the domain of your problem, designing good APIs and abstractions, understanding how everything works and fits together, understanding what your client needs, etc.
Quite the contrary! The last time a new tool helped me with those parts as much was when I moved from C++ to Python in 1997. What I experienced in my C++ -> Python transition was that an enormous chunk of my brainpower could shift from language gymnastics to the problem domain. Copilot gives me a similar feeling. It frequently suggests exactly the 1-3 lines of code I was about to type and saves me 30-60 seconds (easily 20 minutes in a full day of coding). Much better than that, it lets my focus stay on better abstractions, APIs, etc.
> Also, I feel this is going to help increase complexity by making more copies of the same structures throughout the codebase.
We, as engineers, are still responsible for what we produce. Any tool needs to be used with critical thought. Of course there will be those who don't think enough. And it might even make them look better in the short term. But that will be exposed in the medium to long term - `git blame` will point to them as the authors of problematic code and not Copilot. When such problems arise (or even better, before they arise), some of us who are more experienced need to step up and mentor less experienced folks so that they develop good habits.
A small sample of areas it's helping me...
When I decide that I want to use different representations internally and externally for some data in a class, I initialize the internal member variables. Part way through typing Python's `@property` decorator, it's suggesting the name of the property and exactly how to use the member variables to generate the external representation I want. Over half the time, it's exactly what I was about to type. Maybe a quarter of the time it's not and I just don't accept the suggestion (or do a quick edit). And 5-10% of the time it suggests an approach that is better than what I was thinking. And that's in a very simple use case.
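To make that concrete, here's an invented illustration of the shape of that pattern -- the class and the cents/dollars representations are made up, not from my actual code:

```python
class Price:
    def __init__(self, cents):
        self._cents = cents  # internal representation: integer cents

    @property
    def dollars(self):
        # External representation, derived from the internal member variable.
        # This body is the kind of thing Copilot suggests mid-decorator.
        return self._cents / 100
```

The suggestion arrives as soon as the property name makes the intent clear.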
In other scenarios, it often sets up my loops just as I want them. Sometimes it picks column major when I want row major. I just keep typing and as soon as it's clear I want row major, it's suggesting that. Again, occasionally it surprises me with something better - if I just use that one function I rarely have a need for, the inner loop melts away. Why didn't I think of that? Well, now "I" did. The code I'm producing with Copilot is better than the code I would have written without because I'm thinking as I use it.
Where it really saves me time / focus is when I have some tricky calculation or API call that isn't hard, but there's a bunch of little details to get right. One I did yesterday... lookup a value in a dict, but the key needs to be mapped through another dict. Between the original key, the two dicts, and the variable receiving the result there are four variable names, plus one more for the mapped key (to spread it across two statements for readability). Before typing anything, I paused for a second to get the names straight in my head. Before I finished my thought, it suggested the lines, I looked at it for a second to make sure it was right, laughed because it was, and hit tab. It wasn't a hard task, but it helped me stay focused on the bigger picture.
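With hypothetical names standing in for the real ones, the shape of that two-statement lookup is roughly:

```python
color_codes = {"crimson": "#dc143c", "navy": "#000080"}  # target dict
key_mapping = {"red": "crimson", "blue": "navy"}         # key-translation dict

requested = "red"
mapped_key = key_mapping[requested]   # statement 1: translate the original key
hex_value = color_codes[mapped_key]   # statement 2: look up the mapped key
```

Not hard, just four or five names to keep straight, which is exactly where the suggestion saves focus.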
Most of the time this doesn't feel at all like boilerplate. It's picking up my variable names and properly using the data structures I set up in other parts of the code. There's a big misconception that it's just pasting snippets in. It feels very different from that in real usage. Also, it rewards good naming habits. In the example above, how did it know I wanted to map the key through that dict? `key_mapping` was in the variable name. Easier for others to read later and for Copilot to read now.
The system I'm building is definitely better designed because of Copilot. Not because Copilot did any of the design, but because it freed me up to focus on the design more. It will have downsides, but in experienced hands it can be a great tool. I'm not affiliated with Microsoft / Github / OpenAI in any way. I'm just doing better work because I'm using it and doing better work makes me feel good. When the time comes, I'll pay for Copilot out of my own pocket if my company doesn't pay for it.
It's a tragedy of the commons of a sort.
The tragedy, IMHO, is that AI models like this encourage centralizing decision making into a single black box (to the extent that external research then benefits the owner of the AI model rather than advancing public commons), whereas in pretty much every other aspect of life, we consider decentralization/redundancy of autonomy to be the solution to robustness problems.
I wish there were a “robots.txt” file for Git to disallow certain bots from training on anything I have written.
It’s simple. If you are concerned by this, don’t host your repositories on GitHub.
But as long as you give the public access to your code, they can study it and learn from it. Humans and machines.
If Copilot were to reproduce a larger part of, say, an MIT-licensed codebase or almost any other permissive licence, then they should legally provide attribution. I'm pretty sure that they don't even have an option to provide such specific attribution, which means that either they believe that the code copied from any one source is below the relevant threshold or they're just ignoring copyright.
Although judging from the results of this test it kind of seems like for a lot of accounts that's already happened.
If it’s just helping you crank out the same bad code more quickly, without learning anything in the process, that’s useful to know. Some people might still want a tool like that, I wouldn’t.
Like, if your average dev will produce insecure code in 80% of samples, then Copilot starts to look really good! But if it's closer to 0.01% of code samples, then Copilot looks more like an intriguing novelty, not to be brought too near serious work. Much like Dippin' Dots in this regard.
Copilot shouldn't be able to generate code destined for prod without review any more than should any line of code written by a human.
Yeah, how did they measure? Did static and dynamic analysis find design bugs too?
Maybe - as part of a Copilot-assisted DevSecOps workflow with static and dynamic analysis run by GitHub Actions CI - create Issues with CWE "Common Weakness Enumeration" URLs (e.g. from the CWE Top 25) to train the team, and Pull Requests to fix each issue? https://cwe.mitre.org/top25/
Which bots send PRs?
The only time it really helped was when I needed to create a named list of char codes.
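For context, a guess at what such a named list might look like -- the names and entries here are invented, not my actual code:

```python
from collections import namedtuple

# A named list of char codes: each entry pairs a readable name with its code.
NamedCode = namedtuple("NamedCode", ["name", "code"])
CONTROL_CHARS = [
    NamedCode("NUL", 0x00),
    NamedCode("TAB", 0x09),
    NamedCode("LF", 0x0A),
    NamedCode("CR", 0x0D),
]
```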
When it comes to more complex code, checking Copilot's output takes the same time as writing it myself. 90% of the time I needed to correct Copilot.
For me, tools like linters are way more helpful. If I could use only ESLint or Copilot, I would go with ESLint 100% of the time.
Whether that is better or not, I suppose, it depends.
Copilot only helps with boilerplate code, which could be handled by good IntelliSense.
When it tries to generate a function from the function name, it fails so hard that it is more in your way than helpful.
So far GitHub Copilot is more feasible as a tool for humans doing code coverage of its input code, "given enough eyeballs, all bugs are shallow" style. A developer could go, "huh, Copilot generated insecure code, better report it to the original project it learned it from" - if only Copilot were able to link to the original project, it would all be great and useful.
1. How many times do people write insecure code when not using Copilot?
2. How many times do people write insecure code when using Copilot?
In any case, if Copilot can generate code as well as the average programmer without supervision, that means it can already take the job of 50% of programmers. A more useful metric though is how many programmers can a person using Copilot replace by having greater productivity?
Also, in how many programming jobs does security matter? In my job for example it doesn't matter at all.
I'd still not use it. But it's an impressive trick.
Nothing more. Nothing less.
Jesus Christ, please make them stop. Stop using AI as a buzzword.
Either you call both AI or you call neither AI.
(A previous version of the comment stated that it was tuned from GPT-3. This is incorrect; the simpler GPT was used for faster convergence.)
If you picked any smaller company with a dev team, a freelancer, or an agency, your chances of finding a developer who understands and upholds quality code would be vastly reduced.
Not to mention a lot of beginners will just push their practice projects to GitHub and never look at it again. I'm also guilty of this, but I never realized Microsoft was training AI with this code. If Copilot is learning from these projects then I'd say the code it regurgitates is not average, but even below average.
It’s interacting with GCS to scan a bucket for an extension, load the data with pandas, and concat some dataframes. It’s something dumb but mildly finicky that’s going to eat up so much time I could be using for higher value work.
Copilot would be very welcome as I do this, instead of annoyingly going off to Google 3 different python libraries and getting it all to work nicely together.
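For concreteness, a rough sketch of that glue code. Real GCS access needs credentials and a client library (e.g. google-cloud-storage or gcsfs), so the bucket listing here is faked with an in-memory dict; all names and sample data are invented.

```python
import io

import pandas as pd

def filter_by_extension(names, extension=".csv"):
    """The 'scan a bucket for an extension' step, as a pure helper."""
    return [n for n in names if n.endswith(extension)]

# Stand-in for blobs found in the bucket; a real version would list them
# via the GCS client instead.
fake_bucket = {
    "jan.csv": "id,value\n1,10\n2,20\n",
    "feb.csv": "id,value\n3,30\n",
    "readme.txt": "not data",
}

names = filter_by_extension(sorted(fake_bucket))
frames = [pd.read_csv(io.StringIO(fake_bucket[n])) for n in names]
combined = pd.concat(frames, ignore_index=True)
```

Exactly the kind of dumb-but-finicky plumbing where a suggestion that's 90% right already saves the Googling.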
I'm guessing the ranking features are based on the repo stats, contributor stats, etc. Even "good" contributors will make rookie mistakes in certain areas.
Interesting to imagine how GH will try to solve this issue.
https://arxiv.org/abs/2108.09293
previous discussion including comments from lead author:
You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion
https://deepai.org/publication/you-autocomplete-me-poisoning...
https://edition.cnn.com/2020/09/27/tech/elon-musk-tesla-bill...
It's so transformative that people may allow it to circumvent licenses.