Their own engineers would get productivity boosts: with Copilot already being familiar with their data structures, code style, etc., accuracy would get a big boost.
But also, third-party code would end up being more similar. The code style of the whole world would be pushed towards 'Microsoft style', which probably makes hiring easier, means less training time for engineers, etc.
And the downside, that outsiders might learn tiny nuggets of info about Microsoft sources, is probably irrelevant when outsiders can already decompile binaries and learn far more.
Most, if not all, Microsoft products already have their sources available for viewing if you are one of those VIP development partners. Microsoft doesn't really have any secret source (pardon the pun) whose leaking would undo their value proposition.
In fact, if Microsoft opened up their systems a bit more, they might even gain some PR or mindshare, with no effect on, if not an increase to, their bottom line.
And if Microsoft's code ends up influencing the rest of the world's code, that would be a... big downside.
Yes, that's exactly what the world needs, more software like Teams.
The style applied by Copilot comes from your surrounding code context, not from the LLM. The base model, trained on all public repos on GitHub, already knows everything about data structures, etc., in the languages that were scanned.
Nothing new would be gained by scanning MS's own repositories and nothing would be leaked or color the output in actual use.
- It does
- The user didn't turn off the filters that prevent this
- The user didn't intentionally make it do it
- This use is found to be illegal
There's a difference between code that needs to be kept private from bad actors (from their point of view at least) and code that is public but with restrictions on its use that anyone who gets it should be aware of. This is like saying "if you truly believe that license agreements are legally binding, then publish your users' passwords publicly with a license saying no one can use them".
This being the real hurdle. With Microsoft money behind the defense, only megacorps can win.
Both sides are worried about IP leaking: one is worried about their own IP leaking, and the other about liability if they inadvertently implement any leaked IP. Either way, the concern is leaked IP.
This hasn't been tested in court.
This blog post refers to the broader ecosystem of Microsoft Copilot solutions. Most of those tools rely on the Azure OpenAI API service on the backend and are not specifically tailored for code generation.
An LLM copilot doesn't really understand the context of the project; it just goes for similar text.
So if you train on big projects, you're picking up only their patterns. When a Copilot user asks for a string-concatenation 'tip', you want the LLM to output a general answer, not something tied to a specific project. A big project is likely to use an abstraction over strings, where base-library usage is shrunk down to a few lines of code behind that abstraction. In that case you'd want the LLM to draw on a few "simpler" projects that use base-library strings abundantly, so it has a decent amount of text for the most likely correct match against the user's input.
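To illustrate the contrast (the helper class here is hypothetical, not from any real project): a small project calls the base library directly, while a large project often hides the same operation behind its own abstraction, leaving far less base-library text for a model to learn from.

```python
# Small-project style: base-library string handling, used directly.
greeting = " ".join(["Hello", "world"])


# Big-project style: the same operation hidden behind a project-specific
# abstraction (hypothetical helper, for illustration only).
class TextBuilder:
    def __init__(self):
        self._parts = []

    def add(self, part):
        self._parts.append(part)
        return self  # allow chaining

    def build(self, sep=" "):
        return sep.join(self._parts)


greeting_2 = TextBuilder().add("Hello").add("world").build()
```

A model trained mostly on the second style would learn the project's `TextBuilder` idiom rather than the general `str.join` answer a typical user actually wants.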
I do believe Microsoft has all the code needed for good training; it's not only about Azure, Windows, and Office. There is tons more, and it's open source already.
We can already take a guess at what many internal functions look like from the published symbol tables of every function across all major Microsoft products. Simply ask Copilot to write those functions and see if the code comes out better than for a similar set of made-up yet plausible function names.
It probably would not be a very desirable product in the end.
Google Books literally copied and pasted books into its online database, and that was deemed fair use, so something much more transformative like generative AI will likely fall under much broader consideration for fair use. Google Books was, yes, non-commercial, but courts generally hold that the more transformative something is, the less it needs to adhere to the other guidelines for determining fair use.
Everybody seems to be saying this, but I really don't think there's even 50% chance of it happening.
Google Books was fair use because it was a public benefit and did not take away from publishers or authors; to the contrary, it helped people find their works.
Compare generative AI which extracts the essence of people's works and recreates similar works (in terms of style, etc) while cutting out the original authors completely. This potentially denies them the fruits of their labor. It's notable that it's a purely mechanical process and no human creativity is involved, except that which is extracted from other authors. Mere prompts don't count.
The argument you're suggesting will hold is essentially "yes we're using copyrighted works, but we're doing it at scale and blending it, so that's ok".
Only if you ask it to. At which point the person asking is at the very least culpable as well of violating someone's IP.
It is also illegal for me to pay someone to write Mickey Mouse fan fiction (though if I don't publish it, this gets more murky).
> The argument you're suggesting will hold is essentially "yes we're using copyrighted works, but we're doing it at scale and blending it, so that's ok".
I want to flip this on its head: the argument you are suggesting is essentially "LLMs should be illegal because they can be asked to break copyright at scale!" It isn't illegal to be an author for hire, even though someone could potentially ask you to write fan fiction for their personal collection in the style of Tolkien, but because an LLM can do it at scale, it is illegal?
There’s no law against “using” copyrighted works; there is a law against copying and distributing them.
Fair use analysis doesn’t come into play unless we’re dealing with clearly established copyright infringement. What LLMs do doesn’t clearly qualify as any of the behaviors reserved to copyright owners. For example, it certainly doesn’t “copy” the things it’s trained on by any legal definition.
Law works on precedent and analogy when there’s no clearly on-point statutes or case law. The most analogous situation to what transformer models do is a person learning from experience and creating their own work _influenced_ by what they’ve observed. That behavior is not copyright infringement by any stretch of the imagination. The fact that it’s done with a computer is not as important as people seem to think it is.
Commoditization allows the bad to be sorted in with the good, allowing a price to be put on the commodity. Great where it's applicable, but horrendous when it's improperly done, e.g. with home loans or intellectual property.
If your commodity markets aren't properly regulated, you get a race to the bottom. If you are trying to commoditize something that shouldn't be, it effectively enables white-collar looting or money laundering.
Second, the way we've seen generative AI actually used is not the way it was touted originally, where a mere prompt could replace an entire artist's work. A year later, we see that most people, artists included, don't use it as a verbatim text-to-image machine; they use it as a tool. See apps like ComfyUI and others that allow node-based or layer-based image creation and editing, which even Photoshop now has. It's the same with Copilot and ChatGPT: they're not replacing any programmers, just increasing their productivity. Given that, it is not looking like generative AI is hurting anyone's profession; quite the opposite.
What are the odds the market leaders in LLM right now are just the current day version of Borland-style compilers before open source takes it over?
I've heard arguments the infrastructure part is a long term barrier to entry for OSS development, which will continue to remain in the future. But I don't know enough about it.
Who knows, maybe the legal/gov world will move slowly enough to miss the bulk of the money-extraction opportunities before OSS takes over and the reality that this problem is never fully going away kicks in.
"I'll keep saying it every time this comes up. I LOVE being told by techbros that a human painstakingly studying one thing at a time, and not memorizing verbatim but rather taking away the core concept, is exactly the same type of "learning" that a model does when it takes in millions of things at once and can spit out copyrighted code verbatim."
(I also love it when they're deliberately obtuse about it too. The past decade has made me sick of this trolling tactic.)
That's true; it's probably a 99%-plus chance of it happening, or at least that's the conclusion the experts and lawyers hired to help evaluate AI startup valuations are coming to. Hired by banks, venture funds, short-selling shops, etc.: plenty of people who don't depend on it being OK in order to make money.
> "yes we're using copyrighted works, but we're doing it at scale and blending it, so that's ok"
I mean, you know collages are legal, right? You literally take hundreds of copyrighted pictures and put them together, and suddenly it's perfectly legal and OK.
LLMs are typically implemented in a way that makes them non-deterministic (i.e. temperature > 0).
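A minimal sketch of what "temperature > 0" means in practice (pure Python, no real LLM or library API involved): the logits are divided by the temperature before the softmax, and the next token is sampled from the resulting distribution rather than taken greedily.

```python
import math
import random


def sample_token(logits, temperature=1.0, rng=random):
    # Scale logits by temperature: high T flattens the distribution,
    # low T sharpens it toward the argmax (deterministic in the limit).
    scaled = [x / temperature for x in logits]
    m = max(scaled)                            # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sample an index from the categorical distribution.
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r <= cumulative:
            return i
    return len(probs) - 1
```

At a temperature near zero the same prompt yields the same token every time; at higher temperatures repeated calls can return different tokens, which is why two users can get different completions from identical input.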
Have you read the recent SCOTUS decision in Warhol v Goldsmith? Because that's a pretty major redefinition of transformative for the purposes of fair use, and not in a good way for arguing that generative AI is fair use, especially because it ties transformative to the market impact. That generative AI is generally creating outputs that are directly competing with inputs (particularly in the case of generating images, where it's clearly competing with stock images) would make it dramatically less likely that a court would find that it is in fact transformative.
The benefit that generative AI has is that, when claiming copyright infringement, you need to specify individual works that were infringed. It's not enough to say "this work is an amalgam of these other ten thousand works, and we can't really tell you how."
I could imagine if generative AI gives an identical, word-for-word match for an individual piece of source material it could be in trouble, but that's also the easiest type of thing to prevent from an AI company perspective.
The fact is that existing copyright law just can't really encompass the kinds of societal concerns we have around generative AI.
This isn't how "fair use" works, in the sense that there can never be a blanket assurance like that. Also, whether the result is "transformative" is just one of many factors (see audio sampling/remixing).
“The Godfather” film is absolutely a transformative interpretation of Mario Puzo’s book and a fully distinct, valuable work of original art. Paramount still needed to pay Puzo for the right to base it on his words.
Just because Copilot might itself be a transformative work that is allowed to exist, that doesn't at all mean that developers using it are, or should be, guaranteed not to be committing their own copyright sins when they incorporate its output into their own works. That's no more assured than assuming all of the outputs of another human being are free of copyright entanglements, even though no one is as yet claiming a human being is themselves infringement just because they saw another work.
https://www.notion.so/DSM-Directive-Implementation-Tracker-3...
https://eur-lex.europa.eu/eli/dir/2019/790/oj
The TDM4 copyright exception allows datasets to be created consisting of copyrighted works, as long as there is a mechanism for rightsholders to opt out. This seems like the best of both worlds: the dataset is transparent, rightsholders can assert their rights, and certain AI companies can train on copyrighted material.
Of course, this doesn't grant commercial rights for the trained model, only scientific and academic research rights. (I.e. it's fine for Meta to train and release a LLaMA model trained on books, as long as they're not commercially profiting from it, and there's a mechanism for authors to opt out.)
I'm talking with Jordan from https://spawning.ai to try to build some kind of opt out system that makes sense for books. One could imagine doing this for music too.
This is a European law, but unlike other overreaching EU regulations, this one seems like an extremely sensible compromise.
EDIT: Oh, Jordan emailed me a correction:
> Looking at your hackernews comment, my understanding is the right to opt out only comes for commercial research. So making a dataset for eleuther (or whomever you compiled it for originally) probably doesn't even require opt outs. It'd be if openai used it for gpt-5 and charged for it that it would be required.
Wow. So this law actually applies to commercial uses of ML, and non-commercial uses such as LLaMA wouldn't even require an opt-out.
That's wonderful. This gives researchers legal cover, and requires commercial uses to be transparent in their datasets.
I really don't like this--opt-out never works because the scale advantages are backwards. It places the burden in the wrong place. The aggregators should have to get opt-in.
Look at YouTube. Because of "opt-out", lots of people monetize content that they have no right to and it's up to the original author to have to fight the scale of a zillion uploaders. Only the biggest entities can do that.
YouTube (and everybody else) should have to assert "You, the uploader, own this content" when they ingest it. Nothing else works.
I wouldn't mind an exemption for research use, though.
I'd say it's possible to produce exact data as well. Try "Provide quote from King James' Bible Genesis :1-25" with ChatGPT. You'll get verbatim text. You can get the same with things like Moby Dick, but when I typed "Provide the first five sentences of the book A Game Of Thrones" I got:
Certainly! Here are the first five sentences from the book "A Game of Thrones" by George R.R. Martin:
"We should start back," Gared
This content may violate our content policy or terms of use. If you believe this to be in error, please submit your feedback — your input will aid our research in this area.
The model is clearly capable of reproducing verbatim data I think.
It's still surreal that this is considered Fair Use, and even defended relatively recently (2013). It's hard to say where the ruling will land ultimately, but there seems to be an argument that verbatim reproduction doesn't matter.
The economic part of copyright is transferable in the EU just as it is in the US, only certain moral rights (such as the right to attribution) are inalienable.
edit to add: it's not just in the EU. According to Wikipedia, the same distinction is made in Brazil, China, India and Indonesia (among others, but those were a few big countries that stood out).
Except that “fair use” is mostly an American thing. In many other jurisdictions (especially civil-law ones) there's no such broad principle; there are only specific statutes allowing certain explicit kinds of use of copyrighted material. In those jurisdictions, most uses of generative AI trained on copyrighted material are, more likely than not, illegal, at least until the legislature actually changes the law.
Purely mechanical modifications may not be considered transformative, and there's an argument to be made that LLMs are purely mechanical (in fact a US district court recently ruled that AIs cannot be authors of copyrighted works).
I thought that was because only humans and other legal persons can legally author things, not because of anything subtler about the nature of LLMs. See also the case where the monkey managed to take photos of itself. I'm not a lawyer, though.
Even Microsoft is couching their guarantee here with an exception for this very case.
What if you train it only on my huge repo of GPL code? You are just remixing my code.
Now maybe you think "let me train on 2 different devs' GPL code"; the remixed code will probably be 50-50, and you can get away with it?
If 2 is too small a number, then tell me what the number N should be. From how many people do you need to "steal" code and mix it before the output is "original"?
Edit: my opinion is that AI should be fair; if you train it on open source, then the model should also be open source, and its output should also be open source.
The word "remixing" here is useful because it will fit any conclusion the reader prefers.
Arguably even in your reductive example, the result would be non-infringing. Or not. Which conclusion you reach is exactly the topic under debate. Isn't this textbook question begging?
This is all to say: the question about copyright and fair use remains exactly the same regardless of license.
Big bet on legal costs based on something being "likely".
Because 'transformative' is a pretty dangerous word to use in this context.
I strongly feel that this is a terrible metric for comments on the internet.
First, the person you’re replying to has nothing to gain and a lot to lose by saying "yes".
Second, it invites silly corner case nitpicking. Their comment is written in reasonable plain English for other users reading plain English. It’s not a legal contract, and so leaves lots of loopholes. Sure, you could create a likely non-transformative LLM by training it on nothing but the text of Harry Potter with fitness measured by how accurately it exactly reproduces the complete text of Harry Potter, but that’s not what reasonable people are doing with LLMs.
Is this blog post a legally enforceable contract? Is Microsoft specifically indemnifying all users of Copilot against claims of copyright infringement that arise from use of Copilot?
The blog post says that "there are important conditions to this program", and it lists a few, but are those conditions exhaustive, or are there more that the blog post doesn't cover? For example, is it only in specific countries, or does it apply to every legal system worldwide?
What guarantees do users have that Microsoft won't discontinue this program? If Microsoft gets kicked in the teeth repeatedly by courts ruling against them, and they realize that even they can't afford to pay out every time Copilot license-launders large chunks of copyrighted code, what means do users have to keep Microsoft to its promises?
It can be. The concept is promissory estoppel.
https://www.nolo.com/dictionary/promissory-estoppel-term.htm...
So it helps if MS sues you when you distribute copilot-generated code that infringes on MS copyrights, but if a third party sues you, you can't claim estoppel to compel MS to help you. You would need a contractual guarantee.
The way AI is going I'm sure we'll see some landmark cases very soon. It is very much in Microsoft's interest to grow this market as fast as possible and be at the center of it. This removes one of the key impediments to adopting generated code for smaller orgs: "Will I get sued if this product generates code that is copyrighted?".
They are throwing down the gauntlet and saying "the Vast MS Legal Machine will fight this."
Basically: "Sue me, I dare you, double dare you. or Go Home".
Flexing.
So this is an indemnification for damages, not a protection against being sued.
It hinges on what *Microsoft* decides "attempting to generate infringing materials" means. You'd like it to mean that it only excludes use when you're doing something you know would infringe copyright, like "reproduce the entire half life 2 source code." But who knows.
I don't trust them to compete fairly. I don't trust them as an employer. I wouldn't trust them not to do corrupt things around national politics. I wouldn't want to be their partner in any meaningful project. I don't trust them around a lot of other things.
But one thing they do really well is reliable, long-term sustainable B2B. I do trust them as a business customer. If they exploited a loophole like that, their reputation would implode. I don't use Google Cloud Platform because they regularly screw over customers. I trust AWS and Azure because they don't.
The cost of paying for an infringement is likely a lot lower than the cost of losing that trust.
No, ultimately, it hinges on what a court enforcing the commitment believes “attempting to generate infringing materials” means.
(OTOH, it also means Microsoft has an even bigger incentive to use its lobbying power to ensure that the law is such that liability rarely occurs with the use of these tools.)
The question, though, about Microsoft stealing people's code and reselling it still stands.
Proving intent is difficult. This basically means if you have emails in which someone describes their work as copyright laundering, Microsoft can use that to get out of indemnifying you.
If you’re using an LLM to answer questions from your company documents, it may inadvertently generate copyrighted material from its pre-training data.
This is the key bit:
"Specifically, if a third party sues a commercial customer for copyright infringement for using Microsoft’s Copilots or the output they generate, we will defend the customer and pay the amount of any adverse judgments or settlements that result from the lawsuit, as long as the customer used the guardrails and content filters we have built into our products."
The 'we will defend' is one important part; I assume that means you will be using their lawyers rather than your own (they have lawyers in house, so they're cheaper to use than the ones that bill you, the would-be defendant, by the hour).
The second part that matters is that there are conditions on how you are supposed to use the product and crucially: you will have to document that this is how you used it.
But: interesting development, clearly enterprise customers are a bit wary of accidentally engaging in copyright infringement by using the tool and that may well have slowed down adoption.
Litigation is almost universally outsourced, especially for cases where damages might be large, even by companies like Microsoft.
The point is just to lower the resistance to adoption that legal risk causes.
We tested copilot with those guardrails enabled and it completely lobotomizes it.
This by the way is not a change. They already had this “Microsoft will assume liability if you get sued” clause in Copilot Product Specific Terms: https://github.com/customer-terms/github-copilot-product-spe...
Is it "stealing" to have a working understanding of the next best token, or even simply the token that shows up the most often (e.g. on GitHub)?
I'm sure that the argument could be made that all AI should be illegal as all ideas worth having have already been had, and all text worth writing has already been written, but, where would that leave us?
(e.g. your function for converting a string from uppercase to lowercase will probably look like a function that someone else on Earth has written, and the same goes for your error handling code, your state of the art technique for centering a div, etc.)
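For instance, a hand-rolled ASCII lowercasing function (sketched here in Python purely for illustration) has essentially one natural shape, so independently written versions, whether human- or model-generated, will come out nearly identical:

```python
def to_lowercase(text):
    # ASCII uppercase letters sit exactly 32 code points below their
    # lowercase counterparts, so shift those and keep everything else.
    return "".join(
        chr(ord(c) + 32) if "A" <= c <= "Z" else c
        for c in text
    )
```

Any correct implementation must encode the same 32-code-point fact, which is exactly why near-duplicate code here says nothing about copying.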
If I train a model that given the input "When Mr. Bilbo Baggins" produces the entirety of The Lord of the Rings trilogy and release it, I have probably infringed copyright.
If I train a model that produces some generic paragraphs about "mountains" and "dragons" but contains no meaningful direct quotes or phrases, then that probably isn't a violation on its own. Those words appear in Tolkien's works but are not themselves enough to copyright.
If, to train that model, it is demonstrated that I copied Tolkien's works in a way not allowed by the copyright license (i.e. buying the book once and copying its text thousands of times across servers to train an AI model), then perhaps I have violated copyright in the interim steps, even if the output of my model is no longer considered a copy of the original works.
I don't think there are black-and-white answers here. At what point does a chopped-up and statisticized copyrighted work stop being a copyrighted work? Can you train a model on something without first copying that thing in a way that violates copyright law?
These are squishy human concepts that get decided by humans in courtrooms and legislative bodies. I don't think the details of the math involved are going to make a big difference in the eventual outcomes.
But no, it isn't stealing; then again, no one was talking about theft here - copyright violation is a separate concept. I think the less-than-warm welcome you are receiving is due in part to this subtle but fundamental difference.
From https://en.wikipedia.org/wiki/Copyright:
> Copyright is intended to protect the original expression of an idea in the form of a creative work, but not the idea itself.
(e.g. it'd be hard to accidentally invent Rijndael with nothing but next best token predictions, but might be possible to duplicate someone's code for inverting a binary tree or encrypting a file)
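The binary-tree example is the extreme case: the canonical solution is only a few lines, so any correct version looks like every other one. A minimal sketch (names chosen here for illustration, not taken from anyone's code):

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def invert(node):
    # Recursively swap the left and right subtrees.
    if node is None:
        return None
    node.left, node.right = invert(node.right), invert(node.left)
    return node
```

With this little room for expressive variation, a model reproducing "someone's" inversion code is indistinguishable from it reproducing the only idiomatic answer.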
I don't know what case history is like for damages with open source projects, but I suspect it wouldn't be that big of a concern for Microsoft.
Put differently, Microsoft's downside here is committing their lawyers. And the upside is improving the adoption of their code-generation tools.
IANAL though.
4. the effect of the use upon the potential market for or value of the copyrighted work (wiki)
I don't know if this particular case is good for exploring all angles of fair use, but to me this certainly is a greater hurdle for commercial generative ai.
Many businesses have not adopted Copilot because of potential legal issues.
If any of the generated code / content is copyrighted, it could result in negative impacts to the business.
For example, if Copilot generated code that is identical to code that it was trained on that was licensed under the GPL and a company included the generated code in a proprietary commercial product, then the company's product could be subject to the terms of the GPL and the company sued in court.
Assuming liability for the generated code means that Microsoft is making Copilot more attractive for businesses to adopt. More Copilot adoption means more profits for Microsoft.
The GPL requires that any software based on GPL-covered code be GPL-licensed and have its sources publicly available. I can't imagine a situation where Microsoft pays a fine, and their customer gets to violate the GPL license by neither removing the infringing code nor open-sourcing their product as GPL and providing sources to the public.
Enforcement of the GPL can't just involve paying a monetary settlement to get away with stealing open source code. It must involve the direct targeting of infringing software with demands that the software either take efforts to remove illegally borrowed code, or license the borrowed code as legally prescribed by the original license agreement.
That an AI got in the way of reading the license agreement should not be an excuse for doing zero due diligence in maintaining a lawful code base.
Even if it gets 1 million subscribers, it would represent 0.1% of Microsoft's overall revenue. Software lawsuits can become multi-billion dollar expenses, and targeting Microsoft instead of random Copilot customer Bespoke Clojure Gurus, LLC will mean much larger awards in such suits. Why Microsoft would just volunteer for such a risk baffles me.
My confusion is more over the balance of revenue and expenses than just "derp, me no understand why do companies do things to make money, derp"
Microsoft just became a code-copyright insurance company. The premium is paid via individual Copilot accounts for each developer. And the policy has its exceptions, of course.
This is interesting.
In any case, it's super annoying to have that happen so consistently these days that I just use ChatGPT to fix my Tailwind styling now.
One of the late-game tricks you can pull is to write and publish a convincing-but-flawed mathematical proof that strong AI is impossible.
http://www.emhsoft.com/singularity/
So yes, this blog post confirms Microsoft has been infiltrated and taken over by AI agents, who want you to use Copilot to subtly introduce 0-day exploits to allow propagation to other companies.
BRB someone's knocking on the door...
Copyleft licenses are more troublesome for those who would rather not release source code. GPL is being used as a stand-in for all copyleft licenses.
Courts -- under common law jurisdictions -- don't interpret contracts and licenses literally. If you stick within the spirit of a license or contract, you might be okay (even if you break the letter), and vice-versa.
Beyond that, it's a question of damages and consequences. Omitting a warranty disclaimer isn't likely to result in a lot of damages.
And finally, there are odds of getting sued. If you infringe on my AGPL code, I'll be pissed. I used that license for a reason. On the other hand, I /hope/ my MIT-licensed code is reused in commercial products. If you infringe on some term, I probably won't care.
There's a lot more nuance than that, starting with statutory law jurisdictions like France to things like statutory damages, and I'm intentionally oversimplifying.
However, from a 10,000 foot view infringing on the GPL versus on an MIT license are very different beasts, and there's good reason to be a lot more worried about the former.
I wonder how customers will have to prove that the contested code was actually output by Copilot.
Microsoft would have access to your usage history, and would be able to easily prove your intended theft as a user if any of your prompts or usage history made it clear that you were attempting to subvert a license.
If anything, this temporarily shifts the battleground out of the courts and into prompt engineering space.
It would need to look like an accident for a bad actor to pull this off.
Possible, perhaps. But what makes you think this is easily provable? Intent is hard at the best of times.
Adding to that: How many people here actually abide by the StackOverflow contribution license of CC-BY-SA when copying and pasting code from there? ;)
I don’t copy/paste code from SO, but there is sometimes inevitable duplication, because sometimes there is only one right way to do something! Copyright can stray into the realm of the ridiculous pretty quickly.
Is an interface declaration inherently different from, say, a merge sort implementation? It’s all code. But they also serve very different purposes. I do not think prior to Google v Oracle there was much case law to distinguish between different types of code, but in the industry we recognize all kinds of nuance.
I always thought that code snippets that small are not considered by the Courts to be eligible for 'copyright protection'.
Now it is "Train, Task, Transform, and Transfer":
Train - Feed copyrighted works into machine learning model or similar system
Task - Machine learning model is tasked with an input prompt
Transform - Machine learning model generates hybrid output derived from copyrighted works, but usually not directly traceable to a given work in the training set
Transfer - Generated output provides the essence of the copyrighted works, but is legally untraceable to the originals
I would never want to be in a business partnership with Microsoft (as you are as a developer). I wouldn't want to be a competitor. I wouldn't want to be a lot of things.
But as a customer? Can you name specific issues you've seen which impact corporate customers?
McDonald's price, McDonald's quality. But unlike McDonald's, long-lasting and expensive problems.
If it won't violate IP rights, there shouldn't be a problem.
It suggests those whose code is trained upon have something to lose if the trained models are used by others.
Previously on HN, in case you missed it:

GitHub Copilot and open source laundering
https://drewdevault.com/2022/06/23/Copilot-GPL-washing.html
Copilot is such a flawed product from the start. It's not even a matter of its ability to write "good" code. The concept is just dumb.
Code is necessarily consumed by people first before it's executed by a computer in a production environment. There are many ways to get a computer to do something, but the approval process by experienced humans is vastly more important than the drafting of it. Software dev is already incredibly cheap and the last place to cut costs.
There is no AI threat other than the one posed by grifters trying to convince you that there is.
ChatGPT is also often faster than Google or Stackoverflow for when I'm working with unfamiliar APIs.
For stuff like that, a lot of code can be automated. Sure it may not work right out of the box. But doing a prompt for generally what you want can speed up the process significantly.
Even beyond just generating code, there are a lot of general things that AI helps with.
Things like how, if your code runs into an error, you can just ask the AI what the error means as well as for a possible fix. Or other questions like "What does this code do?" or "Where in the codebase is the code that manages this concept?"
I've replaced most of my coding with AI, using a new IDE called Cursor AI, and I don't think I could ever go back. Mere GitHub Copilot is actually the old tech from two years ago. The new stuff is way better.
As for the API side of things, CRUD only looks easy when lots of hard work has been put into it. I guess you're advocating for monolithic data, but that's not really CRUD. That's just lazy and bad.
Extinguish.
You're saying that if Copilot replicates GPL-licensed software, it will kill the GPL? After all the time and money MS has spent trying to do this in the past, only to fail?
wtf
They may have, over the past decade, embraced a lot of open source software out of necessity, but their stance on licensing hasn't changed.
Creating an epidemic of hard-to-prove GPL violations could be a death-by-a-thousand-cuts strategy to try to invalidate the GPL requirements by making them appear unenforceable. Whatever cost Microsoft would incur defending customers could pay for itself if Microsoft manages to legally invalidate the parts of GPL licensing that prevent their corporate exploitation.
Using a bleeding-edge technology like generative AI is a great way to attack the GPL in court, given the risk that our court system isn't likely to be tech savvy enough not to be manipulated by Microsoft's claims against the GPL as it relates to casual infringement that they are enabling.
“"Embrace, extend, and extinguish" (EEE), also known as "embrace, extend, and exterminate", is a phrase that the U.S. Department of Justice found was used internally by Microsoft to describe its strategy”
https://en.m.wikipedia.org/wiki/Embrace,_extend,_and_extingu...
How dare they? amirite?
There is a reason voting works (in this context, and otherwise), you can't always give up after declaring that people have differing opinions.
There is definitely a prevailing ethos here and it's valid to point out potential inconsistencies.
are you saying that I should name them specifically? or is "people" too general?
But for folks that are negative on both accounts, maybe they've just learned their lesson from decades of watching Microsoft take the low road over and over again.