This tool remains the equivalent of money laundering for violation of Open Source licenses (or software licenses in general).
If the model was trained entirely using code that Microsoft has copyright over (for example: the MS Windows codebase, the MS Office codebase, etc.) then they could offer legal assurances that they have usage rights to the generated code as derivative works.
Without such assurances, how do you know that the generated code is not subject to copyright and what license the generated code is under?
Are you comfortable risking your company's IP by unknowingly using AGPLv3-licensed code [1] that Copilot "generated" in your company's products?
[1] https://en.wikipedia.org/wiki/GNU_Affero_General_Public_Lice...
This would not risk your company's IP.
To me this whole thing is like Pandora's box, and it will not in any way be put back. In the long run, isn't arguing about the code it generates and how it generates it mostly tilting at windmills? I've already met new / junior programmers who have used Copilot and ChatGPT to help them see how to approach certain problems, or to get better framing for what they couldn't quite put into the most accurate words to google.
I too would prefer these tools embody the ideal: no license violations, perfect citation of where the archetypes of the code came from. I've commented here today (amongst some great FOSS software engineers) to see if a genuine, respectful conversation can be had about how, just like torrents, this one isn't going back in the box no matter how many legal precedents attempt (or succeed) at cutting off heads of the hydra. Its utility seems like it will steamroll any attempts to stop or slow it down.
Am I wrong? Is it a fool's errand to ask?
What? I don't see any utility outside of education and even there it's pretty sketchy.
For business, legal compliance is not a joke and instantly shuts it down. The only businesses willing to use ChatGPT for generating code would be naive young startups who don't realize some assembly is still required and the instructions are missing no matter how much they query the bot. That's called expertise (which they don't yet have). It's not good enough to just write the code. Someone has to comprehend it so they can tweak it as needed. At some point the tweaks will become unwieldy and require actual software engineering that the bot doesn't know how to do (transform from one design pattern to another and know which to use). More power to them if they can cobble something together and then succeed at maintaining it. By the time they're through they'll have pulled off so many miracles that they won't need the bot anymore and become experts. That's quite the trial by fire, but hey everyone has to find their way!
I'd have no objections to a tool that generated suggestions that came with attributions and license metadata, ready to insert into your project's file for third-party licenses. AI code suggestions are impressive.
I have objections to a tool that generates derived works from code without respecting the licenses of that code. For permissively licensed Open Source code, including that code without attribution deprives authors of their due credit (said credit often being how people get employment or funding). For copyleft Open Source code, including that code without using a compatible license violates the conditions upon which people made that code available for others to build upon and share. For proprietary code, including that code at all incurs legal risks.
We’ll hang back until other companies have litigated their way to some legislation around it.
No matter what is said, there are no license guarantees on the generated code, as you don’t know the exact provenance, so it seems only sensible to be on the safe side.
I would prefer to see full license attributions included in generated responses, though. And from that, it shouldn't be difficult to generate a licenses file, should it?
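To illustrate how little machinery that last step would need: assuming the tool handed back per-suggestion attribution records (the field names here are my invention, not anything Copilot actually emits), turning them into a third-party licenses file is a few lines of Python.

```python
# Hypothetical per-suggestion attribution metadata. These field names are
# assumptions for illustration, not an API any real tool provides today.
attributions = [
    {"project": "left-pad", "author": "Jane Doe", "license": "MIT",
     "url": "https://example.com/left-pad"},
    {"project": "fastjson", "author": "John Roe", "license": "Apache-2.0",
     "url": "https://example.com/fastjson"},
]

def build_licenses_file(records):
    """Render a simple third-party licenses file from attribution records."""
    lines = []
    for r in sorted(records, key=lambda r: r["project"]):
        lines.append(f"{r['project']} ({r['license']})")
        lines.append(f"  Author: {r['author']}")
        lines.append(f"  Source: {r['url']}")
        lines.append("")
    return "\n".join(lines)

print(build_licenses_file(attributions))
```

The hard part is producing the attribution records in the first place, not formatting them.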
Amazon's CodeWhisperer has a "reference tracker" that tells you the license of training data code if the generated response is within some similarity threshold, but that's still not good enough imo.
Exactly. By all means build tools like this, but build them to actually comply with Open Source licenses. Provide a list of the licenses you don't mind copying from, and get back attributions with your suggestions.
I don't think it's possible to do better than that with this technology.
The result looks like my own code and uses the already existing parts of my application. The code it writes for me solves problems that you cannot find a standard solution for anywhere and is definitely not something that could be attributed.
How Copilot is trained is an issue but answering the question "where did this code come from and what license is it under" would be impossible.
Not at all. I'm saying it's derived from large amounts of code, without respecting the licenses on that code.
> How Copilot is trained is an issue but answering the question "where did this code come from and what license is it under" would be impossible.
Then it shouldn't exist outside of demos of what could exist in the future if the showstopper legal problem gets solved. Let's get people treating that constraint as business-critical and start coming up with clever solutions, and see how long "impossible" lasts.
I understand that some folks don't believe the same thing, and use copyleft licenses so that their code can't be re-used in a closed way, and that's fair. Github shouldn't be training their product on copyleft licenses.
It's fair to call out its misuse of certain licenses, but "the equivalent of money laundering for violation of Open Source licenses" is simply inaccurate, as many licenses allow this type of re-use explicitly.
If you do, then by all means they're welcome to use it without attribution or preservation of copyright notices, per the terms of the license you used.
But for all the Open Source code, even permissive Open Source code, that does require attribution or preservation of copyright notices, that's still a license violation. People don't often think of permissive Open Source licenses as something that can be violated, but they absolutely can be.
Double-checking whether the generated part is a verbatim copy negates the speed advantage.
Possible infringement via similarity rather than verbatim copying is even harder to search for.
Was looking for a way to instruct Copilot to abide by the following rules:
- Only use Apache v2, MIT or BSD licensed work for its recommendations. (Or a specific license set)
- Only use code trained on public repositories.
- Provide code attributions of the source code where the recommendations originate from.
I'm not sure if the last point is possible given these GPT type architectures but it would really help during code reviews.
Even if you use code under these licenses, you are still supposed to credit the authors by reproducing the license. So you need to know where it came from. Do you credit all the software the model was trained on?
That's what a good chunk of people do anyway at work. No one really cares, nor will care. We were already moving in that direction anyway; this will just accelerate it.
Disclaimer: I'm from the Codeium team. But really, we will even ship you a physical box if that level of data security is important to you.
I'm sure it's probably still very useful for people who care about the privacy tradeoff, but I've had more success with ChatGPT.
We are fans of ChatGPT and think that ChatGPT is pretty complementary to tools like Copilot and Codeium. ChatGPT is helpful for longer form exploratory questions from natural language while Codeium in its current form is great to accelerate your coding.
Without that, I can't even entertain the idea of using an AI code tool for anything but private projects that I don't share with anyone.
Edit: Also, Notepad++ support would be awesome
* Simple license management
* Organization-wide policy management
* Industry-leading privacy
* Corporate proxy support
Wow. Who’s going to pay a 90% premium for these features?
Edit: OK seems like different marketing pages have different features. The list above comes from https://github.com/features/copilot/. Still seems like a very steep increase over the base. And I cannot believe there are only 400ish companies using copilot.
From here: https://github.com/customer-terms/github-copilot-product-spe...
I can’t even imagine the hilarity that would ensue if I went to my GC right now to ask permission with so much in limbo; it’d be suicide by conference call.
> Copilot for Business does not retain any telemetry or Code Snippets Data.
https://docs.github.com/en/copilot/configuring-github-copilo...
I wouldn't risk it. It is too easy to write the wrong prompts and leak PHI.
ChatGPT:
"Write me a parser for this HL7 message..."
Copilot:
"Using this example message please write a parser for it..."
Yeah... If it was compliant, people would write those in a heartbeat.
Unless sold as HIPAA compliant, and the conditions of use for that compliance are known... don't trust it, for SAAS.
This is stuff covered in your yearly HIPAA briefing folks.
1. First, you really only need to worry about the specifics of HIPAA if you are a "covered entity" under the law (primarily a hospital or other healthcare provider, or a health insurer), or if you have signed a BAA with another company (more on that below). There are all sorts of misunderstandings that you can't, for example, say something like "Jane couldn't make the meeting because she's out with the flu" at a company - that's not how it works. Unless you're a covered entity, you're under no obligation to keep PHI private under HIPAA.
2. If you do work at a HIPAA covered entity, it usually is made explicitly clear where patient data is or is not allowed. Even if GitHub Copilot were "HIPAA Compliant", unless they signed a HIPAA BAA (business associate agreement) with your company, it's still not OK to send them any PHI.
Point being, there are plenty of reasons to be worried about customer privacy and data security, but people like to bring up HIPAA rules in lots of situations where they simply don't apply.
I doubt any company would use this in their production code. Internal tools, maybe.
It's amazing at specific things, like being a context-aware boilerplate generator or doing the scaffolding of an algorithm from a comment describing it.
That's really different than using libraries.
I am using it within JS/Vue and Python and really enjoy it a lot. You can write a simple comment like "upsert the object in the store, and if the item is incomplete recursively call a refresh fn until it is complete" and it will "magically" do the rest - down to understanding where the objects I am referring to are in my store, the attribute used to decide this ('completed_at'), down to the correct syntax for updating the data in a way that plays nice with vue reactivity.
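For readers who haven't tried this workflow, here is a rough Python analogue of the behavior that comment describes. The store shape, the stand-in `refresh`, and the object fields are illustrative; only the upsert-then-recursively-refresh pattern and the `completed_at` convention come from the comment above.

```python
store = {}  # keyed by object id, standing in for the app's store

def refresh(obj):
    """Stand-in for a backend fetch that fills in missing fields; here it
    simply marks the object complete so the example terminates."""
    return {**obj, "completed_at": "2023-01-01T00:00:00Z"}

def upsert(obj):
    """Insert or update obj in the store; if it is incomplete (missing
    'completed_at'), refresh it and recurse until it is complete."""
    store[obj["id"]] = obj
    if obj.get("completed_at") is None:
        upsert(refresh(obj))

upsert({"id": 7, "completed_at": None})
```

The impressive part in practice is that Copilot infers the store location and the `completed_at` convention from the surrounding project, which no snippet this small can demonstrate.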
It's also a stellar autocompleter in Python-land. I have been using more and more type annotations in my codebases, but even without that, it will usually guess the right attribute or function name.
I also dig the way it will automatically write a docstring for a function. It can sometimes be a great debugging method since I can just have it comment an undocumented fn to quickly glean what it is doing. I'll circle back to enhance the docstring usually too so it is less cookie cutter.
For writing blog posts it is really neat too, because it will help me write functions to illustrate certain points based on what I am trying to teach in the post. Most of the time I can come up with the most succinct and relevant example, but sometimes I cannot and this tool does a good job helping there.
It's not perfect, but the fact that I can type a few words, hit tab, and then correct any mistakes is really a magic experience sometimes.
Have any big companies set policies on employees using these kind of tools? Do they allow them?
My company (big UK-based tech company) had an “all employees” sort of e-mail saying the use of Copilot, ChatGPT et al was not allowed for anything work-related or using company equipment due to unclear licensing model of the generated code.
I find many of these blanket rules silly, but in this case I find it is sensible to wait before polluting our products with this autogenerated code.
I'd say no thanks. I think programmers who use Copilot are paying for something that'll hurt them in the long run for a tiny benefit in the short run.
I don't trust Microsoft, and neither should you.