Tencent WeChat is now a GitHub secret scanning partner

github.blog

154 points5amdotis3y ago141 comments

dang3y ago

Lame corporate partnership announcements aren't on topic for HN, and the wording here looks to have been a boilerplate malfunction: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que....

Poor functionary creates political incident with humble template...sounds like a Gogol short story. "but it worked great for redirect.pizza!"

Btw I assume this recent thread was about the same feature:

Secret scanning is now available for free on public repositories - https://news.ycombinator.com/item?id=34007637 - Dec 2022 (70 comments)

gpjanik3y ago

This is simultaneously an epic clickbait and a very accurate representation of reality that is very boring and not shocking at all. Congrats to whoever wrote the line.

nottorp3y ago

Brilliant title for the article.

Even though I'm a paid github customer, I had no idea they had a program called "secret scanning" and that it's actually beneficial.

So I obviously assumed they're letting China scan my private repos.

They really need to work on wording.

jasode3y ago

>, I had no idea they had a program called "secret scanning" and that it's actually beneficial.

Fyi... this feature was also previously mentioned in the news for public repos: https://techcrunch.com/2022/12/15/github-brings-free-secret-...

>So I obviously assumed they're letting China scan my private repos.

To clarify, it's Microsoft/Github doing the scanning of private repos on behalf of the partners. They're just forwarding the tokens that match the partners' regexp.

nottorp3y ago

Yeah I read the article and the comments on HN so I know what it's about now. I still think they (not HN) should change the title to include what secret scanning means.

Edit: how about dropping the corporatese and title it "github will now scan public repos for secret WeChat tokens"?

pixl973y ago

Yea, but you don't get panic clickthru with this message.

nottorp3y ago

Actually there's an even better term below in the thread. Use just "for WeChat credentials". No mention of secret scanning.

justaka3y ago

This is 100x better than the original!

iampivot3y ago

Like .* ?

jasode3y ago

>Like .* ?

Assuming your question is not a joke...

The partner has to email the regex to secret-scanning@github.com for their approval. See the steps at: https://docs.github.com/en/developers/overview/secret-scanni...

Once it's in the scanning system, the partner receives JSON message alerts such as:

  [
    {
      "token":"NMIfyYncKcRALEXAMPLE",
      "type":"mycompany_api_token",
      "url":"https://github.com/octocat/Hello-World/blob/12345600b9cbe38a219f39a9941c9319b600c002/foo/bar.txt",
      "source":"content"
    } 
  ]

So instead of "token":"NMIfyYncKcRALEXAMPLE", private repo owners would worry about a '.*' regex leaking full source code rather than API credentials, e.g. "token":"#include <stdio.h>\nmain(){\nprintf(\"hello world\");\n}".

The above scenario requires believing the following:

- Microsoft/Github is technically incompetent and an employee and/or their internal regex sanity checking tool will blindly accept open-ended regex like '.*'

- MS/Github will then allow that unbounded regex to leak petabytes of private source code out to China partners via the JSON "token:" response. (Github says they have 18+ petabytes of data and most of that is private repos: https://twitter.com/github/status/1569852682239623173)

If one believes their entire private repo source code is at risk of being leaked to Tencent via the '.*' threat because the above scenario seems realistic, I assume the answer is to delete the repo.
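A minimal sketch of the kind of regex sanity check assumed above. Nothing here is GitHub's actual tooling: the probe strings, the function name `is_overbroad`, and the `mycompany_` token format are all hypothetical. The idea is simply that a credential pattern should never find a match inside ordinary text or source code.

```python
import re

# Hypothetical probe strings: ordinary content that a credential pattern
# should never match. Illustrative only, not GitHub's real checks.
BENIGN_PROBES = [
    "",
    "hello world",
    '#include <stdio.h>\nmain(){\nprintf("hello world");\n}',
    "def main():\n    return 0",
]


def is_overbroad(pattern: str) -> bool:
    """Return True if the pattern finds a match in any benign probe string."""
    compiled = re.compile(pattern)
    return any(compiled.search(probe) is not None for probe in BENIGN_PROBES)


# An open-ended pattern is rejected; a narrow, prefixed one passes.
print(is_overbroad(r".*"))
print(is_overbroad(r"mycompany_[A-Za-z0-9]{20}"))
```

A check like this only catches the obvious cases; a human review of each submitted pattern would still be needed.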

tonmoy3y ago

I don’t think GitHub will send back the matching string, just the name of the repos

barbariangrunge3y ago

Devil's advocate: I read recently that GitHub is being used to circumvent censorship in China. Does this system of allowing them to provide regexes allow China to automatically obtain lists of users who are mentioning certain words or phrases? Or is that nonsense?

jopsen3y ago

> Or is that nonsense?

Yes, that is nonsense.

1) secret scanning can be disabled (not even sure it's enabled by default). 2) the regexes are fairly specific, length limited, etc. 3) github is obviously reviewing regexes that are accepted.

Check the list of stuff supported: https://docs.github.com/en/code-security/secret-scanning/sec...

A bit sad, they don't publish the list of regexes, etc.

--------------

I added a similar thing to the package manager for Dart / Flutter, because we saw users accidentally publishing secrets. That code is public, it relies on regexes and entropy estimation:

https://github.com/dart-lang/pub/blob/eb8ee21a089ebe0f2c2dd8...

It was heavily inspired by the researchers in: https://www.ndss-symposium.org/wp-content/uploads/2019/02/nd...

Worth a read, and certainly provides motivation for Github to do this kind of work :D

(disclosure: I work for Google. The opinions stated here are my own)
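For readers unfamiliar with entropy estimation, here is a rough sketch of one common way to do it: Shannon entropy over character frequencies. This is not the code from the Dart package linked above; the function names and the 3.5-bit threshold are assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character, from observed character frequencies."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(candidate: str, threshold: float = 3.5) -> bool:
    """Heuristic: flag strings whose per-character entropy exceeds threshold."""
    return shannon_entropy(candidate) > threshold


print(looks_random("NMIfyYncKcRALEXAMPLE"))
print(looks_random("hello world"))
```

Combining a regex match with an entropy check helps cut false positives, since placeholder values tend to score lower than real random keys.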

voakbasda3y ago

I had the same reaction. This seems like the plan to scan pictures on iPhones for CSAM; it would not be hard to add extra patterns that match materials beyond the original intent.

Are the secret patterns all publicly available? Or are the secret scanning patterns themselves secret? Without public review, we cannot know what secrets they will obtain.

I for one do not trust GitHub/Microsoft to act in the interest of the average user. Their past actions disqualify them from receiving any benefit of doubt.

rob743y ago

Yeah, you have to read the article to realize that "secret" is a noun in this case, not an adjective...

stavros3y ago

Seems unbelievable how they'd fumble the title when they could just as easily have called it "secrets scanning" and it would be OK.

Waterluvian3y ago

Or maybe “secret detection”

Though perhaps that’s just my own bias on the subtle differences in the meanings of those words.

crazyedgar3y ago

Maybe it's intentional, to generate traffic.

Cthulhu_3y ago

I think the name of the service is a bit ambiguous; they could've called it "Access Key Scanning" or even just "Secret*s* Scanning". Even capitalizing it would set it apart as a service instead of regular words in a sentence.

pcthrowaway3y ago

Credential scanning.

It's not scanning that they're doing in secret. Credential scanning removes the ambiguity

civopsec3y ago

Scanning repos for secrets has been a thing for a while now. But seeing Tencent might put people on edge.

atonse3y ago

Secret scanning is a thing.

But this is an excellent next step where they build an integration with these partners where, as soon as a secret is scanned, they can notify tencent/AWS/other providers automatically to instantly invalidate those keys before they’re abused.

That’s what’s novel here.

szundi3y ago

China doesn’t even have to scan now, Github is going to sort it out and send it all for us. Sounds bad.

toastal3y ago

There's never been a better time to migrate your projects away from corporate control

gumboza3y ago

Actually the best time expired several years ago. Also prevention is better than cure.

netsharc3y ago

Hmm, 2022 (2023 soon!) and people are jumping and screaming on their first and incorrect reaction to some headline (not you, but scroll down for a comment doing just that). God Bless The Internet! /s

trompetenaccoun3y ago

To everyone portraying this as harmless and as Wechat just looking for security breaches: Tencent itself is the security breach. Not only can Chinese ppl not sign up without providing a phone number, just to get a SIM card they now take your government ID, a picture of your face and a fingerprint! Xi is making absolutely sure that every single internet user is IDed and has their conversations tracked on apps like Wechat. Whatsapp, Signal & co are banned.

These "leaked" secrets GitHub forwards might be dissidents getting access without being tracked. It might not be a WeChat secret at all who knows? They're not a trustworthy partner, nothing should be shared with this company.

And to the folks saying it's public information and they already have it: That makes no sense, then they don't need GitHub's help. Obviously GitHub is supporting their scanning efforts here.

oefrha3y ago

> And to the folks saying it's public information and they already have it: That makes no sense, then they don't need GitHub's help.

GitHub has a global stream API for all public events,[1] but it is delayed by five minutes, precisely so that sensitive actions like revoking leaked tokens can be performed before the world sees them. That’s what the secret scanning program is about, and you would have known if you spent 1/3 of the time of your rant learning about it.

Edit: Additionally, for private repos, secret scanning is opt-in and only alerts owners.

[1] https://docs.github.com/en/rest/activity/events?apiVersion=2...
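To make the "anyone can watch the public stream" point concrete, here is a small sketch that reads the public events endpoint and lists which repositories were just pushed to. The helper names are made up for illustration, and the fetch is an unauthenticated call that is subject to rate limits; the filtering function works on any list of event objects.

```python
import json
import urllib.request

EVENTS_URL = "https://api.github.com/events"  # public events endpoint


def fetch_public_events(url: str = EVENTS_URL) -> list:
    """Fetch one page of public events (network call, unauthenticated)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def pushed_repos(events: list) -> list:
    """Return names of repositories that appear in PushEvent entries."""
    return [e["repo"]["name"] for e in events if e.get("type") == "PushEvent"]


# Offline usage with a hand-written sample instead of a live request.
sample = [
    {"type": "PushEvent", "repo": {"name": "octocat/Hello-World"}},
    {"type": "WatchEvent", "repo": {"name": "octocat/Spoon-Knife"}},
]
print(pushed_repos(sample))
```

A scraper built on this would then have to fetch each pushed commit and scan it, which is the work the five-minute delay is meant to get ahead of.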

anaganisk3y ago

Wait a second, requiring a government ID to get a SIM card is kind of standard practice in multiple countries. Also, when it comes to privacy, US-based companies must be the last ones to talk, as if China is the only bad guy who infringes upon people's right to privacy. China is dangerous, but it's not the only dangerous thing in the room. Also, your comment doesn't make sense. If you are committing your credentials publicly while dissenting against the government, you are doing it wrong. Also, any publicly committed credentials are literally tracked by thousands of bots within minutes. It's not as if China couldn't scan them without GitHub telling them it found something.

trompetenaccoun3y ago

You may have misunderstood. There is no way to anonymously access Weixin from China unless you have hacked credentials. You need a phone number. Note that local Weixin and foreign Wechat are not the same. Last time my Mainland friend bought a SIM card the vendor had a government app on his phone, snapped a picture of my friend's face, scanned the ID (身份证) and had him take a fingerprint with a reader he also had connected to his phone. All this data gets uploaded directly to the Chinese government.

There isn't a country in the world which does this. But the details are also not the main point, it's how extremely restricted and controlled simple access to information or forums of free expression is for people in China. Tencent has party officials working within the company. This isn't a regular business as Westerners might imagine it, it's an extended part of the CCP just like any other large corporation under Xi.

Again, people are saying it's no big deal but why would GitHub help them at all? It's not a good cause.

anaganisk3y ago

Github here isn't supporting the Chinese government; they're partnering with companies that want to provide a regex for their credentials. And I don't know where you hail from, but I'm from India and I have a mandatory government-issued ID card that has multiple biometrics and my photo associated with it. To get a SIM card I need to provide that ID and authenticate with my fingerprints. Also, tell me which US company isn't drooling over China contracts and, by extension, other local hostile activities. There is literally a recent story where, using facial recognition, Madison Square Garden denied entry to an attorney who was related to a company in litigation with its parent company. But yeah, China Bad.

snehk3y ago

> There isn't a country in the world which does this.

My government requires me to have ID, which contains a photo and finger prints and you cannot get a SIM without ID. That's Germany and it's true for many, many countries.

pontilanda3y ago

> There isn't a country in the world which does this

Does what? The thing extra is the fingerprint but literally every modern country requires ID registration and more. My government also knows this IP belongs exactly to me. Stop spouting nonsense.

Plus this is completely unrelated.

jinzo3y ago

This is offtopic (as it has nothing to do with the linked blogpost) but it's even worse. At the tail end of 2019 I went to China for a few weeks, I created a WeChat account at home without any problems. As soon as I stepped into China it got locked and I needed someone with a WeChat account to verify me. They can only verify (I think?) 3 new accounts per year, and 6 accounts that got locked out for whatever reason. This is (from my perspective) even worse than requiring ID for SIM etc. It links people together and I'm sure it brings some repercussions to the people that verified you if you make trouble down the line.

It was very fascinating to see, a near total domination of WeChat everywhere and relatively very hard onboarding for new accounts. Contrary to the west where most services seek to streamline onboarding as much as possible - I guess that becomes an anti feature when you have total monopoly and _everyone_ has a WeChat account. I think it's a very effective (and very dystopian) form of control. P.S: Signal worked without any problems for me, even on a Chinese SIM (one "trick" to go around most of the GFW was buying a HK SIM in HK. Works across China and has a lot less blocks, but for various reasons I got a China SIM too).

aeyes3y ago

This is a service running on public repos, anyone can scrape this which is the problem. GitHub does the scanning and all that is forwarded is the "secret" matching their regex. Tencent then identifies the account owner and informs them about the public secret. That's all.

GitHub is available in China, why shouldn't they protect their Chinese users?

And the SIM card requirements have nothing to do with Tencent, have you tried getting a SIM in Germany? Impossible without government ID and an address. And there are a lot of services which you can't sign up for without German ID / address. As a foreigner I also can't easily open a bank account in the US.

barbariangrunge3y ago

Why do they notify tencent instead of the repo owner?

BigGreenTurtle3y ago

Once a co-worker accidentally pushed an AWS key pair to his public dotfiles repo. About 30 seconds later AWS disabled the key and notified the account admin about the possibility of an account breach.

vxNsr3y ago

This is my question too… why not just let the owner of the repo know, why notify Tencent at all?

sulam3y ago

Without taking away from your first paragraph at all, if any dissidents are publishing their access codes to GitHub repos, they are 1) doing it completely wrong and 2) are already screwed.

The threat here, in the worst case, is associating a GitHub ID with a WeChat ID.

vxNsr3y ago

Quoted from the blog post:

> We have partnered with Tencent WeChat to scan for their tokens and help secure our mutual users on all public repositories and private repositories with GitHub Advanced Security.

This is GitHub scanning private repos and telling WeChat about them.

WeChat can already scan public repos.

They are not already screwed if they’re publishing something to a private repo, it might be the wrong way to do it, but it doesn’t mean they’re already screwed.

If you don’t trust GitHub’s private repo security then why are you using it in the first place?

aeyes3y ago

Wrong, this only applies to public repos.

https://docs.github.com/en/code-security/secret-scanning/abo...

civopsec3y ago

Imagine if someone protested against Finnish–Russian cooperation on search-and-rescue operations near their border because the evil Russian government could be searching for political dissidents to imprison. That’s what your comment sounds like.

This is about preventing things like API keys from being published to code. That’s not a dissident use-case…

quadrifoliate3y ago

While Tencent and Wechat sound absolutely dystopian, the "you need a Government ID and a picture of your face" is often a requirement for creating a Facebook account or retaining your old one as well. Twitter also used to require a phone number to retain an active account; and Google frequently locks people out of old accounts unless they provide a phone.

Is this whataboutism? Possibly – but what I'd actually like to happen is US-based companies are charged company-hurting fines for mismanaging PII like this (Twitter, for example, is currently openly planning to sell user phone data [1] that they previously gathered for security purposes).

All this to say, we can't reasonably call out other dystopian companies if the ones we use everyday are doing the exact same thing. So we should call out secret scanning from Meta [2] and (if it ever happens) Twitter as well.

----------------------------------------

[1] https://www.businessinsider.com/twitter-plans-to-force-users...

[2] https://developers.facebook.com/blog/post/2021/11/09/meta-jo...

derefr3y ago

> These "leaked" secrets GitHub forwards might be dissidents getting access without being tracked.

"Leaked" here means "made public", i.e. "published such that literally anyone can use them", for example when burned into a commit of a public repo. Even for a dissident, publishing an API key or other credential where literally anyone can find it to use it, is almost assuredly a mistake. Because external scrapers can also find it there, such that the key will be inevitably picked up and fed into a botnet to abuse — at which point the ops staff at the service will notice the abuse and revoke the key, thus "burning" it as useful from the dissident's perspective.

If you store a secret on Github somewhere that only you and people you trust have access to, rather than everyone having access to it, then this is not considered a "leak", and so Github does not detect this as a "leaked secret." For example, commit data of private repos is not scanned for secrets (if it was, GitOps as a concept would be impossible!); nor is a repo's formal Actions Secrets store (part of a repo's configuration readable only by triggered Github Actions CI jobs).

Github's own secret-scanning here, is trying to catch the cases where a user has done something stupid by accident. Whether or not they reported secrets to third parties, they'd still be doing leaked-secret scanning of their own Github API keys, to ensure that people aren't accidentally trying to configure Github Actions by burning their Github Actions CI API key into the workflow itself. If they find such keys, they revoke them.

The point of Github's secret-scanning partner program, is that because Github is doing this leaked-secret scanning for their own purposes anyway, you (the partner) can sign up to be told when API keys of yours are accidentally made public as well.

> That makes no sense, then they don't need GitHub's help.

Ignoring for a moment that Github is a website, and so anyone can just crawl it—

Did you know? Github pushes the commit data of all public repos to BigQuery as a public research dataset: https://codelabs.developers.google.com/codelabs/bigquery-git.... Literally anyone can do their own "secret scanning" with a simple BigQuery query. It costs about $500 to run such a query, because the Github dataset is pretty large. It's not a price most SMEs would pay. But it's definitely a price attackers could be willing to pay. It's a lot cheaper than running your own web-spider infrastructure!

The difference with Github's own secret scanning, is that it happens synchronously, on push of commits; whereas the ETL of commit data to BigQuery et al happens asynchronously, some time after commits happen. Tencent, and every other secret-scanning partner, depends on Github to stay ahead of any third-party attackers trying to scrape leaked credentials for use in botnets et al.

Also, FYI, you yourself can sign up to be a Github secret-scanning partner. You just need 1. a regex that uniquely identifies your secrets, so that Github can recognize them on push, and 2. a webhook URL to report them to. (https://docs.github.com/en/developers/overview/secret-scanni...)

And by the way, this isn't a hypothetical nice-to-have. I run an API SaaS — and not one that's even very large, in relative terms. But my own customers' accidentally-leaked secrets have been scraped from their Github repos and used by botnets already! Signing up as a Github secret-scanning partner is on my to-do list.
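As a rough idea of what the partner side might look like, here is a sketch that parses an alert body shaped like the JSON sample quoted earlier in this thread and marks each token as revoked. The `LeakAlert` class, `parse_alerts`, and `revoke` are hypothetical names, revocation is just an in-memory set here, and a real endpoint would also need to authenticate that the request came from GitHub, which is omitted.

```python
import json
from dataclasses import dataclass


@dataclass
class LeakAlert:
    token: str
    token_type: str
    url: str
    source: str


def parse_alerts(body: str) -> list:
    """Parse a JSON array of alert objects into LeakAlert records."""
    return [
        LeakAlert(
            token=item["token"],
            token_type=item["type"],
            url=item["url"],
            source=item.get("source", ""),
        )
        for item in json.loads(body)
    ]


def revoke(alert: LeakAlert, revoked: set) -> None:
    """Placeholder revocation: record the token in an in-memory set."""
    revoked.add(alert.token)


body = '[{"token": "NMIfyYncKcRALEXAMPLE", "type": "mycompany_api_token", "url": "https://github.com/octocat/Hello-World/blob/12345600b9cbe38a219f39a9941c9319b600c002/foo/bar.txt", "source": "content"}]'
revoked_tokens = set()
for alert in parse_alerts(body):
    revoke(alert, revoked_tokens)
print(sorted(revoked_tokens))
```

In practice the `revoke` step would call into the service's own key store and notify the affected customer.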

andrewaylett3y ago

This is part of https://docs.github.com/en/developers/overview/secret-scanni...

It lets WeChat revoke tokens that GitHub finds in public repositories.

mmaunder3y ago

It lets WeChat see tokens that GitHub forwards to them. What they do with it is up to them, but the intent is that they mitigate the issue.

“GitHub will forward access tokens found in public repositories to Tencent WeChat, who will notify affected users.”

vxNsr3y ago

Why did you edit the full quote?

Here’s what I just copied from the blog post without modification:

> We have partnered with Tencent WeChat to scan for their tokens and help secure our mutual users on all public repositories and private repositories with GitHub Advanced Security.

It’s not just public repos, it’s private repos too.

2553y ago

Optics of this article could be improved.

However, this is already a well established and useful thing. When you publish your AWS (for example) secrets to your public repo, it will scan it and stop it leaking before damage can be done. This is just the same for another service.

civopsec3y ago

Why could the optics be improved?

redleader553y ago

It would be nice of Github if they could publish a transparency repo with all the partners and all the regex along with this initiative. I see a lot of people in this thread worried that "China gets their data" and this transparency repo could alleviate some of that.

boredhedgehog3y ago

Why do people worry about China so much? There is barely any cooperation between Chinese intelligence and the rest of the world.

If I was forced to pick one government to share my secrets with, it would be the Chinese, because there's nothing they can do about it. My own government and its allies are infinitely more dangerous to me than such a foreign one.

mylidlpony3y ago

> because there's nothing they can do about it

Are you talking about the China that bought huge areas in ports around the world? The same one that has secret police stations as well?

Alifatisk3y ago

> If I was forced to pick one government to share my secrets with, it would be the Chinese, because there's nothing they can do about it.

Unless you live there or visit. Weren’t there reports of China having concentration camps?

ilyt3y ago

Obviously if someone would pick China it was because they were not planning to visit there

richardw3y ago

Because if you want to influence the voting patterns of a population, knowing as much as you can is useful. Search history + TikTok + FB would give you unbeatable datasets that you could use for the lifetime of the person. Take a dataset now, add a decade of AI progress and I’m pretty sure a nefarious actor would be leading many people around by the nose. Not all, but 5% would move many elections.

They wouldn’t even need to learn all that much about you as an individual. Just enough to match you with a cluster from their own population that they have infinite data on.

didericis3y ago

> If I was forced to pick one government to share my secrets with, it would be the Chinese, because there’s nothing they can do about it.

What makes you so sure about that?

I worry about China because there’s no internal checks to prevent them from doing anything.

Western governments and allies have a long culture of court systems and thinking about balancing constituent needs. That is eroding and becoming more dangerous to the extent western leaders are envious of dictatorial powers and trying to emulate Chinese totalitarianism, but there is a lot of institutional and cultural bulwark against it.

Any powerful totalitarian country should worry people. People underestimate the level of covert aggression in all facets of foreign involvement in regimes with no internal accountability.

dwighttk3y ago

> because there's nothing they can do about it

What makes you think this?

cscurmudgeon3y ago

https://amp.cnn.com/cnn/2022/12/04/world/china-overseas-poli...

Yep totally harmless.

> China operating over 100 police stations across the world with the help of some host nations, report claims

londons_explore3y ago

I agree. For most people here, it is their own government (or allied governments) which are the ones to be most cautious about. They're the ones most likely to ruin your life if they don't agree with what you're up to - elected or not.

dncornholio3y ago

Because American propaganda.

eynsham3y ago

Some people are Chinese.

talkingtab3y ago

You may not have noticed this, but China is seeking to extend its influence. By whatever means required including deadly force. There is a picture you might want to look at. Search "Tank Man". That is why people worry about China so much.

tomudding3y ago

There is a list of partners [0], I thought I had seen regexes at some point but I can no longer find them.

[0]: https://docs.github.com/en/enterprise-cloud@latest/code-secu...

ffpip3y ago

I don't think they would release the regex used to validate the API keys since it would help people automate scanning for API keys of all supported providers on public repos on any other site using the regex given by the provider itself.

civopsec3y ago

Is this blog post not the transparency part?

I assume that the regex is `TC:[a-z0-9]{20}` or something uninteresting like that.
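Taking that guess at face value, a scan for such tokens could be sketched as below. The pattern is only the one guessed in this comment; the real Tencent pattern is not public, and `find_candidates` is a made-up helper name.

```python
import re

# The pattern guessed in the comment above; the real pattern is not public.
GUESSED_PATTERN = re.compile(r"TC:[a-z0-9]{20}")


def find_candidates(text: str) -> list:
    """Return (line_number, match) pairs for every pattern hit in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in GUESSED_PATTERN.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


sample = "x = 1\nkey = 'TC:abcdefghij0123456789'\n"
print(find_candidates(sample))
```

Running something like this over your own working tree before pushing is the local equivalent of what the partner program does on push.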

SalimoS3y ago

I guess because some lazy people will use those regex in all other public git hosts and search engine

nomercy4003y ago

Wait, what?

So any string (which Github deems an access token) is forwarded to Tencent?

Or will Tencent share all their current access tokens with github?

AlbertVAustin3y ago

Any string that matches access token regexp provided by Tencent (see https://docs.github.com/en/developers/overview/secret-scanni...).

Pathogen-David3y ago

For public repositories only though. For private repos it's optional, and when enabled the repo admins get an alert to handle it themselves without it going to the vendor.

plugin-baby3y ago

.*

;-)

ilyt3y ago

So it is just one bad regexp away from sending them other companies' secrets

kredd3y ago

You can already do the former by using GitHub Events API. This simply helps with the accidental leak of tokens into the public, so Tencent / Repo owner can revoke it before it gets abused. https://docs.github.com/en/rest/activity/events?apiVersion=2...

luc_3y ago

They had to have titled it like this on purpose. I almost spat out my tea.

gbtw3y ago

What does a WeChat token look like? As in, can I scan my repo to check that I'm not leaking anything unwanted to WeChat?

That said, could one also generate tokens and essentially DDOS the wechat org by having them inform their customers unnecessarily?

galuggus3y ago

Isn't this information already public?

pixl973y ago

Which information?

Your WeChat tokens? No, those should never be public, which is why this feature exists.

That GitHub reports that you leaked your WeChat tokens? It was announced just recently, hence the post.

That GitHub is giving WeChat your secrets? No, that is not what this is about, although the article title would make you think that.

hunter2_3y ago

GitHub is indeed giving WeChat our information, but only when it looks just like WeChat secrets, and only once it's already publicly leaked (any leaked via private repo instead goes to the repo admin).

So technically the answer to GP is 'yes'.

b4je7d7wb3y ago

I don't think you can claim an API key is your information. It is quite by definition information created by WeChat and Github is sharing only that with them a few minutes before it shares it with bad actors.

nintendo18893y ago

They should scan for Bitcoin seeds too.

Alifatisk3y ago

https://github.com/eth0izzle/shhgit

olksdhdkdbdj3y ago

1. If their regex matches my company token, will it be sent to them? 2. Can WeChat update the token regex to collect tokens from a competitor company? 3. Can Tencent collect information about applications that use WeChat?

Traubenfuchs3y ago

Why is everyone upset? This is a good thing.

Where are you seeing a privacy or security risk?

pontilanda3y ago

It’s a combination of missing hyphens (it should be “secret-scanning partners” to avoid adjective ambiguity) and people’s inability to open links and read anything past the title. Sprinkle a bit of Sinophobia and we’re golden.

jhugo3y ago

Tencent provides a list of regexps, and anything matching those regexps is passed to them. As far as I can tell, we don't get to know what those regexps are (and presumably they can be changed at Tencent's whim). Can you not see the issue?

Traubenfuchs3y ago

We do not know if Tencent can use arbitrary regex to find, I don't know, anti-Chinese sentiment content or just preapproved ones like "tencentToken=([a-zA-Z\d]{15})". Also, it's just for public repositories!

In any case, this announcement changes nothing. If you trusted GitHub with something before that you wouldn't trust them with now, your mental model is wrong. GitHub might allow any kind of partner (customer?) to scan their private or public repos in any way they want without making it public. In other words, if you are someone this announcement is problematic to, you shouldn't have anything on GitHub in the first place.

pontilanda3y ago

I cannot see the issue because the regex are pre-approved by GitHub. And even then, the service will only return the string, not who wrote it. Unless GitHub approves /John Doe said:.*/ there is no issue whatsoever.

MonkeyClub3y ago

> I cannot see the issue because the regex are pre-approved by GitHub.

GitHub is a private company with one dual obligation, to prolong its existence and keep increasing its profit margin.

It is not any sort of arbiter for morality - morality being an externality to its central obligation - so it cannot be relied upon to “do the right thing”.

So it is not in any position of authority that would enable it to “approve”, in the moral sense of the word. They can only “allow” for the regex to be run and the results sent off.

For example, the “right thing” for GH would be to increase profit, while for another entity might instead be to uphold its users’ privacy.

(You may think that it’s only for public repos, so they’re already made public, but isn’t GH here facilitating an aggressive collection and summation of information, that would otherwise be much more difficult and error-prone?)

The power of approval would rather come from an elected entity that would also determine who may request that such searches are executed, and which reasons would be valid.

Otherwise, we get a William Gibson-esque megacorp cyberspace future with clear but corporate Orwellian overtones.

Isn’t this obvious?

(I’m not being snarky at all - I’m genuinely asking: isn’t this glaringly and terrifyingly obvious?)

jhugo3y ago

I guess I just have a lot less faith in the ability of companies to design perfect processes, and the ability of humans to perfectly carry them out, than you do.

ilyt3y ago

Yeah because it is oh-so-easy to ensure your regexp matches only your company's tokens and not 10000 other companies' tokens /s

wkat42423y ago

> Where are you seeing a privacy or security risk?

Well...

> Tencent

Here.

It's really the combination Tencent and Partnership that I find a problem. These things tend to lead to closer collaboration and WeChat is a huge surveillance tool.

Sure they have access to public info anyway because everyone does. Just let them scan it themselves then.

And yes I'd feel almost the same if it was Facebook.

whoevercares3y ago

It’s absolutely shocking to observe how hostile HN is to Chinese affairs, while in real life many must have collaborated A LOT with Chinese engineers & managers. Are you worried about bias bleeding into real life? I’m indeed worried, as a Chinese immigrant working in tech.

b4je7d7wb3y ago

It's mostly because of the government. Don't take it personally. I have quite strong opinions of China, but I don't let it influence my relationships with my Chinese colleagues.

masterof03y ago

It is simply xenophobia, as it is with Russians or Russia-related issues. This is how the internet works in the Western world. When you point it out, they simply downvote you; if your account is new, they will claim it is a bot or that you are an agent/collaborator, and so on.

munhitsu3y ago

Just make sure your secret doesn’t look like a WeChat secret

okokwhatever3y ago

Is this a joke? I'm not laughing.

Arnt3y ago

No joke. «GitHub will forward access tokens found in public repositories to Tencent WeChat, who will notify affected users.» In other words, Tencent now has access to all of your public repositories.

Also, Github now has code to recognise Tencent access tokens.

jkaplowitz3y ago

> In other words, Tencent now has access to all of your public repositories.

They already did. That's what public means. This is just an optimization to make it harder for WeChat access tokens to be inadvertently compromised without getting noticed.

If you're worried about the Chinese government having inappropriate influence over or access to various things outside China, that's in general a valid concern indeed, but facilitating credential scanning in public repositories really doesn't seem worrying.

Arnt3y ago

I'm shocked by the number of respondents who felt the need to point out what public means.


andrewaylett3y ago

Tencent already had access to all your public repositories? They're public.

blitzar3y ago

That's one thing, but it's not like they have access to my public instagram photos, tweets or anything like that (/s?)

_jplc3y ago

> We have partnered with Tencent WeChat to scan for *THEIR* tokens


lopkeny12ko3y ago

Just a reminder that Git is a decentralized protocol and Github is merely a (poor) implementation of it. Microsoft-Github have been increasingly introducing antifeatures, just one of which is sending repository contents to China automatically.

For the last few years I've been running Git off my own servers with a cgit [0] frontend, and couldn't be happier.

[0] https://git.zx2c4.com/cgit/about/

bswinnerton3y ago

Repository contents aren’t “sent” to China. Companies like Tencent specify the shape of tokens to GitHub, and GitHub does the scanning and then notifies Tencent to revoke the token if one is found.

Alifatisk3y ago

How is GitHub a poor implementation of Git? Because it’s centralized?

boomboomsubban3y ago

I believe it's roughly a quote from Linus Torvalds, the creator of git, who has many issues with GitHub's decisions. See https://www.wired.com/2012/05/torvalds-github/ for a start; his opinion hasn't improved over the decade.

rightbyte3y ago

His objections seem to be nitpicking, and he says it is fine for hosting?




nottorp3y ago

Actually there's an even better term below in the thread. Use just "for WeChat credentials". No mention of secret scanning.

justaka3y ago

This is 100x better than the original!

iampivot3y ago

Like .* ?

jasode3y ago

>Like .* ?

Assuming your question is not a joke...

The partner has to email the regex to secret-scanning@github.com for their approval. See the steps at: https://docs.github.com/en/developers/overview/secret-scanni...

Once it's in the scanning system, the partner receives JSON alert messages such as:

  [
    {
      "token":"NMIfyYncKcRALEXAMPLE",
      "type":"mycompany_api_token",
      "url":"https://github.com/octocat/Hello-World/blob/12345600b9cbe38a219f39a9941c9319b600c002/foo/bar.txt",
      "source":"content"
    } 
  ]

The above scenario requires believing the following:

- Microsoft/Github is technically incompetent and an employee and/or their internal regex sanity checking tool will blindly accept open-ended regex like '.*'
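To make the alert format quoted above concrete, here is a minimal Python sketch of how a partner might read such a payload and pick out the tokens to revoke. The payload is the sample from this comment; the function name `tokens_to_revoke` is hypothetical, and a real partner endpoint would also need to authenticate the request, which is not shown here.

```python
import json

# Sample payload in the alert format quoted in the comment above.
SAMPLE_ALERT = """
[
  {
    "token": "NMIfyYncKcRALEXAMPLE",
    "type": "mycompany_api_token",
    "url": "https://github.com/octocat/Hello-World/blob/12345600b9cbe38a219f39a9941c9319b600c002/foo/bar.txt",
    "source": "content"
  }
]
"""


def tokens_to_revoke(payload, expected_type):
    """Return (token, url) pairs whose "type" equals the partner's own token type.

    Hypothetical helper: entries of any other type are ignored.
    """
    alerts = json.loads(payload)
    return [(a["token"], a["url"]) for a in alerts if a.get("type") == expected_type]


if __name__ == "__main__":
    for token, url in tokens_to_revoke(SAMPLE_ALERT, "mycompany_api_token"):
        print(f"revoke {token} (found at {url})")
```

Note that the partner only sees the matched string and where it was found, not the rest of the repository.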


tonmoy3y ago

I don’t think GitHub will send back the matching string, just the name of the repos


barbariangrunge3y ago

jopsen3y ago

> Or is that nonsense?

Yes, that is nonsense.

1) secret scanning can be disabled (not even sure it's enabled by default). 2) the regexes are fairly specific, length limited, etc. 3) github is obviously reviewing regexes that are accepted.

Check the list of stuff supported: https://docs.github.com/en/code-security/secret-scanning/sec...

A bit sad, they don't publish the list of regexes, etc.

--------------

I added a similar thing to the package manager for Dart / Flutter, because we saw users accidentally publishing secrets. That code is public, it relies on regexes and entropy estimation:

https://github.com/dart-lang/pub/blob/eb8ee21a089ebe0f2c2dd8...

It was heavily inspired by the researchers in: https://www.ndss-symposium.org/wp-content/uploads/2019/02/nd...

Worth a read, and certainly provides motivation for Github to do this kind of work :D

(disclosure: I work for Google. The opinions stated here are my own)
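For readers wondering what "regexes and entropy estimation" can look like in practice, here is a rough Python sketch. It is not the Dart pub implementation linked above; the candidate pattern, the 4.0 bits-per-character threshold, and the function names are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Assumed "token-like" pattern: 20+ characters from a base64-ish alphabet.
CANDIDATE = re.compile(r"[A-Za-z0-9+/_=-]{20,}")


def shannon_entropy(s):
    """Shannon entropy of the string, in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def likely_secrets(text, threshold=4.0):
    """Return token-like substrings whose entropy is at or above the threshold."""
    return [m for m in CANDIDATE.findall(text) if shannon_entropy(m) >= threshold]


if __name__ == "__main__":
    print(likely_secrets('token = "q8Zr2LmX0vT5bN7cKd1YhW3s"'))
    print(likely_secrets('padding = "aaaaaaaaaaaaaaaaaaaaaaaa"'))
```

The entropy filter is what keeps long but repetitive strings (padding, separators, repeated characters) from being flagged just because they match the regex.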


voakbasda3y ago

I had the same reaction. This seems like the plan to scan pictures on iPhones for CSAM; it would not be hard to add extra patterns that match materials beyond the original intent.

Are the secret patterns all publicly available? Or are the secret scanning patterns themselves secret? Without public review, we cannot know what secrets they will obtain.

I for one do not trust GitHub/Microsoft to act in the interest of the average user. Their past actions disqualify them from receiving any benefit of the doubt.

rob743y ago

Yeah, you have to read the article to realize that "secret" is a noun in this case, not an adjective...

stavros3y ago

Seems unbelievable how they'd fumble the title when they could just as easily have called it "secrets scanning" and it would be OK.

Waterluvian3y ago

Or maybe “secret detection”

Though perhaps that’s just my own bias on the subtle differences in the meanings of those words.

crazyedgar3y ago

Maybe it's intentional, to generate traffic.

Cthulhu_3y ago

pcthrowaway3y ago

Credential scanning.

It's not scanning that they're doing in secret. Credential scanning removes the ambiguity

civopsec3y ago

Scanning repos for secrets has been a thing for a while now. But seeing Tencent might put people on edge.

atonse3y ago

Secret scanning is a thing.

That’s what’s novel here.


szundi3y ago

China doesn’t even have to scan now, Github is going to sort it out and send it all for us. Sounds bad.

toastal3y ago

There's never been a better time to migrate your projects away from corporate control

gumboza3y ago

Actually the best time expired several years ago. Also prevention is better than cure.


netsharc3y ago

Hmm, 2022 (2023 soon!) and people are jumping and screaming on their first and incorrect reaction to some headline (not you, but scroll down for a comment doing just that). God Bless The Internet! /s

trompetenaccoun3y ago

And to the folks saying it's public information and they already have it: That makes no sense, then they don't need GitHub's help. Obviously GitHub is supporting their scanning efforts here.

oefrha3y ago

> And to the folks saying it's public information and they already have it: That makes no sense, then they don't need GitHub's help.

Edit: Additionally, for private repos, secret scanning is opt-in and only alerts owners.

[1] https://docs.github.com/en/rest/activity/events?apiVersion=2...

anaganisk3y ago

trompetenaccoun3y ago

Again, people are saying it's no big deal but why would GitHub help them at all? It's not a good cause.

anaganisk3y ago

snehk3y ago

> There isn't a country in the world which does this.

My government requires me to have ID, which contains a photo and fingerprints, and you cannot get a SIM without ID. That's Germany and it's true for many, many countries.

pontilanda3y ago

> There isn't a country in the world which does this

Does what? The only extra thing is the fingerprint, but literally every modern country requires ID registration and more. My government also knows this IP belongs exactly to me. Stop spouting nonsense.

Plus this is completely unrelated.

jinzo3y ago

aeyes3y ago

GitHub is available in China, why shouldn't they protect their Chinese users?

barbariangrunge3y ago

Why do they notify tencent instead of the repo owner?

BigGreenTurtle3y ago

vxNsr3y ago

This is my question too… why not just let the owner of the repo know, why notify Tencent at all?


sulam3y ago

Without taking away from your first paragraph at all, if any dissidents are publishing their access codes to GitHub repos, they are 1) doing it completely wrong and 2) are already screwed.

The threat here, in the worst case, is associating a GitHub ID with a WeChat ID.

vxNsr3y ago

Quoted from the blog post:

> We have partnered with Tencent WeChat to scan for their tokens and help secure our mutual users on all public repositories and private repositories with GitHub Advanced Security.

This is GitHub scanning private repos and telling WeChat about them.

WeChat can already scan public repos.

They are not already screwed if they’re publishing something to a private repo, it might be the wrong way to do it, but it doesn’t mean they’re already screwed.

If you don’t trust GitHub’s private repo security then why are you using it in the first place?

aeyes3y ago

Wrong, this only applies to public repos.

https://docs.github.com/en/code-security/secret-scanning/abo...


civopsec3y ago

This is about preventing things like API keys from being published to code. That’s not a dissident use-case…

quadrifoliate3y ago

----------------------------------------

[1] https://www.businessinsider.com/twitter-plans-to-force-users...

[2] https://developers.facebook.com/blog/post/2021/11/09/meta-jo...

derefr3y ago

> These "leaked" secrets GitHub forwards might be dissidents getting access without being tracked.

> That makes no sense, then they don't need GitHub's help.

Ignoring for a moment that Github is a website, and so anyone can just crawl it—

andrewaylett3y ago

This is part of https://docs.github.com/en/developers/overview/secret-scanni...

It lets WeChat revoke tokens that GitHub finds in public repositories.

mmaunder3y ago

It lets WeChat see tokens that GitHub forwards to them. What they do with it is up to them, but the intent is that they mitigate the issue.

“GitHub will forward access tokens found in public repositories to Tencent WeChat, who will notify affected users.”

vxNsr3y ago

Why did you edit the full quote?

Here’s what I just copied from the blog post without modification:

> We have partnered with Tencent WeChat to scan for their tokens and help secure our mutual users on all public repositories and private repositories with GitHub Advanced Security.

It’s not just public repos, it’s private repos too.


2553y ago

Optics of this article could be improved.

civopsec3y ago

Why could the optics be improved?

redleader553y ago

boredhedgehog3y ago

Why do people worry about China so much? There is barely any cooperation between Chinese intelligence and the rest of the world.

mylidlpony3y ago

> because there's nothing they can do about it

Are you talking about the China that bought huge areas in ports around the world? The same one that has secret police stations as well?


Alifatisk3y ago

> If I was forced to pick one government to share my secrets with, it would be the Chinese, because there's nothing they can do about it.

Unless you live there or visit. Weren’t there reports of China having concentration camps?

ilyt3y ago

Obviously if someone would pick China it was because they were not planning to visit there

richardw3y ago

They wouldn’t even need to learn all that much about you as an individual. Just enough to match you with a cluster from their own population that they have infinite data on.


didericis3y ago

> If I was forced to pick one government to share my secrets with, it would be the Chinese, because there’s nothing they can do about it.

What makes you so sure about that?

I worry about China because there’s no internal checks to prevent them from doing anything.

Any powerful totalitarian country should worry people. People underestimate the level of covert aggression in all facets of foreign involvement in regimes with no internal accountability.


dwighttk3y ago

> because there's nothing they can do about it

What makes you think this?

cscurmudgeon3y ago

https://amp.cnn.com/cnn/2022/12/04/world/china-overseas-poli...

Yep totally harmless.

> China operating over 100 police stations across the world with the help of some host nations, report claims

londons_explore3y ago

dncornholio3y ago

Because American propaganda.

eynsham3y ago

Some people are Chinese.

talkingtab3y ago


tomudding3y ago

There is a list of partners [0], I thought I had seen regexes at some point but I can no longer find them.

[0]: https://docs.github.com/en/enterprise-cloud@latest/code-secu...

ffpip3y ago

civopsec3y ago

Is this blog post not the transparency part?

I assume that the regex is `TC:[a-z0-9]{20}` or something uninteresting like that.
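As a sketch of what scanning with a pattern like that involves, here is a short Python example. The pattern is only the guess from this comment, not WeChat's real token format (which is not published in this thread), and `find_candidates` is a hypothetical helper name.

```python
import re

# The guessed pattern from the comment above; the real WeChat format is unknown here.
GUESSED_PATTERN = re.compile(r"TC:[a-z0-9]{20}")


def find_candidates(text):
    """Return (line_number, match) pairs for every substring matching the guessed pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in GUESSED_PATTERN.finditer(line):
            hits.append((lineno, m.group(0)))
    return hits


if __name__ == "__main__":
    sample = "a = 1\nkey = 'TC:abcdefghij0123456789'\n"
    for lineno, token in find_candidates(sample):
        print(f"line {lineno}: {token}")
```

A pattern with a fixed prefix and a fixed length like this one only ever forwards strings of that exact shape, which is the property a reviewer would check for before accepting it.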

SalimoS3y ago

I guess because some lazy people will use those regexes on all other public git hosts and search engines

nomercy4003y ago

Wait, what?

So any string (which Github deems an access token) is forwarded to Tencent?

Or will Tencent share all their current access tokens with github?

AlbertVAustin3y ago

Any string that matches access token regexp provided by Tencent (see https://docs.github.com/en/developers/overview/secret-scanni...).

Pathogen-David3y ago

For public repositories only though. For private repos it's optional, and when enabled the repo admins get an alert to handle it themselves without it going to the vendor.

plugin-baby3y ago

.*

;-)

ilyt3y ago

So it is just one bad regexp away from sending them other companies' secrets


kredd3y ago

luc_3y ago

They had to have titled it like this on purpose. I almost spat out my tea.

gbtw3y ago

What does a WeChat token look like? As in, can I scan my repo to check that I'm not leaking anything unwanted to WeChat?

That said, could one also generate tokens and essentially DDOS the wechat org by having them inform their customers unnecessarily?

galuggus3y ago

Isn't this information already public?

pixl973y ago

Which information?

Your WeChat tokens? No, those should never be public, hence why this feature exists.

That GitHub reports that you leaked your WeChat tokens? It was announced just recently, hence the post.

That GitHub is giving WeChat your secrets? No, that is not what this is about, although the article title would make you think that.

hunter2_3y ago

So technically the answer to GP is 'yes'.

b4je7d7wb3y ago


nintendo18893y ago

They should scan for Bitcoin seeds too.

Alifatisk3y ago

https://github.com/eth0izzle/shhgit

olksdhdkdbdj3y ago

Traubenfuchs3y ago

Why is everyone upset? This is a good thing.

Where are you seeing a privacy or security risk?

pontilanda3y ago

jhugo3y ago

Traubenfuchs3y ago

pontilanda3y ago
