Removing PGP from PyPI

alerighi3y ago

> Notably, PGP is incapable of providing either of these: you only get key IDs, which are neither strong human identities nor a strong binding to a service. Key IDs might correspond to keys with email (or other identities) in them, but that's (1) not guaranteed, and (2) not a strong proof of identity (since anybody can claim any identity in a PGP key).

Depends. If the distributor maintains a repository of trusted public keys (as Linux distribution repositories do, for example), it gives you a guarantee. As has been said, most of the time you just want to know that the key used to sign a package has not changed. That is the same level of security that SSH offers (the first time you connect to a server it saves the public key, then gives an error if that public key changes). That is really enough for a package on PyPI, or for signing git commits and similar.
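The SSH-style check described above (save the key on first contact, error out if it later changes) can be sketched roughly as follows. This is a minimal illustration, not anything PyPI or pip implements: the JSON pin file, the `check_pinned_key` function and the `KeyChangedError` exception are all hypothetical names, and the fingerprint is treated as an opaque string.

```python
import json
from pathlib import Path


class KeyChangedError(Exception):
    """Raised when a package's signing key differs from the pinned one."""


def check_pinned_key(store_path: Path, package: str, fingerprint: str) -> str:
    """Trust-on-first-use check, analogous to SSH's known_hosts.

    Returns "pinned" the first time a package is seen and "match" when the
    fingerprint equals the stored one; raises KeyChangedError otherwise.
    """
    # Hypothetical on-disk format: {"package-name": "fingerprint", ...}
    pins = json.loads(store_path.read_text()) if store_path.exists() else {}
    known = pins.get(package)
    if known is None:
        # First use: remember the key without any further verification.
        pins[package] = fingerprint
        store_path.write_text(json.dumps(pins, indent=2, sort_keys=True))
        return "pinned"
    if known != fingerprint:
        raise KeyChangedError(f"{package}: pinned {known}, got {fingerprint}")
    return "match"
```

Note that this only detects a change; it says nothing about whether the first key seen was the right one, and it gives no answer for legitimate key loss or rotation.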

We should ask ourselves whether the complexity of PGP is needed. Probably not, just as the complexity of X.509 certificates is not needed, since a simple RSA signature of the package with a public key hosted on a server would be sufficient. But PGP is practical: it has good tooling built around it and is pretty universal, so why not?

hannob3y ago

What happens if the developer loses his key? Or if it expires?

PyPI could show a warning that the key has changed, which is not an actionable or helpful warning. Then everyone gets used to seeing these warnings every now and then, and you have gained nothing.

Getting signatures to do something useful is hard.

bombolo3y ago

> What happens if the developer loses his key? Or if it expires?

What happens if a developer loses the Google Titan key that is required to log in to PyPI?

hannob3y ago

They either have their backup codes, or there's probably a manual process by which the PyPI team can get them their account back if they can sufficiently show they are the real developers. If you have any form of automated signature verification, you basically need a concept for how to handle recovery. But if this concept comes down to "trust PyPI", then you really can just skip the whole thing and rely on PyPI giving you the right packages and HTTPS to secure the connection.

still_grokking3y ago

Did I just witness the invention of a kind of "software package blockchain"?

It would be, btw, a proper but sustainable proof-of-work blockchain, as in most cases you would need to pay developers to "mint new blocks".

OK, maybe let's forget about the blockchain. It's a loaded term. But the idea of software signature TOFU sounds indeed good!

chaxor3y ago

Anyone that combats something based on the name alone isn't really worth listening to. There can be great use cases for Blockchain just like this, wherein the proof of work is less taxing, or optional. Of course, HN has a rabid response towards the term alone, but these technologies actually can provide some great solutions to a more robust form of git lfs, dockerhub, or huggingface model centralization that will inevitably fail at some point.

eduction3y ago· 13 in thread

So they examined everything uploaded to PyPi with a signature over three years, including old versions, and classify those packages whose signing key is expired today, possibly years later, as "impossible to meaningfully verify." Never mind that the package may have been verifiable with a valid key for a full year or two before the key expired, and in the meantime may have been superseded by a newer version.
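The distinction being drawn here, between a key that was valid when the release was signed and a key that is still valid on the day of the audit, can be made concrete with a small sketch. The key and release dates below are hypothetical; only the audit date (2023-05-19) comes from the post quoted elsewhere in this thread.

```python
from datetime import date
from typing import Optional


def key_valid_on(created: date, expires: Optional[date], when: date) -> bool:
    """True if `when` falls inside the key's validity window."""
    if when < created:
        return False
    return expires is None or when < expires


# Hypothetical key and release dates, for illustration only.
key_created = date(2020, 1, 1)
key_expires = date(2022, 1, 1)
signed_on = date(2021, 6, 1)
audit_day = date(2023, 5, 19)  # audit date given in the quoted post

valid_when_signed = key_valid_on(key_created, key_expires, signed_on)  # True
valid_at_audit = key_valid_on(key_created, key_expires, audit_day)     # False
```

One caveat for either reading: the signature creation time is asserted by the signer, so "valid when signed" is only as trustworthy as that timestamp.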

They also say they can't "meaningfully verify" packages if the key does not have "binding identity information," by which they presumably mean automatically verifiable binding identity information, which usually means someone verified an email from keys.openpgp.org. This is a really narrow way to establish "binding identity information." For example, someone who is a PyPi author and publicly links their PGP key from a (https) website on the same domain as the email on the key would not count. A well known longtime PyPi author with a well known key would not count.

The ad hoc, out of band nature of how PGP keys are trusted is not remotely new - PyPi would have known from the very start of adopting PGP that many keys would not be automatically verifiable. It makes little sense to turn around now and act like this is some surprising thing.

This has the smell of "we didn't want to bother supporting PGP any more because it's hard so we came up with an excuse."

No need for an excuse, though: Just be honest about it and let the chips fall where they may, if you really don't want to support PGP. God knows there are valid reasons for not having the energy to deal with PGP. (FWIW I think it's a good solution for packages, for those who can navigate the tooling, but on the other hand I'm not volunteering my time to run PyPi.)

P.S. There is a link in their post saying PGP has "documented issues." The specific issue described in the linked document is "packaging signing is not the holy grail" and a list of known things about PGP, like that verification of keys is ad hoc. It also concludes that there is no known better alternative.

woodruffw3y ago

> The ad hoc, out of band nature of how PGP keys are trusted is not remotely new - PyPi would have known from the very start of adopting PGP that many keys would not be automatically verifiable. It makes little sense to turn around now and act like this is some surprising thing.

This is revisionist: in 2005, PGP was approachingly modern and represented an acceptable tradeoff between usability, legal and patent constraints, and arms laws. It was also accompanied by a network of synchronizing keyservers and a "strong set" within the Web of Trust that, in principle, gave you transitive authenticity for artifacts. That never really worked as expected, but it's all code and infrastructure that was actually running in 2005, when PyPI chose to allow PGP signatures.

None of that is the case in 2023: PGP is 20 years behind cryptographic best practices, and has 30 years of unresolved technical debt. There is no web of trust, and the synchronizing keyserver network has been broken for years.

The argument for PGP in 2005 was that it was, to a first approximation, the best that could be done. The argument against PGP in 2023 is that, to a first approximation, it's worse than useless by virtue of providing false security guarantees.

eduction3y ago

There wasn't any sign or promise in 2005 (that I'm aware of, and I had a key back then) that PGP would soon get meaningfully better, for example that it would suddenly, magically solve the issue of verifying keys in a decentralized system. It was a pain in the arse then as now. There has been some improvement in the meantime, in that keys.openpgp.org started unilaterally doing its own email verification.

And you say that it's been 18 years and PGP is behind best practices, but you don't describe how those best practices would better solve the package verification challenge that PyPi faces. So in the absence of an actual alternative system why not keep using PGP? Perfect is the enemy of good (IMO!).

But again, I'm not even really feeling strongly that PyPi should use PGP or not. I mostly posted just to say that they should be honest about why they are leaving it, and these seem like bad/misleading stats that for some reason they are hiding behind instead of coming out and saying they changed their mind about PGP (or new people are now running things and don't like dealing with PGP - many people would sympathize).

woodruffw3y ago

> And you say that it's been 18 years and PGP is behind best practices, but you don't describe how those best practices would better solve the package verification challenge that PyPi faces. So in the absence of an actual alternative system why not keep using PGP? Perfect is the enemy of good (IMO!).

This may or may not be satisfying to you, but there is discussion around this, both in this thread, other threads on the internet, and PyPI's own issue tracker. The current plan is to integrate Sigstore[1] into PyPI as a more complete and modern codesigning solution. That work is progressing, and is not in a state that's meant to "replace" PGP. But that's intentional because, as the post states, nobody (to a first approximation) was using these PGP signatures anyways.

Perfect is indeed the enemy of the good; the other enemy of the good is bad things. PGP is bad; the reason I titled the original post "worse than useless" is because it takes a useless security feature (signatures that nobody verifies) and makes them actively dangerous by providing cryptographic margins that weren't even safe 25 years ago.

> But again, I'm not even really feeling strongly that PyPi should use PGP or not. I mostly posted just to say that they should be honest about why they are leaving it, and these seem like bad/misleading stats that for some reason they are hiding behind

Two things should be separated here: there's the PyPI blog post, which is written by the PyPI admins, and there's the "worse than useless" blog post, which was written by me. I am not an admin of PyPI, and it's my independent technical opinion that PGP is bad. I stand by the stats that I've included in my own post, but I do welcome specific critiques of how they're bad or misleading.

The PyPI admins can provide their own rationale, but this is my best understanding: they have known for years that PGP is bad, and have more or less tolerated it as a legacy feature because removing it was a low priority. The post I wrote two days ago was just a "final nudge" towards removing it, since the post's statistics (particularly large numbers of expired keys) refute one of the last defenses for PGP on PyPI.

upofadown3y ago

>PGP is 20 years behind cryptographic best practices...

In what sense? If someone signs a package with, say, a RSA key, how is that behind in some way?

>30 years of unresolved technical debt.

How can a standard for a file/message format have technical debt. PGP is dead simple. Where is this debt hidden?

woodruffw3y ago

> In what sense? If someone signs a package with, say, a RSA key, how is that behind in some way?

OpenPGP specifies PKCS#1 v1.5 for RSA padding. Attacks on PKCS#1 v1.5 have been well understood for over 20 years[1]; every few years, someone finds a new one.

RSA itself is well-known for having weird number-theoretic problems that implementations have failed to respect, to catastrophic effects. Best practice for algorithm selection is to pick algorithms where users can't compromise the integrity of the scheme through poor public parameter selection; RSA forces the user to pick a public modulus and exponent, leading to all kinds of silly things that actually happen[2].
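As a toy illustration of how a poorly chosen public exponent can undermine the scheme, here is a sketch using unpadded ("textbook") RSA with a tiny modulus. The numbers and function names are made up for the example, and it is not meant to reproduce the specific incident behind link [2].

```python
def textbook_rsa_encrypt(m: int, e: int, n: int) -> int:
    """Unpadded ("textbook") RSA: c = m^e mod n."""
    return pow(m, e, n)


def integer_cube_root(x: int) -> int:
    """Largest r with r**3 <= x, found by binary search."""
    lo, hi = 0, 1
    while hi ** 3 <= x:
        hi *= 2
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo


n = 101 * 113  # toy modulus; real moduli are thousands of bits long
m = 20         # toy message

# e = 1: the "ciphertext" is the message itself.
c1 = textbook_rsa_encrypt(m, 1, n)

# e = 3 with no padding and m**3 < n: no modular reduction happens,
# so an ordinary cube root recovers the message without the private key.
c3 = textbook_rsa_encrypt(m, 3, n)
recovered = integer_cube_root(c3)
```

Schemes with fixed, vetted parameters avoid this class of mistake because there is nothing comparable for the user to pick.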

Edit: Correcting myself: most attacks on v1.5 padding concern encryption, not signatures. The general fragility argument remains, however.

[1]: https://en.wikipedia.org/wiki/PKCS_1#Attacks

[2]: https://news.ycombinator.com/item?id=5993959

https://wiki.debian.org/Teams/Apt/Spec/AptSign

zamnos3y ago

PGP's extremely poor UX would suggest there's code that doesn't exist that should.

tptacek3y ago

There are much, much better solutions for packaging!

5e92cb50239222b3y ago

Like signify (developed and adopted by OpenBSD) and minisign.

AFAIK Debian has been working on abandoning GPG in favor of something very similar to those two. Not sure when it's going to be shipped, though.

eduction3y ago

Which you do not describe, but set that aside: A post that honestly said "we do not like PGP, but here is our alternate plan" would be great. On an actual better solution, I don't think anyone has proposed a good one. Here is the closest I've seen from PyPi (or at least linked from this post as describing their thinking), from 10 years ago:

"Everything is Terrible So What Do We Do?

Bluntly put, I don’t know for sure. This isn’t an already solved problem nor is it an easy to solve one."

https://caremad.io/posts/2013/07/packaging-signing-not-holy-...

What I'll say on PGP is the perfect is the enemy of the good. It's not a tech anyone has much fun using, but in a group setting, used regularly, I have found it can fade into the background at least. I don't want to go any further down the "is PGP good or bad" rabbit hole than that.

But if you have a better solution for package security, please do describe it here.

donaldstufft3y ago

The current documented plans revolve around TUF (https://peps.python.org/pep-0458/, https://peps.python.org/pep-0480/). Those links have probably bit rotted a bit by now, progress has been slow on implementing them for a number of reasons (mostly OSS reasons, volunteers etc).

There's also a general consensus (not documented) that sigstore will play some kind of role here. Possibly in-toto as well?

In the 10 years since my post that you referenced, we've laid some decent plans I believe, and have just slowly been working on them, to the extent that we've been able to given our own time constraints.

tptacek3y ago

It's not really up to you or me, it's up to PyPI. For my part: their logic seems pretty sound.

c0l03y ago

And which of those are in use/available at the Python Package Index?

tptacek3y ago

I don't know, but PGP isn't either, so the comparison holds.

rwmj3y ago· 12 in thread

So they're removing PGP signatures, which certainly have some issues, and replacing them with ... nothing?

sowbug3y ago

The research article cited in the announcement is titled "PGP signatures on PyPI: worse than useless."

That's the issue. Pretending there is a security solution in place is worse than being upfront that there is none. If you look down and notice that your seatbelt is actually made out of angel hair pasta, you might drive more carefully. Hopefully you'll also get a better car.

rwmj3y ago

But they're not "worse than useless", that article was wrong. PGP/GPG are without doubt problematic, they have weak points (like use of SHA-1, some keys that could not be located, and terrible UI) but they are not worse than having no traceability of the package at all between the author and PyPI.

sowbug3y ago

The system guaranteed that a key signed a package. That was its entire utility.

At best, it defeated plausible deniability for package maintainers who had avowed public keys, but then somehow signed a bad package. This wouldn't have stopped the malware from getting onto your system. It only would have led you to the hapless (but honest) package maintainer.

It didn't stop someone who is not you from generating a PGP key for Richard WM Jones, signing malware, uploading to PyPi, and then disappearing back under the rock where they live. And if you believe this system is not useless, then you also believe that at least one person out there was not dissuaded from installing that malware because "Hey, someone named Richard WM Jones went through the trouble of signing it!"

As is often the case, the value of this system depends on your threat model. I'm not too worried about someone going rogue from the tiny population of people who were using PGP correctly. But I am worried about using a platform that claimed to have signing infrastructure, when that infrastructure had no meaningful checks on who was signing.

woodruffw3y ago

> they are not worse than having no traceability of the package at all between the author and PyPI.

Except that they are: PGP does not give you this kind of identity relationship. The most it can give you is an association to a key ID, which is (1) brute-forceable, and (2) not strongly bound to any actual user or machine identity.

The only thing worse than an unsecured scheme is an insecure scheme that lulls users into a false sense of security and authenticity. PGP signatures on PyPI are the latter.
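To make the "brute-forceable" point concrete: for OpenPGP v4 keys, the 64-bit key ID is simply the low-order 64 bits of the 160-bit fingerprint, and the short 32-bit form is the low-order 32 bits. A minimal sketch, using a made-up fingerprint and a hypothetical helper name:

```python
def key_ids_from_fingerprint(fingerprint: str) -> tuple[str, str]:
    """Return (long_id, short_id) for a 40-hex-digit OpenPGP v4 fingerprint."""
    fp = fingerprint.replace(" ", "").upper()
    if len(fp) != 40 or any(ch not in "0123456789ABCDEF" for ch in fp):
        raise ValueError("expected a 160-bit fingerprint as 40 hex digits")
    # The key ID is just a suffix of the fingerprint, not a separate secret.
    return fp[-16:], fp[-8:]


# Made-up fingerprint, for illustration only.
fingerprint = "0123 4567 89AB CDEF 0123 4567 89AB CDEF 0123 4567"
long_id, short_id = key_ids_from_fingerprint(fingerprint)

# Number of distinct values an attacker has to search through.
short_id_space = 2 ** 32
long_id_space = 2 ** 64
```

An attacker who wants a key with a matching ID only has to generate keys until the suffix lines up, which is far cheaper than matching the full fingerprint.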

bandrami3y ago

It's like the larger holy war against self-signed certificates in TLS. They are strictly better than plaintext but there is software that will prefer a plaintext connection to self-signed TLS.

im3w1l3y ago

I think another thing with pgp is that it's in this awkward place where it's bad enough that few people use it, but good enough that it prevents someone from making an alternative.

akerl_3y ago

Nobody's making a PGP alternative because a major part of what makes PGP bad is that it tries to be a generic solution to every problem, when in practice signing and encryption workflows are incredibly domain-specific.

People are continuously creating better tools for domains that historically saw PGP usage. To name a few: Signal for short-form messaging, age for file encryption, signify/minisign for artifact signing.

speedgoose3y ago

The article is closer to a blog post than a multi-author, peer-reviewed research paper published in a high-impact journal.

woodruffw3y ago

That's because it is a blog post. It isn't advertised as anything else.

[1]: https://www.youtube.com/watch?v=jjAq7S49eow&t=1s

justin_oaks3y ago

Inasmuch as PGP signatures are rarely used and even more rarely useful, I don't think it's a problem to remove them and replace them with nothing. If it is a problem, it's been a problem for a long time and it's not really making the situation meaningfully worse to remove them.

That said, if PGP signatures are to be replaced then there's no reason why they can't be removed now and replaced with something later.

bombolo3y ago

Sooner or later they will ask for a photo of a passport… Google's idea, from their security blog.

paulddraper3y ago

The purpose of PGP signatures is to remove the dependency on trusting PyPI, i.e. protecting against PyPI getting hacked.

(Note: PyPI protects against MITM with HTTPS.)

Removing this is predicated on the idea that this is a low-priority threat vector.

westurner3y ago· 7 in thread

Now that you have removed GPG ASC signature upload support, is there any way for publishers to add cryptographic signatures to packages that they upload to pypi? FWIU only "the server signs uploads" part of TUF was ever implemented?

Why do we use GPG ASC signatures instead of just a checksum over the same channel?

woodruffw3y ago

> Why do we use GPG ASC signatures instead of just a checksum over the same channel?

Could you elaborate on what you mean by this? PyPI computes and supplies a digest for every uploaded distribution, so you can already cross-check integrity for any hosted distribution.
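For reference, cross-checking a downloaded file against a published digest needs nothing beyond the standard library. This is a minimal sketch with hypothetical function names; it assumes you already have the expected SHA-256 hex digest from the index.

```python
import hashlib
import hmac


def sha256_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large distributions need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare the file's SHA-256 against an expected hex digest."""
    return hmac.compare_digest(sha256_hex(path), expected_hex.lower())
```

pip can enforce the same kind of check with `pip install --require-hashes -r requirements.txt`. Either way this establishes integrity relative to whoever supplied the digest, not authorship.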

GPG was nominally meant to provide authenticity for distributions, but it never really served this purpose. That's why it's being removed.

westurner3y ago

> Why do we use GPG ASC signatures instead of just a checksum over the same channel?

You can include an md5sum or a sha512sum string next to the URL that the package is downloaded from (for users to optionally check after downloading a package); but if that checksum string is uploaded over the same channel (HTTPS/TLS w/ a CA cert bundle) as the package, the checksum string could have been MITM'd/tampered with, too. A cryptographically-signed checksum can be verified once the pubkey is retrieved over a different channel (GPG: HKP is HTTPS/TLS with cert pinning IIRC), and a MITM would have to spend a lot of money to forge that digital publisher signature.

Twine COULD/SHOULD download uploads to check the PyPI TUF signature, which could/should be shipped as a const in twine?

And then Twine should check publisher signatures against which trusted map of package names to trusted keys?

westurner3y ago

1) the server signs what's uploaded using one or more TUF keys shared to RAM on every pypi upload server.

2) the client uploads a cryptographic signature (made using their own key) along with the package, and the corresponding public key is trusted to upload for that package name, and the client retrieves said public key and verifies the downloaded package's cryptographic signature before installing.

FWIU, 1 (PyPI signs uploads with TUF) was implemented, but 2 (users sign their own packages before uploading the signed package and signature, (and then 1)) was never implemented?

woodruffw3y ago

Your understanding is a little off: we worked on integrating TUF into PyPI for a while, but ran into some scalability/distributability issues with the reference implementation. It's been a few years, but my recollection was that the reference implementation assumed a lot of local filesystem state, which wasn't compatible with Warehouse's deployment (no local state other than tempfiles, everything in object storage).

To the best of my knowledge, the current state of TUF for PyPI is that we performed a trusted setup ceremony for the TUF roots[1], but that no signatures were ever produced from those roots.

For the time being, we're looking at solutions that have less operational overhead: Sigstore[2] is the main one, and it uses TUF under the hood to provide the root of trust.

[2]: https://www.sigstore.dev/

ilyt3y ago

Signature tells you who signed it.

Of course, if you haven't put any effort in system to end-to-end verify whether it's right signature it doesn't matter.

westurner3y ago

pip checks that a given package was signed with the pypi key but does not check for a signature from the publisher. And now there's no way to host any type of cryptographic signatures on pypi.

There is no e2e: pypi signs what's uploaded.

(Noting also that packages don't have to be encrypted in order to have cryptographic signatures; only the signature is encrypted, not the whole package)

ilyt3y ago

Yeah, the whole thing looks like throwing the baby out with the bathwater; the package should

* get a signature from the author ("the actual author published it") plus some metadata with a list of valid signing keys (in case the project has more authors, or just for key rotation)
* get a signature from the hosting provider that confirms "yes, that actual user logged in and uploaded the package"
* (the hardest part) have key management on the client side, so the user has to do the least amount of work possible when downloading/updating a valid package.

If the user doesn't want to go to the effort of validating whether the author's public key is valid, so be it, but at the very least the system should alert on tampering with the provider (checking the hosting signature) or on the author key changing (compromised credentials to the hosting provider).

It still doesn't prevent "the attacker steals key off author's machine" but that is by FAR the rarest case and could be pretty reasonably prevented by just using hardware tokens. Hell, fund them for key contributors.

Arnavion3y ago· 4 in thread

>Of those 1069 unique keys, about 30% of them were not discoverable on major public keyservers, making it difficult or impossible to meaningfully verify those signatures.

I don't know if it applies to any of those 1069 keys, but note that there is a way of hosting PGP keys that does not depend on key servers: WKD https://datatracker.ietf.org/doc/draft-koch-openpgp-webkey-s... . You host the key at a .well-known URI under the domain of the email address. It's a draft as you can see, but I've seen a few people using it (including myself), and GnuPG supports it.
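For the curious, this is roughly how a WKD lookup URL is derived from an email address with the "direct" method, as I read the draft: the lowercased local part is hashed with SHA-1, the digest is encoded as z-base-32, and the result goes under `.well-known/openpgpkey/hu/` on the email's domain. Treat the alphabet constant and URL layout below as my understanding of the draft, and the function names as hypothetical.

```python
import hashlib
from urllib.parse import quote

# z-base-32 alphabet (assumed from the z-base-32 spec referenced by the draft).
ZBASE32 = "ybndrfg8ejkmcpqxot1uwisza345h769"


def zbase32_160(digest: bytes) -> str:
    """Encode a 20-byte digest as 32 z-base-32 characters (5 bits each)."""
    if len(digest) != 20:
        raise ValueError("expected a 160-bit digest")
    value = int.from_bytes(digest, "big")
    return "".join(ZBASE32[(value >> shift) & 0x1F] for shift in range(155, -1, -5))


def wkd_direct_url(email: str) -> str:
    """Build the direct-method WKD URL for an address such as a@example.org."""
    local, _, domain = email.rpartition("@")
    if not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    # Only ASCII letters are lowercased before hashing.
    lowered = "".join(c.lower() if c.isascii() else c for c in local)
    hashed = zbase32_160(hashlib.sha1(lowered.encode("utf-8")).digest())
    return (
        f"https://{domain.lower()}/.well-known/openpgpkey/hu/"
        f"{hashed}?l={quote(local)}"
    )
```

The key point for this thread is the input: the lookup starts from an email address, which a bare detached signature does not carry.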

woodruffw3y ago

This is interesting, but it doesn't really solve the key distribution problem: with well-known hosting you now have a (weak) binding to some DNS path, but you're not given any indication of how to discover that path. It's also not clear that a DNS identity is beneficial to PyPI in particular (since PyPI doesn't associate or namespace packages at all w/r/t DNS).

More generally, these kinds of improvements are not a sufficient reason to retain PGP: even with a sensible identity binding and key distribution, PGP is still a mess under the hood. The security of a codesigning scheme is always the weakest signature and/or key, and PGP's flexibility more or less ensures that that weakest link will always be extremely weak.

Arnavion3y ago

Right. I've never used PyPI, but TFA makes it sound like the existing support for signing is "We allow the uploader to upload a signature, and the downloader can look up the key indicated in the signature to do the verification." Is that correct? If so, then yes there is a key ID involved but no email address, so a generic downloader would have no choice but to look it up from a key server.

woodruffw3y ago

That's correct!

PyPI's support for PGP is very old -- it's hard to get an exact date, but I think it's been around since the very earliest versions of the index (well before it was a storing index like it is now). If I had to guess (speculate wildly), my guess would be that the original implementation was done with a healthy SKS network and strong set in mind -- without those things, PGP's already weak identity primitives are more or less nonexistent with just signatures.

Avamander3y ago

You can't do WKD with just signatures, there are no identities associated with the signature to just look up.

usr11063y ago· 4 in thread

Isn't that throwing out the baby with the bathwater? There seem to be non-negligible risks of installing malware from PyPI according to various headlines recently. But instead of improving security measures that don't work well they just remove them?

donaldstufft3y ago

Removing security features that don't work is a separate concern from making security features that do work. Nobody who has done any serious work on PyPI security in the past 15 years thinks that GPG will play a part in the future of PyPI security. Its support was entirely vestigial, served no practical purpose, and never would.

chatmasta3y ago

Most supply chain attacks rely on dependency confusion or typo-squatting, which PGP signing doesn't solve. An attacker can PGP sign their typosquatted package, and the package manager won't know to alert you because as far as it can tell, you intended to install that package. (This is before even considering whether the packages are signed with strong keys, or users are actually verifying them against any public trust store.) That's one reason supply chain issues are so pernicious - they're more of a human problem than a technical one.
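To illustrate why typosquatting is orthogonal to signing: the check that would actually help is on the name, not on the signature. Below is a rough sketch using the standard library's `difflib`; the `KNOWN_PACKAGES` list is a tiny stand-in, the 0.8 cutoff is an arbitrary assumption, and `possible_typosquat` is a hypothetical helper, not something pip or PyPI provides.

```python
import difflib
from typing import Optional, Sequence

# Stand-in for a real list of popular package names.
KNOWN_PACKAGES = ["requests", "numpy", "urllib3"]


def possible_typosquat(
    name: str,
    known: Sequence[str] = KNOWN_PACKAGES,
    cutoff: float = 0.8,
) -> Optional[str]:
    """Return the well-known name that `name` closely resembles, if any."""
    if name in known:
        return None  # exact match: this is the well-known package itself
    close = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return close[0] if close else None
```

For example, `possible_typosquat("reqeusts")` flags `"requests"`, regardless of whether the lookalike package carries a perfectly valid signature from its (malicious) uploader.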

That said, I do agree with your premise that the limited usefulness of PGP signing doesn't necessitate removing the feature entirely.

masklinn3y ago

> Isn't that throwing out the baby with the bathwater?

That assumes there’s a baby in the bath water.

> But instead of improving security measures that don't work well they just remove them?

Well yes, “security measures” which don’t work are usually worse than nothing.

Brian_K_White3y ago

There are many cases where it's better to know you don't have something correctly than think you have something incorrectly. Security is certainly one.

jpgvm3y ago· 4 in thread

I don't understand how Java can get this right with Maven Central and co but newer languages can't.

Having a slight barrier to entry, which is essentially "you must learn why signing is important for users of your library and this is how to do it", a) really isn't that bad, b) doesn't result in fewer quality packages being uploaded, and c) if it acts like any sort of filter, that seems to be a good thing.

Maven Central isn't short of high quality packages and no high quality OSS Java libraries are missing so the filter aspect isn't culling anything important.

Java, Apt, RPM, etc all have this and have absolutely gigantic numbers of packages so the argument that it's too hard really just doesn't hold water.

Doing so requires reading/understanding these ~3 pages of docs: https://central.sonatype.org/publish/requirements/gpg/

B1FF_PSUVM3y ago

> newer languages can't.

Python (1991) is older than Java (1995)

(irrelevant factoid, but still ...)

donaldstufft3y ago

I don't believe that Maven Central's use of GPG is providing a meaningful security control here, so I would dispute the idea that they're doing it "right".

jpgvm3y ago

At the very least there are a) more active keys b) those keys are available on keyservers and c) it's being used by the major packages in the ecosystem correctly. i.e Spring, Jackson, Quarkus, Logback, Apache-sphere, Google-sphere, etc.

So while it might not be providing meaningful security for lower-tier packages, it's definitely doing its job for top-tier packages like these that are relied on by hundreds of thousands of projects.

blibble3y ago

> I don't understand how Java can get this right with Maven Central and co but newer languages can't.

it's the magic combination of pushing their own agenda (vs. that of their users), mixed with ineptitude

WhyNotHugo3y ago· 4 in thread

When many developers didn't use 2FA they pushed for them to enable 2FA within a deadline. It sounds like the same approach could have been used for PyPI. E.g.: an attempt to make the feature useful before declaring it dead forever.

woodruffw3y ago

This has very little to do with 2FA: PGP signing has been de facto dead for years on PyPI, and this change has no effect on publishing workflows: PyPI will still accept uploads that contain signatures, and just ignores them now.

It's also not accurate to say that PyPI failed to make 2FA useful: it was deployed for over two years before the 2FA mandate for critical projects went into effect. That mandate also came with free hardware keys for everyone affected.

masklinn3y ago

No. 2FA is a feature for pypi, and developers. The entire purpose of pgp sigs was external, it was for distributions to use.

Distributions don’t use it, therefore it’s worthless, just overhead and technical debt.

LtWorf3y ago

Debian checks PGP signatures of releases.

adamckay3y ago

For Python packages served by PyPI?

KRAKRISMOTT3y ago· 4 in thread

What are we switching to? Does Pypi support ECDSA?

woodruffw3y ago

Just for disambiguation: ECDSA is a signing algorithm, not a protocol or toolkit like PGP. PGP can produce ECDSA signatures through an extension RFC, but it's not a core part of OpenPGP.

There is no immediate replacement, because the overwhelming majority of packages never bothered to sign with PGP (and all evidence points to the overwhelming majority of signatures never being verified). In other words, this is much closer to removing "dead" code than to killing an active feature.

Longer term, the plan is to integrate Sigstore[1]-based signatures.

jossclimb3y ago

sigstore I hope.

aborsy3y ago

I’m not sure if I understand this correctly, but, this basically seems to be a CA, with SSO-type of proof of identity, short lived certificates and transparency logs?

How an OIDC identity is obtained and secured is not treated. It brings useful organization to PKI, but the problem remains. You have to delegate trust to identity providers: Google, GitHub, etc.

Keybase was interesting, but the project seems semi-dead.

woodruffw3y ago

> How an OIDC identity is obtained and secured is not treated. It brings useful organization to PKI, but the problem remains. You have to delegate trust to identity providers: Google, GitHub, etc.

Yes, this is a fundamental (and, IMO, reasonable) assumption in Sigstore. The trust argument for large IdPs is that they (1) have the institutional ability and resources (like incident response) to maintain their service, (2) have strong incentives to maintain and improve the overall security of their providers (billions of accounts on the Internet are bound to SSO via Google, etc.), and (3) that any failures in those providers are already catastrophic, so reducing the number of moving and potentially failing parts is a net win in terms of security.

[1]: https://gist.github.com/rjhansen/67ab921ffb4084c865b3618d695...

zzzeek3y ago· 3 in thread

> Of those 1069 unique keys, about 30% of them were not discoverable on major public keyservers, making it difficult or impossible to meaningfully verify those signatures. Of the remaining 71%, nearly half of them were unable to be meaningfully verified at the time of the audit (2023-05-19).

so...*reject those packages*. if you use a PGP key that isn't properly available or verifiable, reject it. That way every package with a PGP key will have 100% "key is properly discoverable" rate.

it's not really reasonable to just drop this feature because most packages don't use it. Packages with tens of millions of downloads (like mine) make up a small percentage of total packages, but this small number of packages makes up a huge proportion of actual downloads, and package signing is most useful for these kinds of packages.

if the adoption of "proper PGP keys" were ranked by packages/downloads rather than "packages" alone, these rates would be much different.

donaldstufft3y ago

I don't believe they would.

Looking at the top 20 packages in the last month by download (packages with hundreds of millions of downloads), only 1 of them shipped a GPG signature with their most recent release. I haven't asked the author of that one, but I do know them and I suspect they agree with the idea that it's not a valuable thing and they do it largely because it exists.

oefrha3y ago

> they do it largely because it exists.

That’s me. I used to upload signatures to PyPI only because it’s a thing that exists and it’s not much trouble. I’d be counted among the valid 36%, but I doubt anyone ever verified even one of the hundreds of sigs I uploaded over the years. I eventually stopped due to the pointlessness.

7to23y ago

That quote doesn't make any sense even if we stopped at the first part. I PGP-sign my packages and my key is not on any public key server. It's on my website. This reasoning lacks rigor and seems to only serve as an excuse to remove a feature that some pypi devs didn't like without offering an alternative for security guarantees that it provided.

reidrac3y ago· 3 in thread

I have been thinking about this in the context of Java libraries (really using Scala, but bear with me).

If the repo requires a GPG signature, they could also ask for the public key of the developer making the releases (e.g. when they make the account), and they could sign it with their key at that point.

Then make available the package, the signature, and the signed public key. Then I only need to trust the repo's key (in this case PyPi).

Does this make any sense?

woodruffw3y ago

> Does this make any sense?

It makes sense in terms of trusting the package index, but it's inverted from the original design goal: the point of end-user signatures on package indices is to eliminate unnecessary package index trust, not reinforce it.

If you already trust the package index, then mandating HTTPS and strong cryptographic digests is going to be far more effective (and secure) than some kind of PGP key attestation scheme.

reidrac3y ago

The package index only hosts the packages, but doesn't release them. The dev releasing the package is who signs it.

Without an easy way to verify the keys, the signatures are useless. Which is why PyPI is removing the GPG keys altogether.

woodruffw3y ago

> The package index only hosts the packages, but doesn't release them. The dev releasing the package is who signs it.

I know that; the GP is describing a countersigning scheme, where the package index (qua trusted entity) countersigns for the signing key, which the dev then uses to sign for their package.

> Without an easy way to verify the keys, the signatures are useless. Which is why PyPI is removing the GPG keys altogether.

Agreed entirely; I'm the one who wrote the analysis in the linked announcement :-)

rvz3y ago· 3 in thread

PGP is a solution in search of a problem. We have given it decades for it to be useful and it turns out that it is an enormous security failure. It needed to go.

Sigstore [0], on the other hand, makes more sense to use instead.

[0] https://www.sigstore.dev

msm_3y ago

This reads like an advertisement. I routinely use GPG, and it is useful for me. It's not perfect (far from perfect, really), but it's a solution for multiple of my problems.

I don't know much about the solution you promote, but as usual with many "PGP killers" it replaces one very specific application of PGP and ignores all the others. Which is ok! Doing one thing and doing it well is the Unix philosophy after all. But it's not something I have use for, and it's not a viable replacement for GPG.

tptacek3y ago

If doing one thing and doing it well is the core of the Unix philosophy, PGP is (cryptographically) the antithesis of that. It's a Swiss Army Knife that does none of its tasks well by modern standards.

LtWorf3y ago

I'll let my boss know we must stop signing our releases and having our software automatically check if the new version is legit then.

We will instead switch to use some thing with a fluffy corporate website that tells absolutely nothing.

bryanlyon3y ago· 2 in thread

I came here thinking they were removing the PGP package from PyPi, but they're just removing a barely-used signature system? I don't know why they have to remove it though. I doubt it requires much maintenance now that it's already in place.

Even if only 37% of keys are verifiable, that's infinitely more than will be verifiable if they remove the PGP support.

tedivm3y ago

They address your comment directly in their post-

> While it doesn't represent a massive operational burden to continue to support it, it does require any new features that touch the storage of files to be made aware of and capable of handling these PGP signatures, which is a non zero cost on the maintainers and contributors of PyPI.

Avamander3y ago

> Even if only 37% of keys are verifiable, that's infinitely more than will be verifiable if they remove the PGP support.

Discoverable. That does not really verify anything about the key, its identities or the supposed signer.

It boils down to almost entirely to just an overcomplicated hashing system.

zokier3y ago· 1 in thread

At least you can't blame pypi for ignoring the report, and tbh I find this response time remarkably quick. It wouldn't have been far-fetched to imagine someone in their position just trying to ignore/downplay/dispute these sorts of reports.

masklinn3y ago

As the author of the post noted above, the pypi maintainers have been wanting to get rid of pgp for a while.

The post gave them excellent additional justification to.

NotYourLawyer3y ago· 1 in thread

Interesting timeline. The Yossarian article that TFA cites and that I assumed was the impetus here was published two days ago on 5/21. But the audit was two days earlier on 5/19.

woodruffw3y ago

I originally ran the audit on 3/27 (IIRC), and then ran it a few additional times as I fixed data quality issues in my scripts (the ones linked in the post). The last time I ran it was on 5/19-5/20, when I was finalizing the post. You can also see that I did a new release of `pgpkeydump` at around the same time, to add some more extracted datapoints.

PyPI's admins have been wanting to remove PGP support for years; all I did was provide the final nudge.

jwilk3y ago

Two days ago: https://news.ycombinator.com/item?id=36021172 ("PGP signatures on PyPI: worse than useless", >150 comments)

jxy3y ago

I don't understand the argument. Isn't the whole point of PGP establishing some kind of chain of trust? If pypi.org has its public key, it could sign a few major distributors' keys, and for smaller/individual packages I could either choose to always trust the same public key or not use the package. It's not a centralized system to begin with. It's not pypi.org's responsibility to identify and verify that all the keys belong to who they say they belong to. Pypi.org's inability to verify individual identities shouldn't impact the overall usefulness of PGP for package distribution and verification.

sacnoradhq3y ago

So how are Python packages signed? Are they just shipping rando code without any sort of E2E assurance?

FWIW, Ruby also did a piss-poor job of handling gem signing by making it both difficult and optional.

How fucking hard is it to get to the level of code release assurance as Debian or Fedora? Manage GPG keys, signfest them, and enforce a policy.

7to23y ago

Trust on first use is absolutely a valid use of PGP signatures that is being used in many real world systems (ask me how I know). You finding that PGP isn't being used the way you think it should does not justify removing it without providing a replacement.

Why on earth wasn't the community asked before you implemented this change?

> Given all of this, the continued support of uploading PGP signatures to PyPI is no longer defensible. While it doesn't represent a massive operational burden to continue to support it, it does require any new features that touch the storage of files to be made aware of and capable of handling these PGP signatures, which is a non zero cost on the maintainers and contributors of PyPI.

This uninformed reasoning is what's indefensible.

forgotmypw173y ago

What an amazing opportunity for someone to add a new way of integrating PGP authentication by writing two short scripts:

One to compile a list of file hashes and PGP-sign them.

One to validate these hashes against the provided signatures.
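
The two scripts proposed above can be sketched roughly as follows. This is a minimal illustration rather than anyone's actual tooling: the function names and the `<digest>  <name>` manifest layout are assumptions made for this example, and the PGP step is left to an external tool.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory: Path) -> str:
    """Script one: emit one '<digest>  <name>' line per file, sorted by name."""
    lines = [
        f"{sha256_file(p)}  {p.name}"
        for p in sorted(directory.iterdir())
        if p.is_file()
    ]
    return "\n".join(lines) + "\n"


def check_manifest(manifest: str, directory: Path) -> List[str]:
    """Script two: return the names of files that are missing or do not match."""
    bad = []
    for line in manifest.splitlines():
        expected, name = line.split("  ", 1)
        target = directory / name
        if not target.is_file() or sha256_file(target) != expected:
            bad.append(name)
    return bad
```

In this sketch the manifest text would be written to a file outside the hashed directory, signed with something like `gpg --detach-sign`, and checked with `gpg --verify` before `check_manifest` is called. Deciding which key to trust for that verification is the part the rest of the thread argues about, and the sketch does not address it.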

lgxz3y ago

Replace a 31% effective solution with no solution? very impressive



tzs3y ago· 13 in thread

Why not include the public key in the package?

woodruffw3y ago

> Why not include the public key in the package?

zimmerfrei3y ago

Nope, you assume wrong. That's exactly what I (also) want, that is, knowing that the *authors* remained the same, whoever they are.

>> What most people actually want is a strong cryptographic attestation that the package distribution came from the same source as the thing hosting the source code

Nope, nobody really needs more of that, since that's what your HTTPS certificate is for.

People *really* want to mitigate the risk of pypi infrastructure getting fully compromised, which is very likely, given how many eggs you keep in the same basket there.

PGP signatures were the last ditch, not very convenient but also not as bad as they are painted. But from now on there won't be even that little.


slaymaker19073y ago

woodruffw3y ago

> There is some security even if they provide the public key.

donaldstufft3y ago

specialist3y ago

> Because PyPI ... could always substitute a new key.

Isn't that what public key servers are for?

For publishing my FOSS to sonatype, I had to first publish my public key, eg keyserver.ubuntu.com.

I don't know PyPI, but from this OC, it sounds like PyPI does not have the same prerequisite.

woodruffw3y ago

[2]: https://news.ycombinator.com/item?id=36021172

alerighi3y ago

hannob3y ago

still_grokking3y ago

Did I just witness the invention of a kind of "software package blockchain"?

It would be, btw, a proper but sustainable proof-of-work blockchain, as in most cases you would need to pay developers to "mint new blocks".

OK, maybe let's forget about the blockchain. It's a loaded term. But the idea of software signature TOFU sounds indeed good!
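
The signature-TOFU idea can be sketched in a few lines, in the spirit of SSH's known-hosts file. This is a hypothetical illustration: `check_tofu`, `KeyChangedError`, and the JSON pin file are names invented for this example, and the fingerprint is assumed to have already been extracted from a signature that verified.

```python
import json
from pathlib import Path


class KeyChangedError(Exception):
    """Raised when a package's signing key differs from the pinned one."""


def check_tofu(pins_path: Path, package: str, fingerprint: str) -> bool:
    """Pin a key on first use; afterwards require the same fingerprint.

    Returns True if the key was newly pinned and False if it matched an
    existing pin. Raises KeyChangedError on a mismatch.
    """
    pins = json.loads(pins_path.read_text()) if pins_path.exists() else {}
    pinned = pins.get(package)
    if pinned is None:
        # First use: trust whatever key we were shown and remember it.
        pins[package] = fingerprint
        pins_path.write_text(json.dumps(pins, indent=2, sort_keys=True))
        return True
    if pinned != fingerprint:
        raise KeyChangedError(f"{package}: pinned {pinned}, got {fingerprint}")
    return False
```

The sketch only detects that a key changed. It says nothing about whether the change is legitimate (a lost or expired key) or malicious, so handling that case is left open.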

chaxor3y ago

eduction3y ago· 13 in thread

This has the smell of "we didn't want to bother supporting PGP any more because it's hard so we came up with an excuse."

woodruffw3y ago

eduction3y ago

woodruffw3y ago

upofadown3y ago

>PGP is 20 years behind cryptographic best practices...

In what sense? If someone signs a package with, say, a RSA key, how is that behind in some way?

>30 years of unresolved technical debt.

How can a standard for a file/message format have technical debt? PGP is dead simple. Where is this debt hidden?

woodruffw3y ago

> In what sense? If someone signs a package with, say, a RSA key, how is that behind in some way?

OpenPGP specifies PKCS#1 v1.5 for RSA padding. Attacks on PKCS#1 v1.5 have been well understood for over 20 years[1]; every few years, someone finds a new one.

Edit: Correcting myself: most attacks on v1.5 padding concern encryption, not signatures. The general fragility argument remains, however.

[1]: https://en.wikipedia.org/wiki/PKCS_1#Attacks

[2]: https://news.ycombinator.com/item?id=5993959

https://wiki.debian.org/Teams/Apt/Spec/AptSign

zamnos3y ago

PGP's extremely poor UX would suggest there's code that doesn't exist that should.

tptacek3y ago

There are much, much better solutions for packaging!

5e92cb50239222b3y ago

Like signify (developed and adopted by OpenBSD) and minisign.

AFAIK Debian has been working on abandoning GPG in favor of something very similar to those two. Not sure when it's going to be shipped, though.

eduction3y ago

"Everything is Terrible So What Do We Do?

Bluntly put, I don’t know for sure. This isn’t an already solved problem nor is it an easy to solve one."

https://caremad.io/posts/2013/07/packaging-signing-not-holy-...

But if you have a better solution for package security, please do describe it here.

donaldstufft3y ago

There's also a general consensus (not documented) that sigstore will play some kind of role here. Possibly in-toto as well?

tptacek3y ago

It's not really up to you or me, it's up to PyPI. For my part: their logic seems pretty sound.

c0l03y ago

And which of those are in use/available at the Python Package Index?

tptacek3y ago

I don't know, but PGP isn't either, so the comparison holds.

rwmj3y ago· 12 in thread

So they're removing PGP signatures, which certainly have some issues, and replacing them with ... nothing?

sowbug3y ago

The research article cited in the announcement is titled "PGP signatures on PyPI: worse than useless."

rwmj3y ago

sowbug3y ago

The system guaranteed that a key signed a package. That was its entire utility.

woodruffw3y ago

> they are not worse than having no traceability of the package at all between the author and PyPI.

The only thing worse than an unsecured scheme is an insecure scheme that lulls users into a false sense of security and authenticity. PGP signatures on PyPI are the latter.

bandrami3y ago

It's like the larger holy war against self-signed certificates in TLS. They are strictly better than plaintext but there is software that will prefer a plaintext connection to self-signed TLS.

im3w1l3y ago

I think another thing with pgp is that it's in this awkward place where it's bad enough that few people use it, but good enough that it prevents someone from making an alternative.

akerl_3y ago

speedgoose3y ago

The article is closer to a blog post than to a multi-author, peer-reviewed research paper published in a high-impact journal.

woodruffw3y ago

That's because it is a blog post. It isn't advertised as anything else.

[1]: https://www.youtube.com/watch?v=jjAq7S49eow&t=1s

justin_oaks3y ago

That said, if PGP signatures are to be replaced then there's no reason why they can't be removed now and replaced with something later.

bombolo3y ago

Sooner or later they will ask for a photo of a passport… google's idea, from their security blog.

paulddraper3y ago

The purpose of PGP signatures is to remove the dependency on trusting PyPI, i.e. to protect against PyPI getting hacked.

(Note: PyPI protects against MITM with HTTPS.)

Removing this is predicated on the idea that this is a low-priority threat vector.

westurner3y ago· 7 in thread

Why do we use GPG ASC signatures instead of just a checksum over the same channel?

woodruffw3y ago

> Why do we use GPG ASC signatures instead of just a checksum over the same channel?

Could you elaborate on what you mean by this? PyPI computes and supplies a digest for every uploaded distribution, so you can already cross-check integrity for any hosted distribution.

GPG was nominally meant to provide authenticity for distributions, but it never really served this purpose. That's why it's being removed.
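
The integrity cross-check mentioned above amounts to recomputing a digest locally and comparing it with the published one. A minimal sketch, assuming the expected value is a hex SHA-256 digest obtained from the index (the function name is made up for this example):

```python
import hashlib
import hmac


def matches_digest(data: bytes, expected_sha256_hex: str) -> bool:
    """Return True if the SHA-256 of `data` equals the expected hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    # hexdigest() is lowercase, so normalise the expected value before comparing.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

This establishes integrity only: it shows the bytes match what the index advertised, not who produced them. pip's `--require-hashes` mode applies the same kind of check to pinned requirements.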

westurner3y ago

> Why do we use GPG ASC signatures instead of just a checksum over the same channel?

Twine COULD/SHOULD download uploads to check the PyPI TUF signature, which could/should be shipped as a const in twine?

And then Twine should check publisher signatures against which trusted map of package names to trusted keys?

westurner3y ago

1) the server signs what's uploaded using one or more TUF keys shared to RAM on every pypi upload server.

FWIU, 1 (PyPI signs uploads with TUF) was implemented, but 2 (users sign their own packages before uploading the signed package and signature, (and then 1)) was never implemented?

woodruffw3y ago

To the best of my knowledge, the current state of TUF for PyPI is that we performed a trusted setup ceremony for the TUF roots[1], but that no signatures were ever produced from those roots.

For the time being, we're looking at solutions that have less operational overhead: Sigstore[2] is the main one, and it uses TUF under the hood to provide the root of trust.

[2]: https://www.sigstore.dev/

ilyt3y ago

Signature tells you who signed it.

Of course, if you haven't put any effort into a system that verifies end to end whether it's the right signature, it doesn't matter.

westurner3y ago

pip checks that a given package was signed with the pypi key but does not check for a signature from the publisher. And now there's no way to host any type of cryptographic signatures on pypi.

There is no e2e: pypi signs what's uploaded.

(Noting also that packages don't have to be encrypted in order to have cryptographic signatures; only the signature is encrypted, not the whole package)

ilyt3y ago

Yeah the whole thing looks like throwing away baby with bathwater; the package should

Arnavion3y ago· 4 in thread

>Of those 1069 unique keys, about 30% of them were not discoverable on major public keyservers, making it difficult or impossible to meaningfully verify those signatures.

woodruffw3y ago

Arnavion3y ago

woodruffw3y ago

That's correct!

Avamander3y ago

You can't do WKD with just signatures, there are no identities associated with the signature to just look up.

usr11063y ago· 4 in thread

donaldstufft3y ago

chatmasta3y ago

That said, I do agree with your premise that the limited usefulness of PGP signing doesn't necessitate removing the feature entirely.

masklinn3y ago

> Isn't that throwing out the baby with the bathwater?

That assumes there’s a baby in the bath water.

> But instead of improving security measures that don't work well they just remove them?

Well yes, “security measures” which don’t work are usually worse than nothing.

Brian_K_White3y ago

There are many cases where it's better to know you don't have something correctly than think you have something incorrectly. Security is certainly one.

jpgvm3y ago· 4 in thread

I don't understand how Java can get this right with Maven Central and co but newer languages can't.

Maven Central isn't short of high quality packages and no high quality OSS Java libraries are missing so the filter aspect isn't culling anything important.

Java, Apt, RPM, etc all have this and have absolutely gigantic numbers of packages so the argument that it's too hard really just doesn't hold water.

Doing so requires reading/understanding these ~3 pages of docs: https://central.sonatype.org/publish/requirements/gpg/

B1FF_PSUVM3y ago

> newer languages can't.

Python (1991) is older than Java (1995)

(irrelevant factoid, but still ...)

donaldstufft3y ago

I don't believe that Maven Central's use of GPG is providing a meaningful security control here, so I would dispute the idea that they're doing it "right".

jpgvm3y ago

So while it might not be providing meaningful security for lower-tier packages, it's definitely doing its job for top-tier packages like these that are relied on by hundreds of thousands of projects.

blibble3y ago

> I don't understand how Java can get this right with Maven Central and co but newer languages can't.

it's the magic combination of pushing their own agenda (vs. that of their users), mixed with ineptitude

WhyNotHugo3y ago· 4 in thread

woodruffw3y ago

masklinn3y ago

No. 2FA is a feature for pypi, and developers. The entire purpose of pgp sigs was external, it was for distributions to use.

Distributions don’t use it, therefore it’s worthless, just overhead and technical debt.

LtWorf3y ago

Debian checks PGP signatures of releases.

adamckay3y ago

For Python packages served by PyPI?

KRAKRISMOTT3y ago· 4 in thread

What are we switching to? Does Pypi support ECDSA?

woodruffw3y ago

Just for disambiguation: ECDSA is a signing algorithm, not a protocol or toolkit like PGP. PGP can produce ECDSA signatures through an extension RFC, but it's not a core part of OpenPGP.
