Another problem I've had with Snyk is low-quality duplicate entries, for example cataloguing each deserialisation blacklist bypass in Jackson as a separate "new" vulnerability because "yay CVE numbers to put on CVs". That wastes the time of folks triaging vulnerabilities, who may have already concluded there's no exploitation risk (e.g. because they don't deserialise user input, or don't use polymorphic deserialisation anywhere) and now have to review the issue again.
On the alerting side, we have a couple of things coming. Neither is a magic bullet, but both will help.
- Better handling of vulnerabilities in dev dependencies. Some vulnerabilities matter if they're in a dev dependency - anything that exfiltrates your local filesystem, for example. Others don't - DoS vulnerabilities, for example. At the moment, GitHub doesn't even tell you whether the dependency a vulnerability affects is a runtime or development dependency. We can and will get better there.
- Analysis of whether the vulnerable code in a dependency is called. You almost certainly want to react faster to vulnerabilities in your code that your application is actually exposed to than to ones that it may be exposed to in future. (You probably want to respond to the unreachable ones, too, especially if you can get an auto-generated PR to do so, but there's much less urgency.) We have this in private beta for Python right now, and expect to have it in public beta in the next few months.
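To make those two ideas concrete, here's a minimal sketch of how a triage script might rank alerts once dependency scope and reachability are both known. The `Alert` fields, the impact labels and the urgency levels are all invented for illustration; this is not GitHub's data model or its actual logic.

```python
from dataclasses import dataclass

# Hypothetical impact labels that still matter on a developer machine or CI box.
DEV_RELEVANT_IMPACTS = {"file-exfiltration", "code-execution"}


@dataclass
class Alert:
    package: str
    impact: str      # illustrative label, e.g. "dos" or "file-exfiltration"
    scope: str       # "runtime" or "development"
    reachable: bool  # does the application actually call the vulnerable code?


def urgency(alert: Alert) -> str:
    """Return "ignore", "low" or "high" for a single alert."""
    # Dev-only dependency: only impacts that hurt the dev environment count.
    if alert.scope == "development" and alert.impact not in DEV_RELEVANT_IMPACTS:
        return "ignore"
    # Unreachable vulnerable code still deserves a fix, just with less urgency.
    if not alert.reachable:
        return "low"
    return "high"


alerts = [
    Alert("test-runner", "dos", "development", True),
    Alert("build-plugin", "file-exfiltration", "development", True),
    Alert("http-lib", "dos", "runtime", False),
    Alert("json-lib", "code-execution", "runtime", True),
]
for alert in alerts:
    print(f"{alert.package}: {urgency(alert)}")
```

The "low" bucket is the natural home for auto-generated update PRs: worth merging, but nobody needs to be paged for them.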
Beyond alerting, the other big thing is that GitHub's incentives for this database and the experiences it triggers are fundamentally different from other vendors. We aren't selling its contents, so don't have an incentive to inflate it. Open source maintainers are at the heart of our platform, and we really don't want low-quality advisories going out about their software. And developers are our core customers, and we want to deliver experiences they love above all else. That difference in incentives will likely manifest in lots of little differences, but at a high level, we're aligned on wanting to reduce alert fatigue.
Sorry we dropped the ball on this for the last couple of years. You're going to see steady improvements from here on.
https://www.tromzo.com/ - early but very strong vision
https://www.dazz.io/ - dumb name but decent vision
The more you do something the easier it is to do. There is nothing wrong with it no longer feeling like an alert. Patching security vulnerabilities is just a normal part of software development and the easier and more comfortable people are with it the better.
GHSA has moderate severity:
https://github.com/advisories/GHSA-896r-f27r-55mw
The CVSS3 score of the CVE is actually critical!!
If GHSA is "self-reporting", then why is it allowed to deviate in a direction that is harmful (downplaying the issue)? If this means what I think it means (and I might be wrong), then the GHSA score is broken.
Also, it breaks security workflows that build on GHSA: if a manager looking at the conflicting severity levels lowers the urgency of the backlog ticket because the severity is only moderate, then users might get hurt.
One thing to note is that the full CVSS 3.1 string is included in the database as assessed by NIST. The severity displayed by GitHub is stored as a "database specific" field, so it looks like we're trying to be explicit about the existence of multiple perspectives on severity (one of which is our own), but that we could do more to make that clear.
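For anyone who wants to spot these mismatches automatically, here is a rough sketch that derives a rating from the CVSS 3.1 vector and compares it with the database-specific severity. It assumes the advisory JSON follows the OSV layout (a `severity` list with a `CVSS_V3` entry, plus `database_specific.severity`); the sample advisory is made up and is not the content of any real GHSA entry. The weights and rounding follow the CVSS v3.1 base score equations.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(x):
    """Round up to one decimal place, as defined in the v3.1 spec."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector):
    # "CVSS:3.1/AV:N/AC:L/..." -> {"AV": "N", "AC": "L", ...}
    m = dict(part.split(":") for part in vector.split("/")[1:])
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    total = impact + 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    return roundup(min(1.08 * total if changed else total, 10))


def rating(score):
    """CVSS qualitative rating, using GitHub's MODERATE in place of MEDIUM."""
    if score == 0:
        return "NONE"
    if score < 4.0:
        return "LOW"
    if score < 7.0:
        return "MODERATE"
    if score < 9.0:
        return "HIGH"
    return "CRITICAL"


def severity_views(advisory):
    """Return (rating derived from the CVSS vector, database-specific severity)."""
    vector = next(s["score"] for s in advisory["severity"] if s["type"] == "CVSS_V3")
    return rating(base_score(vector)), advisory["database_specific"]["severity"]


# Made-up advisory for illustration only.
advisory = {
    "id": "GHSA-xxxx-xxxx-xxxx",
    "severity": [{"type": "CVSS_V3",
                  "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"}],
    "database_specific": {"severity": "MODERATE"},
}
from_vector, displayed = severity_views(advisory)
if from_vector != displayed:
    print(f"{advisory['id']}: vector says {from_vector}, displayed as {displayed}")
```

A workflow that builds on GHSA could run a check like this and flag any advisory where the two views disagree, rather than silently trusting one of them.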
https://github.com/github/advisory-database/blob/main/adviso...
One big reason is that the alternative to this structured data being open source is that it lives in proprietary databases. In that world, attackers still have knowledge about these vulnerabilities - they don't need the structured data as much as defenders, and the licenses on those proprietary databases aren't going to deter them anyway (most are public for SEO reasons). Defenders, on the other hand, often won't have as much information, or information of as high quality.
The world is safer with this info in the public domain. Will there be new exploits based on additional info? Sure, but that will get mitigated.
Software, like law or medicine, is a practice, meaning we aren’t experts... we’re just learning better ways to do things.
This just opens the world to formal verification... for goodness' sake, we’re just getting to fully reproducible deterministic software builds.
EDIT: Although your concerns might apply to unconfirmed public PRs
For repositories using a language the GitHub Dependency Graph supports, we automatically create an inventory of the dependencies the repository uses and create alerts if/when any have a vulnerability (via Dependabot alerts and, as a sibling comment has already mentioned, Dependabot update PRs).
The next improvement we'd like to ship is an API that lets you upload a list of dependencies to us for repositories in which we can't automatically detect them. A good example is repositories using Gradle for dependency management - it's hard for us to understand the dependency tree there without running a build. With the new API you'll be able to upload a list of dependencies (generated using a Gradle command) to GitHub in CI, and GitHub will then be able to send alerts if/when there's a vulnerability in one of those dependencies, just like we do for repos using other package managers.
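As a rough idea of what the CI side could look like, here's a sketch that turns Gradle's dependency tree output into a flat list ready to upload. It assumes the list comes from something like `gradle dependencies --configuration runtimeClasspath`; the payload shape and the repository name are hypothetical, since the API described above hasn't shipped and its format isn't specified here.

```python
import json
import re

# Matches tree lines such as "+--- group:name:1.0" or "\--- group:name:1.0 -> 1.2".
# Lines without a declared version ("group:name -> 1.2") are ignored in this sketch.
DEP_LINE = re.compile(
    r"[+\\]--- (?P<group>[^:\s]+):(?P<name>[^:\s]+):(?P<declared>[^:\s]+)"
    r"(?: -> (?P<resolved>\S+))?"
)


def parse_gradle_tree(text):
    """Return sorted, de-duplicated (package, version) pairs from the tree output."""
    found = set()
    for line in text.splitlines():
        m = DEP_LINE.search(line)
        if m:
            # Prefer the version Gradle actually resolved over the declared one.
            version = m.group("resolved") or m.group("declared")
            found.add((f"{m.group('group')}:{m.group('name')}", version))
    return sorted(found)


def build_payload(repo, deps):
    # Hypothetical payload shape, invented for this example.
    return {
        "repository": repo,
        "dependencies": [{"package": p, "version": v} for p, v in deps],
    }


SAMPLE = r"""
runtimeClasspath - Runtime classpath of source set 'main'.
+--- com.fasterxml.jackson.core:jackson-databind:2.13.0
|    +--- com.fasterxml.jackson.core:jackson-annotations:2.13.0
|    \--- com.fasterxml.jackson.core:jackson-core:2.13.0
+--- org.slf4j:slf4j-api:1.7.30 -> 1.7.36
\--- com.fasterxml.jackson.core:jackson-core:2.13.0 (*)
"""

deps = parse_gradle_tree(SAMPLE)
print(json.dumps(build_payload("example-org/example-repo", deps), indent=2))
```

The point of flattening is that alerting only needs the resolved package/version pairs, not the shape of the tree that produced them.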
Your comment specifically mentions containers. That's one area that's a little further off for native GitHub support, but where the open source advisory database should help. Whilst we're currently focussed on scanning source code and surfacing results on repos (not containers), the structured data in the advisory database is just as usable with the results of a container scan. Indeed, I believe all the open source container scanning solutions already use it as a data source.
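To illustrate how scan results and the advisory data could be joined, here's a small sketch. It assumes advisories in the OSV layout (`affected[].package` and `affected[].ranges[].events` with `introduced`/`fixed` entries) and a scanner that emits `(ecosystem, name, version)` tuples. The package names and advisory ID are placeholders, and the version comparison is deliberately naive.

```python
def parse_version(v):
    # Naive: dotted integers only. Real ecosystems (semver, PEP 440, Maven, ...)
    # each have their own ordering rules.
    return tuple(int(p) for p in v.split("."))


def in_range(version, events):
    """Walk OSV-style events (assumed sorted) and report whether version is affected."""
    v = parse_version(version)
    affected = False
    for event in events:
        if "introduced" in event and v >= parse_version(event["introduced"]):
            affected = True
        elif "fixed" in event and v >= parse_version(event["fixed"]):
            affected = False
    return affected


def match(installed, advisories):
    """Return (name, version, advisory id) for every installed package that is affected."""
    hits = []
    for ecosystem, name, version in installed:
        for adv in advisories:
            for aff in adv["affected"]:
                pkg = aff["package"]
                if (pkg["ecosystem"], pkg["name"]) != (ecosystem, name):
                    continue
                if any(in_range(version, r["events"]) for r in aff["ranges"]):
                    hits.append((name, version, adv["id"]))
    return hits


# Placeholder advisory and scan output, for illustration only.
advisories = [{
    "id": "GHSA-xxxx-xxxx-xxxx",
    "affected": [{
        "package": {"ecosystem": "PyPI", "name": "examplelib"},
        "ranges": [{"type": "ECOSYSTEM",
                    "events": [{"introduced": "0"}, {"fixed": "2.4.1"}]}],
    }],
}]
installed = [
    ("PyPI", "examplelib", "2.3.0"),
    ("PyPI", "otherlib", "1.0.0"),
    ("npm", "examplelib", "2.3.0"),
]
print(match(installed, advisories))
```

Matching on ecosystem as well as name matters: the same package name can exist in several registries and refer to unrelated code.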
I worked at an i-bank that had their own version of Dependabot and it was great: New version(s) come out and once a week I get a PR to approve that shows that my code still passes tests after the update.
I'm mostly joking, although I do look at that immediately for any new repo because I'm starting to realize that the interest level of the project is directly related to the language(s) it uses.
Right now, our focus is on going deep for a smaller number of ecosystems before going broad. The intention is that anyone using one of the languages in the current list feels “fully covered” by the data in the database.
Also, are there plans to include data from before 2017?
On backfilling the data to include advisories from before 2017 - absolutely. So far we've done this in a relatively ad-hoc way - you should already find that the most important (severe and wide-reaching) CVEs from before 2017 are in the database (and if there are any that aren't but you think should be, we'd love you to open an issue on the DB). We want to do a more complete backfill in the near future.
https://github.com/github/advisory-database/blob/main/adviso...
One major strand is more work like this to make it easy for the community to collaborate. I expect we'll make a lot of iterative improvements to the database over the next few months, aimed at making it easier to contribute to, maintain and use. We need to improve our APIs for this data, for example (currently only available via GraphQL).
Another big one that we're starting to think about is the security vulnerability disclosure process. Our goal there is to support maintainers as much as possible, and there's more we can do. Recent articles on loguru, beg bounties, and the way log4j initially reached public attention all point to problems GitHub can and should help with. In the next 12 months we'd like to give maintainers the option to receive vulnerability disclosures privately on GitHub, and for us to be able to support them through that process. (GitHub already does a bit here - through maintainer security advisories we issued about 30% of the CVEs in the JavaScript ecosystem last year, for example. But we can and will do more.)
Loguru CVE article: https://tomforb.es/cve-2022-0329-and-the-problems-with-autom...
Beg bounties: https://www.troyhunt.com/beg-bounties/
Log4j PR: https://github.com/apache/logging-log4j2/pull/608#issuecomme...
For anything that doesn't already have a CVE, no. We don't want that disclosure process to happen in public - we recommend you reach out to the maintainer privately. (Currently we don't have an on-platform way to do that, but we're planning one.)
Edit: answered my own question - each GHSA in the repo has an `aliases` field and it seems that contains CVE; neat.
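In case it helps anyone else, a quick sketch of building a CVE-to-GHSA lookup from that field. It assumes each advisory is a parsed JSON object with `id` and an optional `aliases` list; the IDs below are made up.

```python
def cve_index(advisories):
    """Map each CVE ID found in `aliases` to the GHSA IDs that reference it."""
    index = {}
    for adv in advisories:
        for alias in adv.get("aliases", []):
            if alias.startswith("CVE-"):
                index.setdefault(alias, []).append(adv["id"])
    return index


# Made-up advisories for illustration.
advisories = [
    {"id": "GHSA-aaaa-aaaa-aaaa", "aliases": ["CVE-2099-0001"]},
    {"id": "GHSA-bbbb-bbbb-bbbb", "aliases": []},  # no CVE assigned
    {"id": "GHSA-cccc-cccc-cccc"},                 # field missing entirely
]
print(cve_index(advisories))
```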
Thanks for sharing!
Will the team add more members to triage these things, or bring in better automation to ensure no exploitation happens through the process, such as incentivizing trusted members of various ecosystems to help?
I love the idea of a public ledger using GitHub & PRs, but could more be done here to instill trust beyond a single GitHub account? Perhaps GitHub organizations from these known ecosystems could even help out further.
With security advisories, it seems a bit worrying to see unreviewed advisories that have yet to be categorized, or PRs with updated details left open for more than a few days.
We have some work to do on the tooling to make it really slick, and a couple of those PRs have taken longer to get reviewed than we'd like, but we're working on it!
On trusted members of language ecosystems - we'd be super interested to explore that. It will require some work on the tooling on our side, so I don't expect progress there overnight, but in the long term it's a model I think we could make work really well.
https://security-tracker.debian.org/ https://security-team.debian.org/security_tracker.html