SELECT * FROM `bigquery-public-data.pypi.distribution_metadata` ORDER BY length(version) DESC LIMIT 10
The winner is: https://pypi.org/project/elvisgogo/#history

The package with the most versions still listed on PyPI is spanishconjugator [2], which consistently published ~240 releases per month between 2020 and 2024.
[1] https://console.cloud.google.com/bigquery?p=bigquery-public-...
Prior to that commit, a cronjob would run the 'bumpVersion.yml' workflow four times a day, which in turn executed the bump2version Python module to increase the patch level. [0]
Edit: discussed here: https://github.com/Benedict-Carling/spanish-conjugator/issue...
[0] https://github.com/Benedict-Carling/spanish-conjugator/commi...
The underlying dataset is hosted at sql.clickhouse.com, e.g. https://sql.clickhouse.com/?query=U0VMRUNUIGNvdW50KCkgICBGUk...

Disclaimer: built this a while ago, but we maintain it at ClickHouse.

Oh, and RubyGems data is also there.
[0] https://sql.clickhouse.com?query=U0VMRUNUIHByb2plY3QsIE1BWCh...
[1] Quota read limit exceeded. Results may be incomplete.
They also stopped updating major and minor versions after hitting 2.3 in Sept 2020. Would be interesting to hear the rationale behind the versioning strategy. Feels like you might as well use a datetime stamp for the version.
It seems this was their first time going down this rabbit hole, so for them and anyone else, I'd urge you to use the deps.dev Google BigQuery dataset [0] for this kind of analysis. It does indeed include NPM and would have made the author's work trivial.
Here's a gist with the query and the results https://gist.github.com/jonchurch/9f9283e77b4937c8879448582b...
This is insane
> This is insane
Not for the JavaScript world.

I hate to deride the entire community, but many of its collective decisions are smells. I think the low barrier to entry means the community has many inexperienced but influential people.
The bar for entry was always low with JavaScript, but it also used to be a lot saner when it was a publicly-driven language.
DiffEqBase 6.189.1
LoopVectorization 0.12.172
Reactant 0.2.161
Mooncake 0.4.159
Distributions 0.25.120
So, no crazy numbers or random unknown packages; all are major packages that have just had a lot of work and history behind them. Out of the top 10, pretty much half were from the SciML ecosystem.

Caveats/constraints: Like the post, this ignores non-SemVer packages (which mostly used date-based versions) and also jll (binary wrapper) packages, which just use their underlying C libraries' versions. Among jlls, the largest that isn't a date, afaict, is NEO_jll with 25.31.34666+0 as its version.
https://www.npmjs.com/package/@types/chrome?activeTab=versio...
I wonder why. Conventions that are being broken, maybe.
If the guy writing and maintaining the software is stating "this software is not stable yet" then who am I to disagree?
I made a fairly significant (dumb) mistake in the logic for extracting valid semver versions. I was doing a falsy check, so if any of major/minor/patch in the version was a 0, the whole package was ignored.
The post has been updated to reflect this.
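Roughly, the buggy check looked like this (a simplified sketch, not the exact code):

  // Buggy: 0 is falsy, so "1.0.3" (or any x.0.z / x.y.0) gets thrown away.
  const [major, minor, patch] = "1.0.3".split(".").map(Number);
  if (!major || !minor || !patch) {
    // package wrongly ignored
  }

  // Fixed: only reject components that failed to parse as integers.
  if (![major, minor, patch].every(Number.isInteger)) {
    // package ignored
  }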
1. Spam
2. Scam
3. Avoid paying for the WhatsApp API (which is the only form of monetization)
And the reason this thing gets so many updates is probably a cat-and-mouse game: Meta continuously updates their software to block these kinds of hacks, and the maintainers do the same, whether in an automated or manual fashion.
The author could improve the batching in fetchAllPackageData by not waiting for all 50 (BATCH_SIZE) promises to resolve at once. I just published a package for proper promise batching last week: https://www.npmjs.com/package/promises-batched
Just spin up a loop of 50 call chains. When one completes, you just do the next on the next tick. It's like 3 lines of code. No libraries needed. Then you're always doing 50 at a time. You can still use await.
async function work() { await thing(); process.nextTick(work); }
for (let i = 0; i < 50; i++) { work(); }
Then maybe a separate timer to check how many tasks are active, I guess.
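Something like this, as a minimal sketch of that idea (runPool, items, and task are placeholder names, not from the article's code):

  async function runPool(items, task, concurrency = 50) {
    let next = 0;
    // Each "call chain" keeps pulling the next unclaimed item, so up to
    // `concurrency` tasks stay in flight without waiting for a whole batch.
    const worker = async () => {
      while (next < items.length) {
        const i = next++;
        await task(items[i]);
      }
    };
    await Promise.all(Array.from({ length: concurrency }, () => worker()));
  }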
The implementation is rather simple, but more than 3 LoC: https://github.com/whilenot-dev/promises-batched/blob/main/s...
Couldn't find any specific rate limit numbers besides the one mentioned here [0] from 2019:
> Up to five million requests to the registry per month are considered acceptable at this time
[0]: https://blog.npmjs.org/post/187698412060/acceptible-use.html
It also isn’t the first AWS SDK. A few of us in… 2012 IIRC… wrote the first one because AWS didn’t think node was worth an SDK.
> carrot-scan -> 27708 total versions
> Command-line tool for detecting vulnerabilities in files and directories.
I can't help but feel there is something absurd about this.
Any package version that didn't follow the x.y.z format was excluded, and any package that had fewer published versions than its largest version number was excluded (e.g. a package at version 1.123.0 should have at least 123 published versions).
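One way to read that rule, as a hypothetical sketch (the regex and keepPackage are my names, not the post's):

  const SEMVER_XYZ = /^(\d+)\.(\d+)\.(\d+)$/;

  function keepPackage(versions) {
    // Drop any version that isn't plain x.y.z.
    const parsed = versions
      .map((v) => SEMVER_XYZ.exec(v))
      .filter(Boolean)
      .map((m) => m.slice(1).map(Number));
    if (parsed.length === 0) return false;
    // A package at 1.123.0 should have at least 123 published versions;
    // fewer than that looks inflated, so the whole package is excluded.
    const largestComponent = Math.max(...parsed.flat());
    return parsed.length >= largestComponent;
  }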
If this was an actual measurement of productivity that bot deserves a raise!
Bigliest, boomiest version is 3735928560 from https://metacpan.org/dist/Acme-Boom
But what if it was "all-the-package-names-that-do-not-reference-themselves"?
AWS still made the top 50