Tracking the Fake GitHub Star Black Market

(dagster.io)

489 points · kaeruct · 3y ago · 284 comments

156 comments · 35 top-level

perihelions3y ago· 35 in thread

Goodhart's law: if you rely on a social signal to tell you what's good, you'll break that signal.

Very soon, the domain of bullshit will extend to actual text. We'll be able to buy HN comments by the thousand -- expertly wordsmithed, lucid AI comments -- and you can get them to say "this GitHub repo is the best", or "this startup is the real deal". Won't that be fun?

Alex39173y ago

> We'll be able to buy HN comments by the thousand -- expertly wordsmithed, lucid AI comments

You're forgetting the millions of additional comments that will be written by humans to trick the AI into promoting their content.

Even worse, currently if you ask ChatGPT to write you some code, it will make up an API endpoint that doesn't exist and then make up a URL that doesn't exist where you can register for an API key. People are already registering these domains, and parking fake sites on them to scam people. ChatGPT is creating a huge market for creating fake companies to match the fake information it's generating.

The biggest risk may not be people using AI-generated comments to promote their own repos, but rather registering new repos to match the fake ones that the AI is already promoting.

fantod3y ago

> ChatGPT is creating a huge market for creating fake companies to match the fake information it's generating.

Does ChatGPT consistently generate the same fake data though?

notabee3y ago

I'm constantly curious whether anyone working in the AI space is cognizant of the Tower of Babel myth.

I don't think an arms race for convincing looking bullshit is going to turn out well for our species.

permo-w3y ago

I feel like you’re overstating this as a long term issue. sure it’s a problem now, but realistically how long before code hallucinations are patched out?

klabb33y ago

Content based auto moderation has been shitty since its inception. I don’t like that GPT will cause the biggest flood of shit mankind has ever seen, but I am happy that it will kill these flawed ideas about policing.

The obvious problem is we don’t have any great alternatives. We have captcha, and we can look at behavior and source data (IP), and of course everyone’s favorite fingerprinting. To make matters worse: abuse, spam and fraud prevention lives in the same security-by-obscurity paradigm that cyber security lived in for decades before “we” collectively gave up on it, and decided that openness is better. People would laugh at you to suggest abuse tech should be open (“you’d just help the spammers”).

I tried to find whether academia has taken a stab at these problems but came up pretty much empty handed. Hopefully I’m just bad at searching. I truly don’t get why people aren’t looking at these issues seriously and systematically.

In the medium term, I’m worried that we’ll not address the systemic threats, and continue to throw ID checks, heuristics and ML at the wall, enjoying the short lived successes when some classifier works for a month before it’s defeated. The reason this is concerning is that we will be neck deep in crap (think SEO blogspam and recipe sites but for everything) which will be disorienting for long enough to erode a lot of trust that we could really use right now.

coldtea3y ago

>The obvious problem is we don’t have any great alternatives.

There's always an identity-based network of trust: several existing members vouch for new people before they are included.
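
A minimal sketch of how such vouching could work, assuming a fixed threshold of distinct vouchers. The class name `VouchRegistry`, the `required_vouches` parameter, and the member names are hypothetical, chosen only for illustration; a real system would also need revocation and protection against colluding members.

```python
class VouchRegistry:
    """Toy web-of-trust: a candidate joins once enough distinct members vouch."""

    def __init__(self, founders, required_vouches=2):
        self.members = set(founders)      # people already trusted
        self.required = required_vouches  # hypothetical threshold
        self.pending = {}                 # candidate -> set of vouching members

    def vouch(self, member, candidate):
        """Record a vouch; return True if the candidate is (now) a member."""
        if member not in self.members:
            raise PermissionError(f"{member} is not a member and cannot vouch")
        if candidate in self.members:
            return True
        vouchers = self.pending.setdefault(candidate, set())
        vouchers.add(member)  # a set, so repeat vouches from one member count once
        if len(vouchers) >= self.required:
            self.members.add(candidate)
            del self.pending[candidate]
            return True
        return False


registry = VouchRegistry(founders={"ann", "ben"})
first = registry.vouch("ann", "cy")    # one vouch: not enough yet
repeat = registry.vouch("ann", "cy")   # same voucher again: still one vouch
second = registry.vouch("ben", "cy")   # second distinct voucher: admitted
```

Counting distinct vouchers rather than raw vouches is the design choice that matters here: it stops a single member from admitting someone alone.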

lifeisstillgood3y ago

I am unclear why a reasonable digital ID (probably government ID card style) plus rate limits is not going to be effective.

I can see lots of reasons people might oppose the idea but I am not sure why it's not a widely discussed option?

(asking honestly and openly - please don't shout!)

Andrew_nenakhov3y ago

> The obvious problem is we don’t have any great alternatives.

Of course we do. The rise of digital finance services has led to the creation of a number of services that offer the identity verification necessary for KYC. All such services offer APIs, so adding an identity verification requirement to your forum is trivial.

Of course, if it isn't obvious, I'm only half joking.

Nowado3y ago

You can do it already. It's a normal order for a copywriter, nobody will bat an eye when you post an offer. It costs cents/dollars per 1000 words instead of fraction of a cent, but that's not exactly outside of reach of a funded startup.

groestl3y ago

Next keyword: market of lemons. If you can't rely on said signals anymore, you must treat every item the same (untrusted), which drives out the legitimate players from the market. We have a lot of lemon markets, we can probably infer from them what the social result will be..

GlumWoodpecker3y ago

The scary part is that this doesn't seem too far off, with the current proliferation of large language models like the GPTs..

rzzzt3y ago

Parent was definitely not referring to these at all /s

siva73y ago

Who says this isn't already happening?

dang3y ago

If people see AI-generated comments on HN they should flag them and let us know at hn@ycombinator.com. HN is for humans to converse, and bots have never been allowed.

Of course it's not always easy to say what's AI-generated or not. But if an account is making a habit of it, it still seems possible to tell.

echelon3y ago

Reddit better hold their IPO soon or they'll get caught up in this. Pretty soon there will be dozens of different GPT/LLM-powered Reddit spam bots on Github. Some of them no doubt for political trolling. [1]

Phone, then ID-based verification is a stop gap, but IDV services will have to spin up to support the mass volume of verifying all humans.

[1] I kind of want to do this from an innocent / artistic perspective myself. Perhaps a bot that responds with a bunch of rhetorical questions or onomatopoeia. Then I'd scale it to the point people start noticing and feeling weirded out by it. "Is this the new Gen Alpha lingo?" Alas, I have too many other AI projects.

ChrisKnott3y ago

I just tried to find a FOSS tool for converting MS Outlook .pst file to .mbox.

I first tried Google; the results are dominated by commercial crap.

Then I tried the "google reddit" trick to try and find some real people's opinions... but look at all the blatantly bullshit comments on this Reddit thread; https://www.reddit.com/r/Thunderbird/comments/ae4cdg/good_ps...

---

(if anyone is wondering, the best option for Windows is to use 'readpst' command via WSL. Comes in the 'pst-utils' package).

vehemenz3y ago

Maybe more appropriately, Campbell's law:

"The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."

precompute3y ago

Now is the time to cultivate friendships and to make networks that persist online, and are verified via irl meetups / contacts. People who pull that off now will be in much, much better shape in the future. GPT's output is apparent to a discerning eye right now, but according to the power law, it won't take much "novel" input to train upon to make that discernment useless. Then, the only internet community that could be dependably reliable would be your group of irl verified people.

password43213y ago

I would phrase it more as we're pretty much out of time to have initiated online-only relationships.

moneywoes3y ago

Best methods for that? Local meetups?

vidarh3y ago

We'll be back to the 1990's "software agents" craze take two: Needing AI driven agents that seek out and index and evaluate content on our behalf, and seek to negotiate with each other for recommendations with currency being trust based on how "your" agent evaluated prior results.

I'm hoping to put an AI between me and my e-mail inbox this weekend (I had ChatGPT write most of the code; it's not much); not fully automated, but evaluating and summarising and categorising. I might extend that to e.g. give me an "algorithm" for my Mastodon timeline (despite all of the people insisting on reverse chronological, I'm at a few hundred people I follow and already can't keep up), and a number of other sites I visit. For most of these things latency does not matter, so e.g. putting them through llama.cpp rather than something faster is fine, and precision isn't critical (I won't trust it to automatically reply or automatically reject anything, only to do prioritisation and categorisation, where missteps won't have any critical impact).

soheil3y ago

Stop making up laws. You'll do much more good dismantling existing ones. And non-social signals like # of commits, # of pull requests cannot be faked? We need signals among the noise.

Sometimes signals are noise we just need to calibrate.

charlieyu13y ago

I hope it breaks the current system of requiring references in job search as well

paulcole3y ago

This system is already essentially broken. Either you worked at a large business that only gives out dates of employment and job title by policy or you are in complete control of who the hiring company talks to.

The first time you don’t get a job because of a reference you gave you learn a lesson. If it ever happens again, it’s on you.

wpietri3y ago

I mean, there have always been shills. What's changing now is the cost of shilling is dropping from dollars per comment to fractions of a cent. Troll farms used to be a lot of work to put together, but soon they'll be aaS.

Those of us who are careful internet readers have spent years developing good heuristics to use textual clues to tell us about the person behind the text. Are they smart? Are they sincere? Are they honest? Are they commenting in good faith? Those skills will soon be obsolete.

The folks at OpenAI, who are nominally on a mission to make sure AI "benefits all of humanity", have condemned us to a life sentence of fending off high-volume, high-quality bullshit. Bullshit that they are actively working to make harder to detect. And I think the first victims of that will be internet forums where text is the main signal, places like this and Reddit.

robertlagrant3y ago

Maybe we need a social network based on physical exchange of trust.

api3y ago

That’s mostly what the person to person phone system was.

iLoveOncall3y ago

> Very soon, the domain of bullshit will extend to actual text. We'll be able to buy HN comments by the thousand -- expertly wordsmithed, lucid AI comments -- and you can get them to say "this GitHub repo is the best", or "this startup is the real deal". Won't that be fun?

Definitely already the case, you really think Rust and SQLite would get more than a couple of upvotes otherwise? :D

wongarsu3y ago

Then how do you explain the Go hype HN went through just before the current rust hype? Where "[ordinary tool] in Go" was the formula for upvotes.

Then again, maybe Google had some mandatory HN time for their employees, that would be enough to explain that :D

dorian-graph3y ago

That's what Product Hunt has felt like for a long time—and LinkedIn too.

is_true3y ago

I'm sure it's already happening in the "books" threads

greesil3y ago

How do you know we aren't already there?

rwallace3y ago

This is the first time I've ever posted an XKCD link here, but I think the occasion calls for it.

https://xkcd.com/810/

einpoklum3y ago

Your comment is the best. It's the real deal!

ryan69howard3y ago

This comment summarizes it best. We need more discussion like this!

siva73y ago· 16 in thread

Is there even such a thing as a github influencer (people living just from github)?

supriyo-biswas3y ago

People working in DevRel often aggregate developer oriented content and gain popularity that way; an example would be "swyx". I'm not taking a dump on his work, but you can see the Github influencer effect over there.

rozenmd3y ago

I didn't even know Shawn had a popular GitHub, though he has written about the meta-creator ceiling before: https://www.swyx.io/meta-creator-ceiling

wodenokoto3y ago

Never heard of swyx.

Self proclaimed GitHub star. But still only 5000 followers and projects max out at 8000 stars.

I don’t know what I had expected but I think it was bigger numbers than that.

https://github.com/sw-yx

pictur3y ago

There are very few people who work like this and are non-toxic.

hoofhearted3y ago

Taylor Otwell lol.. He has some pretty dope cars in his garage and is doing well.

I follow him on GitHub, and pay for some of his products. I have been heavily influenced by his coding styles, and the tools he uses. His code just looks so tight and perfect. He writes his stuff so open ended and reusable that he basically writes a method once, and then reuses it across numerous projects.

Look at this tight code: https://github.com/laravel/framework/blob/10.x/src/Illuminat...

I’d say that Adam Wathan is rapidly growing his influence as well, and is probably doing alright too.

jorgesborges3y ago

The multiple-line comment styling is so pleasingly pathological — each descending line has a few characters less than the last.

version_five3y ago

I think it would be tough (a good thing) because how often do people go to someone's root github page, even if they have a good repo? Not to say it never happens, but github is really about the repo, not the person (again a good thing) so it would be harder for an individual to become "influential". Hopefully nobody gets any ideas.

deefour3y ago

There are plenty of people making a living from donations to their open source contributions.

It seems odd to title them influencers based on that.

blitzar3y ago

I am going to start posting linkedin influencer style "content" on my github for clout.

Hackbraten3y ago

Twenty pull requests every morning. That’s my plan for 2024.

reidjs3y ago

I have heard of people getting interviews from their GitHub profile.

ccouzens3y ago

I got my current job through GitHub.

At least that's how the 3rd party recruiter told me he found me. It's possible he was lying and thought it would impress me (it did).

My profile is more active than most, but very far from rockstar.

justinclift3y ago

Yeah. Several years ago extremely clueless recruiters used to email people heaps. Lots of people were complaining about getting tonnes of spam from them. :(

Had to change my Location (or some similar obvious field) in my GitHub profile to "Recruiters FUCK OFF" before they took the hint. ;)

Thankfully, GitHub introduced some other way to signal if you are/aren't interested in getting a job (toggle switch?) not long after, which seemed to work.

azu3y ago

https://press.stripe.com/working-in-public

The book presents similar stories.

PragmaticPulp3y ago

I’ve seen a number of resumes where people convey the popularity of their personal projects by number of stars or number of downloads.

bombolo3y ago

I guess the purpose is to find a job as evangelist and similar.

ziml773y ago· 13 in thread

I'm surprised that Github stars are valuable enough to buy. Personally I never look at the star count because even if they were legit, they don't really tell me anything more useful than I get from looking at other things in the repo.

I tend to check the age difference between the earliest and latest commits because that lets me be sure it's not a project that someone spent a couple weeks coding up, dropped on github, and then forgot about. I'll also check the issues on there. I'm looking for more closed issues than open ones, but I'll also quickly scan over them to get a rough idea of how many are truly meaningful issues. I also get signals from the readme and docs. It's not a hard pass if there's issues with those, but it's certainly helpful to my opinion if they exist and are both clear and detailed.
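
A minimal sketch of the checks described above (commit-date span and closed versus open issues), assuming the commit dates and issue counts have already been collected by hand or from an API. The function name `repo_signals` and the sample numbers are hypothetical, not taken from any real repository.

```python
from datetime import datetime


def repo_signals(commit_dates, open_issues, closed_issues):
    """Summarise two quick health checks for a repository.

    commit_dates: ISO-8601 date strings, one per commit (illustrative input).
    open_issues / closed_issues: plain issue counts.
    """
    dates = sorted(datetime.fromisoformat(d) for d in commit_dates)
    # Days between the earliest and the latest commit.
    active_days = (dates[-1] - dates[0]).days if dates else 0
    total = open_issues + closed_issues
    closed_ratio = closed_issues / total if total else 0.0
    return {
        "active_days": active_days,
        "closed_ratio": round(closed_ratio, 2),
        "more_closed_than_open": closed_issues > open_issues,
    }


signals = repo_signals(
    ["2021-01-10", "2021-06-01", "2023-03-15"],  # made-up commit dates
    open_issues=12,
    closed_issues=48,
)
print(signals)
```

Numbers like these only narrow the field; skimming the issues to judge how many are meaningful still has to be done by a person.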

_xivi3y ago

Closed issues don't mean anything though... a lot of maintainers bulk close hundreds of issues as "nofix", "no activity after 3 months", and so on. Just sweeping them under the rug. And many of them pride themselves on having 0 open issues like it means something. Any software in the world can have 0 issues if its maintainers play this game.

So unless you are really well versed in the project and spent some time following it, stars actually might be a better indicator of the project quality and reputation.

bakugo3y ago

> a lot of maintainers bulk close hundered of issues as "nofix", "no activity after 3 months", and so on

God, I hate this. Every time I have an issue with something, look it up on the issue tracker and find the exact issue I'm having autoclosed as "stale" by a fucking bot because the author didn't reply "this is still an issue" once every 24 hours, it instantly makes my blood boil and I avoid using the software in question as much as possible in the future. Nothing screams "I care more about github numbers than my users or the quality of my software" more than this.

cdiamand3y ago

I find stars helpful when I'm evaluating several different repos to choose a particular tool for a job.

If one of the repos has many more stars, I weigh that strongly when choosing. Freshness of commits is definitely important, but for me the fact that many other people starred the repo shows that there are eyeballs and activity.

Takennickname3y ago

You are likely not important enough to scam. The first people I can imagine this being shown to are VCs in pitch decks who are only going to see this on a powerpoint and not actually on github. Very unlikely the VC will check github itself to verify the number, and if they do, even less likely they'll verify that the stars are real.

You're the kind that checks everything. Even if you had something valuable, a scammer wouldn't waste their time with you when there are easier fish to bait.

ChancyChance3y ago

> dropped on github, and then forgot about.

I really wish GitHub would have some sort of flag for "stale" projects. I use your methods too (issues, dates, etc.), and I'm usually disappointed when search results bring up ghost projects. However, in a few instances, I found a project that was similar to an issue I was working on that went one step beyond where I was, and even though it was a ghost project, it helped. But in general, these projects don't help. I'm also disappointed that I'm thinking, "Hmmm, maybe LLMs can help..."

dylan6043y ago

Why is stale a bad thing? It could be something that was created to serve a purpose, developed to the point that it was feature complete for that purpose, and now requires no more development yet continues to do its purpose without modifications.

It's almost like you are thinking of it as an expiration date and the software has spoiled.

UncleEntity3y ago

I have one project on GitHub that I use all the time as part of a script and only push changes when the python API breaks it. It is essentially “finished” and usually just needs a quick compile against the new python version whenever I upgrade the distro. I haven’t even had to touch it for at least as long as GitHub has required ssh keys, so by all accounts this would be an abandoned project.

Now that I think about it — it is a python wrapper around a boost library and neither of those have made backwards incompatible changes in a long time which is quite suspicious.

TylerLives3y ago

>I tend to check the age difference between the earliest and latest commits because that lets me be sure it's not a project that someone spent a couple weeks coding up

I doubt anyone would do this, but commit date can be arbitrarily changed.

A4ET8a8uTh03y ago

Interesting, I just use them to keep track of interesting projects ( edit: not the number of stars as a proxy; stars is basically my bookmark ). People treat them as internet points?

varunjain993y ago

Metrics based on issues / commit activity are certainly higher fidelity.

As you indicate though, they require more effort to adjudicate. Are issues from core team members? Are commits meaningful? Is community activity meaningful? I wish GitHub would allow us to parse things like this more easily.

My use of star count is generally a binary indicator. 1k+ is probably a legit project and below is probably still early. Beyond that, it's probably too noisy.

version_five3y ago

I'll admit I've used them. In particular, I've used paperswithcode to find implementations of ML models. There are often a number of implementations of the same model, and the quality is highly variable. I've used stars (which paperswithcode displays) as a pre-screen. Spoiler alert, the highest starred implementations are not always the best. But it still helps to triage, as a proxy for how well used it is.

loeg3y ago

I mean based on the number of repos they identified buying stars and prices advertised, the revenue just doesn’t make sense. The sellers have made like, hundreds of dollars at most. How much effort have they invested for this meager return?

renewiltord3y ago

Displaying stars to represent traction in open source was a pitch deck phenomenon that was highly effective fitting the ZIRP.

coolsank3y ago· 8 in thread

Is it just me, or does the fact that Dagster has one of their competitors, Mage.ai, listed here as a repo with around 15% fake stars seem like an odd coincidence?

bart_spoon3y ago

It’s possible that was the impetus of the blog post. Maybe they suspect Mage.ai of astroturfing GitHub stars and investigate it as above. They then publish a blog post that:

1. Indicates the astroturfing without actually specifically calling them out
2. Does so in a way where others can verify their work and use it on other repos
3. Uses their product to do so

Seems pretty brilliant to me.

janalsncm3y ago

If you’re going to accuse a competitor of fraud, writing a blog post showing your work seems like the most safe way to do it. People lie with statistics all the time of course.

TheDong3y ago

I mean, they explain it at the top:

> we track our own GitHub star count along with that of other projects. So when we spotted some new open-source projects suddenly racking up hundreds of stars a week, we were impressed. In some cases, it looked a bit too good to be true, and the patterns seemed off

If their competitor has fake-looking star counts, I'd expect them to be the ones best equipped and most likely to suspect it.
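
One rough way to spot "hundreds of stars a week" appearing out of nowhere is to compare each week's new stars with the project's own history. This is only an illustrative sketch, not the method the blog post uses; the function name `flag_star_bursts`, the `factor` and `min_history` parameters, and the weekly counts are all assumptions.

```python
from statistics import median


def flag_star_bursts(weekly_new_stars, factor=5.0, min_history=4):
    """Return indexes of weeks whose new-star count exceeds
    `factor` times the median of all earlier weeks."""
    flagged = []
    for week, count in enumerate(weekly_new_stars):
        history = weekly_new_stars[:week]
        if len(history) < min_history:
            continue  # not enough earlier weeks to form a baseline
        baseline = max(median(history), 1)  # avoid a zero baseline
        if count > factor * baseline:
            flagged.append(week)
    return flagged


weekly = [12, 9, 15, 11, 14, 300, 280, 13]  # made-up weekly star gains
bursts = flag_star_bursts(weekly)
print(bursts)
```

A flagged week is only a prompt to look closer: a legitimate launch or a front-page post produces the same shape.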

speedgoose3y ago

They don’t mention what I think is their biggest competitor: Prefect.

frasermarlow3y ago

[Blogpost author here] We ran the numbers for Prefect and several other repos in our space and they came out clean. As we note in the article, while some repos game the system, from what we can tell the number of abusers is actually fairly small.

say_it_as_it_is3y ago

It shouldn't be a surprise. Why are you surprised? Do you often pursue random activities irrelevant to your life for dozens of hours?

coolsank3y ago

Yes I do.

zeroonetwothree3y ago

Pretty standard for anyone ND

saurik3y ago· 8 in thread

> Yet [GitHub stars] influence serious, high stakes decisions, including which projects get used by enterprises, which startups get funded, and which companies talented professionals join.

Really? I honestly just don't believe this... if I were to believe this, I think I'd have to conclude the world is just too broken to bother rescuing.

philbo3y ago

One of my stock interview questions asks people how they evaluate 3rd-party dependencies for use in a production environment. So many interviewees respond with GitHub stars as their main or only criterion. It depresses me every time.

throw_away15253y ago

That's a very interesting question. There are so many things you can look at. How is the documentation? Who are the primary maintainers? How are they funded? What are their motivations? Are the primary maintainers active on Stack Overflow, Reddit, Discord, etc...? How many contributors are there? How does their Github issues page look? What about the Github discussion page? How many maintainers are there total? How many downloads per week on NPM (for JS libraries)? From all of these things - how long do you expect this library to be maintained? And that's just the initial qualification research, nothing about how it will impact the actual code-base.

What did I miss? What's the best answer you've ever heard? How do you evaluate 3rd party dependencies?

kaeructOP3y ago

What kind of answer would make you happy?

I prefer to look at the recent commits, or any recent activity on the repo's issues, but I would like to know what else can be used as an indicator.

tasuki3y ago

It depresses me too, but what else can you do? I check what the docs look like, but if I'm to depend on a thing I'd rather choose something popular than unpopular. GitHub stars, Hackage downloads, StackShare... what else can one check?

ZephyrBlu3y ago

People use flawed but easily consumable metrics to make almost all decisions.

It takes a lot more effort to collect multiple metrics along different axes, understand the skew/bias of them and make an informed decision.

Visibility and ease of consumption are the most important aspects of a metric if you want people to use it.

saurik3y ago

The list in the article, though, was carefully selected to presume competent people doing the decision-making. I totally believe many people use that star count for something... but an "enterprise"? someone investing a non-trivial amount of money? a specifically-"talented professional"? I just find that really difficult to believe. I've sold software to enterprise, I've worked with a number of venture capital funds, and I know a ton of actually-talented professionals... I dare say most of them consider GitHub's social features to be a joke.

The enterprises I deal with cared almost exclusively about stuff like license choices, support contract options, and "invoice billing" ;P. The vetting process I've dealt with at VCs was intense, having worked both sides of that situation; and I know multiple people who have worked data science jobs at such firms to try to better select investments. As for a "talented professional", I can pretty much guarantee they are going to look at your codebase, not the number of stars it has, while they evaluate any number of more reasonable things to judge an opportunity on (commute, pay, management style, etc.). A key property of competent deciders is that they aren't using trivial metrics.

rossmohax3y ago

More than once I've seen the number of stars used as an argument when deciding whether to pull in a dependency or write our own.

derivagral3y ago

Activity on other sites related to finance/coding is similar (seekingalpha likes, for example) and I've gotten organic inbound requests for work periodically scraping such info into... Excel.

penguin_booze3y ago· 6 in thread

My ex-employer used Github stars in their job description and during recruitment pitches. They regularly encouraged employees to go and star the firm's repos in Github. In all-hands meetings, the Github stars were one of the items they reported: "we've surpassed X in Github stars" (applause).

(The firm X, however, is a more well-known name than my ex-employer was).

A while ago, I listened to a Freakonomics episode where it was discussed that businesses use proxies to both boost their image and to cover up their incompetency. The example was that a lot of businesses chose fancy names starting with A (like, AAA plumbers), so that they get listed first in business directories. These firms were later proven to be very incompetent and/or even fraudulent.

The relevant paper, also cited in the episode, was "A Business by Any Other Name": https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1667550.

goodoldneon3y ago

They were incompetent because they didn’t have enough As. I exclusively use AAAAAAAAAAAA Plumbers

photochemsyn3y ago

Apparently seven is the sweet spot for visual recognition at a glance, so I'd go with AAAAAAA Plumbers instead.

lobocinza3y ago

The tech version of this is SaaS companies advertising on Reddit.

karakanb3y ago

do you mind elaborating on this? I am using Reddit to advertise some of my projects because it seems like a relevant crowd to advertise to, but I am curious to hear how it would be perceived.

moneywoes3y ago

Podcast episode name please

shagie3y ago

Not sure if this is it, but 552. Is Google Getting Worse has the 'AAA Plumbers' in it.

https://freakonomics.com/podcast/is-google-getting-worse/

debarshri3y ago· 5 in thread

When evaluating an OSS project, the key indicator is community activity. Github stars are a weak community activity indicator. Firstly, as shown in the article, they can be gamed. Also, starring is a very low-threshold action, so it does not indicate whether the person who starred the project will actually use it.

I think two great community activity indicators are Github issues and slack/discord/discourse comments. One key thing with Github issues, in my opinion, is that if the issues are mostly opened by the core team, it is not a great sign. You want a large mix of issues from customers or users and not from the team. This is a good indicator of whether the project is solving a real problem or not. The same goes for the slack comments: they should have both volume and freshness.
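
The core-team check above can be expressed as a single ratio. A minimal sketch, assuming you already have the list of issue authors and know who is on the core team; the function name `outside_issue_share` and the sample handles are hypothetical.

```python
def outside_issue_share(issue_authors, core_team):
    """Fraction of issues opened by people outside the core team."""
    if not issue_authors:
        return 0.0
    core = {name.lower() for name in core_team}  # compare handles case-insensitively
    outside = sum(1 for author in issue_authors if author.lower() not in core)
    return outside / len(issue_authors)


authors = ["alice", "bob", "carol", "Alice", "dave"]  # one entry per issue
share = outside_issue_share(authors, core_team={"alice"})
print(f"{share:.0%} of issues come from outside the core team")
```

A low share suggests the tracker is mostly the team talking to itself, which is the warning sign described above.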

jmclnx3y ago

I think checking if people donate to a project is a better indicator of the value of the project than the stars. I never paid attention to stars.

debarshri3y ago

I don't think you can externally measure how much money is being donated to an OSS project, can you?

boxed3y ago

Donating to yourself would be pretty cheap...

debarshri3y ago

But there are OSS projects that are VC backed. They don't take donations.

eternalban3y ago

Pretty sure those who game their repo are motivated by investment into an associated startup. I think you are right that community activity is a high fidelity indicator, and a smart investor in OSS startups should definitely not only lurk in the community but, if possible, actually have resources to kick the project tires as well.

In a very strange way (but reflective of the economic regime) a startup that fakes stars vs a straight-arrow startup that doesn't is demonstrating a key element for success in business, which seems to require a significant element of bullshitting, and outright deceiving. The mantra has been that "grow grow grow" is the only guideline for success. Inflating your stars is just rookie hour practice for bigger better growth b.s. down the line.

bdcravens3y ago· 5 in thread

Sounds like they take it more seriously than Google takes likes on YouTube. A competitor had a video that rapidly had over 100k likes - but if you looked at the total time played, each view averaged to just a couple of seconds on a video over 10 minutes. Reported it, but nothing came of it. (No, not something we regularly do. I think it may be the only video I've ever reported; just want a fair playing field)

oefrha3y ago

> if you looked at the total time played, each view averaged to just a couple of seconds on a video over 10 minutes.

That makes no sense to me. Speaking as someone who has been using YouTube Data API v3 and YouTube Analytics API v2 for many years, estimated minutes watched of a video shouldn’t be public info. So how can you “look at the total time played” on a competitor’s video?

bdcravens3y ago

Been a few years; I don't recall the how. Maybe I'm thinking of a different platform?

dylan6043y ago

youtube competitor. that's just funny to me. kind of even comes across as petty. you took however much time to investigate average viewed time of a competitor and then cried to daddy about the perceived slight in "advantage" instead of taking that time to improve your competing product to make it better.

UncleEntity3y ago

Umm…

I think you have it backwards, the other video was using fake likes to avoid having to improve their quality to get an equal number of eyeballs.

1 more reply

bdcravens3y ago

No, we had someone show up out of the blue, with no established presence in the space, with a video with hundreds of thousands of views. Was curious how they were so viral so fast.

Overall, it's bad for everyone if someone can create fraudulent views: us, other companies, and most importantly, consumers.

> taking that time to improve your competing product to make it better.

Took less than 3 minutes to do the math and send the report. I'm a fast developer, but I can't improve our product that fast :-)

tpoacher3y ago· 4 in thread

I have moved all my repositories to sourcehut. They are generally mirrored by a github repository consisting of a single README file explaining the new location for the project, and my reasons for the migration.

However, sourcehut eschews the use of such "social metrics" (I agree with the principle behind that at some level, but I also appreciate the value of being able to give visibility to good projects), so I usually mention in my README that "If you like the project and wish to promote it, feel free to star this github page".

I'm sure github probably wouldn't like this use-case, but the stars would certainly be genuine, even if possibly quite dodgy-looking.

wakeupcall3y ago

I have moved repositories off github, replaced the README with a warning and the new location and archived the project.

It's still getting starred...

leeoniya3y ago

> It's still getting starred...

clearly you did too good a job on the README

1 more reply

pbronez3y ago

I’m conflicted about this. Sourcehut, Codeberg, etc are great. But having everything I’m looking for on GitHub is extremely convenient. I use the “Add to List” function extensively for bookmarking.

tpoacher3y ago

Yes, this is why I didn't want to migrate without leaving a trace on github. The redirecting README on github is a good compromise, I think.

Having said that, it may be worth thinking what is the price we may be paying as a community for this convenience, btw. MS Github is clearly already past the "embrace" phase, and well into the "extend" phase.

1 more reply

yla923y ago· 4 in thread

TIL: you can buy (fake) GitHub stars.

That was a bit shocking to me to learn.

Springtime3y ago

After that post on HN months ago[1] where users discovered OAuth permissions for unrelated things being used/abused to star projects without their knowledge this news of buying stars didn't come as a surprise.

It's unfortunate as I've seen stars used as a metric of trustworthiness in general user discussions.

[1] https://news.ycombinator.com/item?id=33917962

astura3y ago

You can buy Twitter followers, Instagram followers, YouTube views, Amazon reviews, Reddit upvotes, Reddit comments, and Yelp reviews - so what's so shocking about GitHub stars?

quickthrower23y ago

/s ??

I always expected there was a market for fake stars. I am trying to get a repo naturally to 1000 stars, but I would never buy them.

sorokod3y ago

Can you explain why it is "natural" to try to get your repo to have many stars in a world where stars can be bought?

3 more replies

newmac3y ago· 3 in thread

It is worth noting that it is trivial to buy fake stars for a project you are not affiliated with. The reason someone might do this would be to "test" the purchasing of fake stars without risking contaminating their own project.

nvr2193y ago

I once bought a friend of mine a thousand Twitter followers as a prank. He wasn't happy.

i_am_toaster3y ago

As he should be, that wasn’t a well thought through prank.

1 more reply

moneywoes3y ago

Where did you purchase that?

1 more reply

thih93y ago· 2 in thread

> In spam detection, we often use heuristics in conjunction with machine learning to identify spammers.

Heuristics can only be used to identify suspected spammers. Not everyone who behaves like a spammer is a spammer; it could be, e.g., a random user with privacy settings on, or someone who didn’t update their bio in a while and it got affected by link rot, etc.

Even if a group of low activity accounts stars the same projects, it could be that the account owners just discuss these projects elsewhere.

GlumWoodpecker3y ago

The article notes this, and like any spam detection method, it has a degree of false positives, but it seems very low (less than a percent according to the article). I'm sure an official implementation of this could take more internal, non-public factors into account, like IP addresses and clustering of account creation times, to make it even more accurate and drastically reduce the number of spam users.

andreareina3y ago

The claim I saw in the article is 98% precision. Which doesn't actually tell us the predictive value without the base rate which seems to be all over the place.
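To make the base-rate point concrete, here is a rough sketch of how the precision of one fixed detector shifts with the share of fake stars in a repo. The recall and false-positive rate below are made-up illustrative numbers, not figures from the article, and `precision` is just a helper name chosen for this example.

```python
def precision(base_rate, recall, false_positive_rate):
    """Share of flagged accounts that are really fake (Bayes' rule).

    base_rate: fraction of stargazers that are fake (assumed).
    recall: fraction of fake accounts the detector flags (assumed).
    false_positive_rate: fraction of real accounts wrongly flagged (assumed).
    """
    true_positives = base_rate * recall
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Same hypothetical detector, three different repos.
for base_rate in (0.5, 0.1, 0.01):
    value = precision(base_rate, recall=0.9, false_positive_rate=0.02)
    print(f"{base_rate:.0%} fake stars: precision {value:.3f}")
```

With these assumed rates, precision falls steadily as fake stars get rarer, so a precision measured on one sample does not carry over to repos where the mix of real and fake accounts is very different.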

woodruffw3y ago· 2 in thread

Things like this are part of why I cringe when I see supply chain analysis/security companies include “popularity” in their criticality metrics: the relationship between public popularity signals (like GitHub stars) and criticality is weak, at best.

andrewmcwatters3y ago

In my experience, it's actually a great signal. That's why so many people rely on it. The distribution of GitHub stars is an extreme power law.[1] Stargazer thresholds are used by maintainers to make decisions on including projects for different purposes from dependency management to package manager maintainers deciding to list software by name.[2]

[1]: https://github.com/andrewmcwattersandco/github-statistics

[2]: https://github.com/Homebrew/brew/blob/master/docs/Acceptable...

woodruffw3y ago

Selection suitability and criticality are different metrics. The former is what Homebrew uses, as a way to lessen maintainer load and prevent inclusion in Homebrew becoming its own quality signal. The latter is what I’ve seen supply chain companies provide: an implication that a project is somehow critical or essential to the overall ecosystem because it has so-and-so many stars.

That first use is not unreasonable, in my opinion. The second one is questionable, at best.

amsterdorn3y ago· 2 in thread

GitHub is fully aware of these, would they consider something like a "confirmed" star count that subtracts the suspicious/fake number? Or is that too much of a slippery slope.

mapmeld3y ago

GitHub gradually removes these users as they catch up to them, so not helpful to have extra steps. I have a couple of repos which were briefly popular, so when a new user stars it today, and I see 1000s of other stars, it's suspicious and I get a peek into their world.

There are obvious numeric usernames, but also fake orgs with repos for the users to fork and interact with, and a few account takeovers (i.e. someone had signed up for GitHub in 2015 to make a free wedding website, abandoned it, and the account fell into spammer hands). These used to be easier to report.

Azadzadeh3y ago

>GitHub gradually removes these users as they catch up to them

With collaterals too I presume [1]. I guess I've been the victim of some automated system. They have banned my account without warning or explanation and they've been ignoring my support tickets for about 2 months!

[1]: https://news.ycombinator.com/item?id=34817163

1 more reply

lessname3y ago· 2 in thread

How did you find out the name of the company behind GitHub24 though? If I go to their website I cannot see it, and I cannot even find anything if I search for the company name.

gerogerke3y ago

I was also surprised when I saw it. A GbR is a German "Gesellschaft bürgerlichen Rechts" which does not need to be formally incorporated and offers no limited liability. The name needs to include the names of all partners, so we can deduce it is being run by two persons. I am quite surprised they do this without liability protection. Upon googling, I found only a playlist on YouTube which has this name and contains one explainer video about signing up a company with German tax authorities. If they are indeed based in Germany, they're required to have an Impressum / imprint on their home-page, without it, they risk being fined.

cyberia234243y ago

Perhaps they got it via payment info

NiloCK3y ago· 1 in thread

I have a half-written article about this, but I didn't have any good notion about quantifying the problem so this article is very welcome info to me.

My own angle is that copilot has shifted the incentives around this practice, maybe substantially. Businesses want to get (free tiers of) their paid SaaS endpoints into copilot suggestions - it's a great funnel!

I'd guess that github is as likely as not to become an SEO spam battlefield (like the rest of the web).

UncleEntity3y ago

> Businesses want to get (free tiers of) their paid SaaS endpoints into copilot suggestions - it's a great funnel!

That’s so brilliantly evil…

I can see the next generation of “how I got to $3m in passive income” articles being written (by ChatGPT) right now.

lozenge3y ago· 1 in thread

The projects with suspicious stars were still >80% nonfake stars. That to me suggests that most of the fake stars have been classified as nonfake. There isn't much psychological value in boosting your star count by just 25%.

bart_spoon3y ago

Depends on when the fake stars were created. If they are early in a project's life cycle, they may be used to get attention on the project, and once they have awareness, fake stars are no longer necessary.

JaDogg3y ago· 1 in thread

Just use Show HN & Reddit.

Ralfp3y ago

Those never worked for me.

Show HN: there are maybe dozens of those posted every day but they rarely hit the main page.

A Reddit ad is great to kick off star growth, but unless you have something interesting to many people, don’t expect more than 50 stars on the first day and then a plateau of a star every few days.

Most GH stars I’ve gotten were from somebody mentioning my project in a comment in some heated discussion on HN. So I guess drama sells?

PragmaticPulp3y ago· 1 in thread

> And if you enjoy this article, head on over to the Dagster repo and give us a real GitHub star!

Kind of ironic that they’re using blog articles and social media to pander for more stars on their GitHub project.

pythonguython3y ago

I wouldn’t describe that as ironic.

optimalsolver3y ago· 1 in thread

What's the street value?

robin_reala3y ago

It’s in the article.

sgammon3y ago· 1 in thread

this shouldn't be posted with links to the actual places to buy stars.... that seems like a bad idea?

lessname3y ago

Why? You can find these websites anyway if you search for terms like "buy github stars"

toastal3y ago

Maybe our code forges don't need to be social media platforms. These 'stars' have pretty dubious value and rarely correlate with code quality or importance (core libraries generally have less attention than apps or tools). There's also a heavy language skew where JavaScript and Python libraries & programs get way more thumbs-ups even when they're technically not any better than alternatives.

franciscop3y ago

I wrote on this topic a while ago; experimenting, I found out you can basically change the repos' names and keep the stars. This wouldn't work if you use the repo as an issue tracker or PR tracker, since the history would all be broken, but if it's pretty much just the code it's easy to swap the star count between two repos:

https://francisco.io/blog/transferring-github-stars/

sacnoradhq3y ago

The next thing in social media vending machines.

https://twitter.com/Alexey__Kovalev/status/87184200877156761...

thewizl3y ago

As a note, GitHub stars are often used in pitch decks for OSS startups. VCs seem to care about that, judging from what I’ve seen around.

precompute3y ago

This sort of gamification exists only because there are too many green engineers that only care about their salaries, and they mimic what people successfully recruited by FAANG (etc.) did, and so do other companies. Then this purity spirals into taking the entire field down because there's no one around to educate the new newbies. Facebook was IMO a step in the right direction because it was a "general" social network, you could post anything. Imagine if FB had released some sort of an "extension" that allowed you to share anything via a template of sorts, instead of having to type out everything in the same old text post. It would have been meta enough (sorry) to not spiral very quickly.

Leaving the arena is the only viable option. Software projects that aren't dependent on github drive their own vehicle, everyone else is on a crowded bus.

Der_Einzige3y ago

I wrote a tiny tool which calculates the "brightness" score of a github repo based on calculating the total star count of the people who starred your repo. It will automatically detect these kinds of scams (assuming that it's mostly low star bots giving the stars).

https://github.com/Hellisotherpeople/Bright

Edit: I love clustering, I really do, but I think that techniques like the one I am using are far superior to unsupervised learning for trying to detect fake accounts in this context.
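For readers who want the idea in code, here is a minimal sketch of a "brightness"-style score as described above. It is not the actual code from the linked tool: the function names are hypothetical, and the input is assumed to be a precomputed mapping from each stargazer's username to the total stars their own repos have received (gathering that from GitHub is left out).

```python
def brightness(stargazer_star_totals):
    """Sum the stars earned by every account that starred the repo.

    stargazer_star_totals: dict of username -> total stars on that
    user's own repositories (assumed to be collected beforehand).
    """
    return sum(stargazer_star_totals.values())


def zero_star_share(stargazer_star_totals):
    """Fraction of stargazers whose own repos have no stars at all."""
    if not stargazer_star_totals:
        return 0.0
    zero = sum(1 for total in stargazer_star_totals.values() if total == 0)
    return zero / len(stargazer_star_totals)


# Hypothetical data: two established users and three empty accounts.
stargazers = {"alice": 420, "bob": 35, "user1001": 0, "user1002": 0, "user1003": 0}
print(brightness(stargazers), zero_star_share(stargazers))
```

A repo whose stargazers are mostly zero-star accounts scores low on the first number and high on the second, which is the pattern the comment expects from purchased stars. Plenty of real users have no starred repos of their own, so this is a hint, not proof.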

rootsudo3y ago

This is a great article, I've developed the same tactics for other projects but never was able to graft the proper vernacular. It really helps tackling how to organize and present information.

I wonder if this is also in general OSINT or ISC^2 training - everything this article showed for breadcrumb trails and reverse operations (e.g. pay a company to do the work, see how it is, evaluate the results, see if you can find other work similar/akin to it).

Xeoncross3y ago

Rabbit trail: I accidentally right-clicked on their home icon and it brought up their branding page with license agreements for their IP. Really neat idea.

newmac3y ago

I think the most interesting thing would be to run this test against the list of Launch HNs, sorted by votes, grouped by class.

malshe3y ago

I give a GitHub star as a bookmark for the repo, so I assumed that others might be using it the same way too.

_8j503y ago

I didn't know people used stars to make decisions. For me it is more like HN karma points. I use a project's issue history/PR history to get an idea of how good or bad it is.

dnchdnd3y ago

only vaguely related - but I've been recently trying out dagster and I'm pretty impressed so far. I've run large scale data-processing from Hadoop onwards and was expecting the usual crumminess whenever you hit an edge case.

Instead I found a system that seems to be thoughtfully designed and, crucially, easy to debug.

erlend_sh3y ago

Great post, though I was low-key hoping for a top 10 or maybe top 100 ranking of most starred juiced-up repos.

Kalanos3y ago

do streamlit

bdcravens3y ago· 5 in thread

oefrha3y ago

> if you looked at the total time played, each view averaged to just a couple of seconds on a video over 10 minutes.

bdcravens3y ago

Been a few years; I don't recall the how. Maybe I'm thinking of a different platform?

dylan6043y ago

UncleEntity3y ago

Umm…

I think you have it backwards, the other video was using fake likes to avoid having to improve their quality to get an equal number of eyeballs.

1 more reply

bdcravens3y ago

No, we had someone show up out the blue, with no established presence in the space, with a video with hundreds of thousands of views. Was curious how they were so viral so fast.

Overall, it's bad for everyone if someone can create fraudulent views: us, other companies, and most importantly, consumers.

> taking that time to improve your competing product to make it better.

Took less than 3 minutes to do the math and send the report. I'm a fast developer, but I can't improve our product that fast :-)

tpoacher3y ago· 4 in thread

I'm sure github probably wouldn't like this use-case, but the stars would certainly be genuine, even if possibly quite dodgy-looking.

wakeupcall3y ago

I have moved repositories off github, replaced the README with a warning and the new location and archived the project.

It's still getting starred...

leeoniya3y ago

> It's still getting starred...

clearly you did too good a job on the README

1 more reply

pbronez3y ago

tpoacher3y ago

Yes, this is why I didn't want to migrate without leaving a trace on github. The redirecting README on github is a good compromise, I think.

1 more reply

yla923y ago· 4 in thread

TIL: you can buy (fake) GitHub stars.

That was a bit shocking to me to learn.

Springtime3y ago

It's unfortunate as I've seen stars used as a metric of trustworthiness in general user discussions.

[1] https://news.ycombinator.com/item?id=33917962

astura3y ago

You can buy Twitter followers, Instagram followers, YouTube views, Amazon reviews, Reddit upvotes, Reddit comments, and Yelp reviews - so what's so shocking about GitHub stars?

quickthrower23y ago

/s ??

I always expected there was a market for fake stars. I am trying to get a repo naturally to 1000 stars, but I would never buy them.

sorokod3y ago

Can you explain why is it "natural" to try to get your repo to have many stars in a world where starts can be bought?

3 more replies

newmac3y ago· 3 in thread

nvr2193y ago

I once bought a friend of mine a thousand Twitter followers as a prank. He wasn't happy.

i_am_toaster3y ago

As he should be, that wasn’t a well thought through prank.

1 more reply

moneywoes3y ago

Where did you purchase that?

1 more reply

thih93y ago· 2 in thread

> In spam detection, we often use heuristics in conjunction with machine learning to identify spammers.

Even if a group of low activity accounts stars the same projects, it could be that the account owners just discuss these projects elsewhere.

GlumWoodpecker3y ago

andreareina3y ago

The claim I saw in the article is 98% precision. Which doesn't actually tell us the predictive value without the base rate which seems to be all over the place.

woodruffw3y ago· 2 in thread

andrewmcwatters3y ago

[1]: https://github.com/andrewmcwattersandco/github-statistics

[2]: https://github.com/Homebrew/brew/blob/master/docs/Acceptable...

woodruffw3y ago

That first use is not unreasonable, in my opinion. The second one is questionable, at best.

amsterdorn3y ago· 2 in thread

If GitHub is fully aware of these, would they consider something like a "confirmed" star count that subtracts the suspicious/fake number? Or is that too much of a slippery slope?

mapmeld3y ago

Azadzadeh3y ago

>GitHub gradually removes these users as they catch up to them

[1]: https://news.ycombinator.com/item?id=34817163

lessname3y ago· 2 in thread

How did you find out the name of the company behind GitHub24, though? If I go to their website I cannot see it, and I cannot even find anything if I search for the company name.

gerogerke3y ago

cyberia234243y ago

Perhaps they got it via payment info

NiloCK3y ago· 1 in thread

I have a half-written article about this, but I didn't have any good idea of how to quantify the problem, so this article is very welcome info to me.

I'd guess that GitHub is as likely as not to become an SEO spam battlefield (like the rest of the web).

UncleEntity3y ago

> Businesses want to get (free tiers of) their paid SaaS endpoints into copilot suggestions - it's a great funnel!

That’s so brilliantly evil…

I can see the next generation of “how I got to $3m in passive income” articles being written (by ChatGPT) right now.

lozenge3y ago· 1 in thread

bart_spoon3y ago

JaDogg3y ago· 1 in thread

Just use Show HN & Reddit.

Ralfp3y ago

Those never worked for me.

Show HN: there are maybe dozens of those posted every day, but they rarely hit the main page.

A Reddit ad is great for kicking off star growth, but unless you have something interesting to many people, don’t expect more than 50 stars on the first day, followed by a plateau of one star every few days.

Most of the GH stars I’ve gotten came from somebody mentioning my project in a comment in some heated discussion on HN. So I guess drama sells?

PragmaticPulp3y ago· 1 in thread

> And if you enjoy this article, head on over to the Dagster repo and give us a real GitHub star!

Kind of ironic that they’re using blog articles and social media to pander for more stars on their GitHub project.

pythonguython3y ago

I wouldn’t describe that as ironic.

optimalsolver3y ago· 1 in thread

What's the street value?

robin_reala3y ago

It’s in the article.

sgammon3y ago· 1 in thread

this shouldn't be posted with links to the actual places to buy stars.... that seems like a bad idea?

lessname3y ago

Why? You can find these websites anyway if you search for terms like "buy github stars"

toastal3y ago

franciscop3y ago

https://francisco.io/blog/transferring-github-stars/

sacnoradhq3y ago

The next thing in social media vending machines.

https://twitter.com/Alexey__Kovalev/status/87184200877156761...

thewizl3y ago

As a note, GitHub stars are often used in pitch decks for OSS startups. VCs seem to care about that, judging from what I’ve seen around.

precompute3y ago

Leaving the arena is the only viable option. Software projects that aren't dependent on github drive their own vehicle, everyone else is on a crowded bus.

Der_Einzige3y ago

https://github.com/Hellisotherpeople/Bright

Edit: I love clustering, I really do, but I think that techniques like the one I am using are far superior to unsupervised learning for trying to detect fake accounts in this context.

rootsudo3y ago

This is a great article. I've developed the same tactics for other projects but was never able to craft the proper vernacular. It really helps with tackling how to organize and present information.

Xeoncross3y ago

Rabbit trail: I accidentally right-clicked on their home icon and it brought up their branding page with license agreements for their IP. Really neat idea.

newmac3y ago

I think the most interesting thing would be to run this test against the list of Launch HNs, sorted by votes, grouped by class.

malshe3y ago

I give a GitHub star as a bookmark for the repo, so I assumed that others might be using it the same way too.

_8j503y ago

I didn't know people used stars to make decisions. For me it is more like HN karma points. I use a project's issue history and PR history to get an idea of how good or bad it is.

dnchdnd3y ago

Instead I found a system that seems to be thoughtfully designed and, crucially, easy to debug.

erlend_sh3y ago

Great post, though I was low-key hoping for a top 10 or maybe top 100 ranking of most starred juiced-up repos.

Kalanos3y ago

do streamlit
