YouTube-DL Gitlab Backup Repository

(gitlab.com)

226 points | joaogfarias | 5y ago | 36 comments

_bxe2 5y ago

Gitlab is not immune to DMCA notices either since it's a US company. The RIAA probably won't try to take it down unless the youtube-dl development is actually relocated there or they become aware of it somehow. You'd have to host it in a country where DMCA can safely be ignored (anonymously of course), e.g. the Netherlands or Russia, if you want it to stay up reliably.

INTPenis 5y ago

So true.

But to me, as a long-time Gitlab fan, this shows the power of using Gitlab. If they had been using Gitlab when the DMCA hit, they could have easily migrated to their own Gitlab instance.

With Github, you would have to export things into another format. I'm sure people have worked on exporting data from Github to Gitlab, but it seems much easier to just use Gitlab.

Edit: And of course, besides data, you can't migrate contributors as easily. That's also where Gitlab is powerful, because it's open source and people are actually discussing federation in it. I doubt Github will ever get any kind of federation outside of Microsoft services.

qwertox 5y ago

> could have easily migrated to their own Gitlab instance

Not sure if I would do that. Wouldn't that make me directly a target for the companies issuing the DMCA?

INTPenis 5y ago

You're always a target of DMCA takedown requests, but having the freedom to host yourself anywhere with federation to other instances would give you the freedom to ignore it.

corobo 5y ago

You're already a direct target of the DMCA if you control the code. They go through GitHub for ease.

Yes, if you host this anywhere without safe harbour protection, you're going to get the DMCA takedown directly.

input_sh 5y ago

Gitee.com is a popular Chinese alternative. I'm pretty sure it'd be safe there.

yorwba 5y ago

China has extremely strict anti-scraping laws [1] so pointing out that youtube-dl is a tool that can be used to illegally scrape copyrighted data is likely to have the same effect as a DMCA takedown request with the same justification.

[1] a collection of cases: https://github.com/HiddenStrawberry/Crawler_Illegal_Cases_In...

oefrha 5y ago

> China has extremely strict anti-scraping laws

It might be the case, but the repo you linked doesn't support that claim very well, and the cases cited are largely irrelevant to the case at hand.

"Forbidden area #1: providing scraping-related services to criminal organizations". Three cases listed. The first is one programmer's personal account of being arrested, which is very scant on why; the only info I can glean: "I developed some sort of ML API which is then used by a criminal enterprise against some influential company for god knows what purpose". Hard to draw any conclusion from that. The second case is ML-based CAPTCHA bypass for credential stuffing against Tencent QQ, emphasis on credential stuffing. The third case is some sort of black hat SEO campaign against Baidu, the scraping part (if any?) doesn't seem central to the conviction.

"Forbidden area #2: scraping and sale of personal info". Common sense, irrelevant.

"Forbidden area #3: commercial use of unlicensed business data" and the following untitled category list three cases, all of which are mass scraping operations either from a business competitor or that seriously affects site operations (through aggressive scraping).

AFAIK there is a lot of low-hanging fruit in the Chinese piracy scene not yet targeted, and there are enough small-time commercial operations involving copyrighted media products begging to be taken down. It's highly unlikely anyone will bother to target some high-barrier-of-entry tool that mostly facilitates the download of otherwise public videos.

1 more reply

stevefan1999 5y ago

I don't think so. Youtube-DL can rip some popular Chinese VOD websites such as Bilibili and iQIYI, so it would first be "subjectable" to the Chinese equivalent of DMCA.

Worse, these companies are often backed or protected by powerful figures (often under the umbrella of some influential state members), and under such power these kinds of "threats" will be eradicated very efficiently, especially with Bilibili, which is partially "state-monitored" (not sure if sponsored) because the official youth branch of the CCP has a channel there [1].

Thus, relocating your repository to gangs isn't safe at all.

A rule of thumb: The prerequisite for your business to survive in China is that you need to have powerful affiliates, and make sure you do not piss off that powerful guy.

[1]: https://space.bilibili.com/20165629?from=search&seid=1277417...

DarthGhandi 5y ago

It's hilarious how people on this site react at the mere mention of China on any topic. Everyone is apparently an expert.

Gitee has hosted a youtube-dl mirror for years.

https://gitee.com/mirrors/youtube-dlg

2 more replies

wener 5y ago

You need to give your phone number to use their service; it's China.

m-p-3 5y ago

I made a read-only backup on IPFS in the likely case it gets DMCA'd.

git clone https://ipfs.io/ipfs/QmVJ6BtoavbWRJwWH8JmTd5Bf6i3zEzsecnBKTM...

nextaccountic 5y ago

Does it include issues and open PRs? I think this is the most concerning part about Github: when you move somewhere else you lose important parts of the project's history (the parts that document known bugs, how features came to be, etc.), unless you specifically step up and use some import tool.

There are Git-backed distributed issue trackers like git-bug https://github.com/MichaelMure/git-bug (and probably other tools) that should be used more.

One could perhaps convert Github issues to git-bug and store on a branch of this IPFS Git repository.

michaelmure 5y ago

It's the whole point of git-bug: breaking the vendor lock-in.

There is a Github bridge that will import the issues (including the complete history). PRs are not supported yet though.

mikece 5y ago

While this situation is a reason to look more closely at GitLab due to the ability to host your own instance, why not also look at a solution like Fossil: DVCS, wiki, and issue tracker all in a cross-platform, single executable file?

https://fossil-scm.org/home/doc/trunk/www/index.wiki

edjrage 5y ago

Also created by the same person as SQLite. The repo itself is a SQLite database. It's an alternative to git, not just to GitHub:

https://www.fossil-scm.org/home/doc/trunk/www/fossil-v-git.w...

enahs-sf 5y ago

Anyone else here think downloading an executable as root from a URL that follows redirects is a bad idea? Seems like they could break the (Unix) install down into a few more steps that take some caution.
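
A minimal sketch of what "a few more steps" could look like: download into memory, check a SHA-256 checksum, and only then write the file to a user-owned path. The URL, the destination path and the idea that a checksum is published somewhere you trust are all assumptions for illustration, not how youtube-dl is actually distributed.

```python
import hashlib
import hmac
import os
import urllib.request


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Return True if data hashes to the expected SHA-256 hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_hex.strip().lower())


def install_verified(url: str, expected_hex: str, dest: str) -> None:
    """Fetch url (hypothetical), verify it, then write it to dest as the current user."""
    with urllib.request.urlopen(url) as response:  # follows redirects by default
        data = response.read()
    if not sha256_matches(data, expected_hex):
        raise ValueError("checksum mismatch, refusing to install")
    with open(dest, "wb") as fh:
        fh.write(data)
    os.chmod(dest, 0o755)  # executable, but no root involved
```

The checksum only helps if it comes from a different, trusted channel than the download itself; a checksum served from the same redirecting URL would not add much.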

Gaelan 5y ago

I've cloned this to my laptop. I suggest that everyone does the same.

spurgu 5y ago

Curious in general how often Youtube breaks things on their end, i.e. how long roughly this specific version of youtube-dl is expected to keep working without any active development?

input_sh 5y ago

git pull the repo, cd into it, and do `git log --follow -- youtube_dl/extractor/youtube.py`.
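
If you'd rather get a number than eyeball the log, here is a rough sketch that turns the same `git log` call into a median gap between commits. The function names are made up for illustration, and commit frequency is only a proxy for how often YouTube actually breaks things.

```python
import subprocess
from statistics import median


def gaps_in_days(timestamps):
    """Gaps in days between consecutive Unix timestamps, oldest first."""
    ordered = sorted(timestamps)
    return [(later - earlier) / 86400 for earlier, later in zip(ordered, ordered[1:])]


def commit_timestamps(repo_dir, path="youtube_dl/extractor/youtube.py"):
    """Author timestamps (Unix seconds) of every commit touching path."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "--follow", "--format=%at", "--", path],
        check=True, capture_output=True, text=True,
    )
    return [int(line) for line in result.stdout.split()]


def median_gap_days(repo_dir):
    """Median number of days between changes to the YouTube extractor."""
    return median(gaps_in_days(commit_timestamps(repo_dir)))
```

Calling `median_gap_days` with the path to your local clone needs `git` on the PATH and at least two commits touching the file.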

spurgu 5y ago

Thanks. That would indicate roughly 2 months or so...

mrspeaker 5y ago

I use youtube-dl fairly infrequently... maybe once a month or so. Nearly every time I used it, it would not download the video I wanted, so I'd do a "youtube-dl -U", then it would work.

Just spreading the repo around isn't going to help for very long.

1 more reply

Zhyl 5y ago

I wish there were an issue tracking thing that you could clone as easily as a git repo. My first thought was email as an ongoing distributed solution, but as far as I know you can't just download an archive of issue tickets, bug reports, etc. into a maildir without a fair amount of faff?

aabbcc1241 5y ago

And there is GitCenter, which is git (with issue tracking) over ZeroNet (a p2p network).

ekianjo 5y ago

Any centralized code versioning system can easily be DMCA'd so does it change anything?

AndrewDucker 5y ago

Only one in the USA

bArray 5y ago

How do you verify this isn't malicious? Who is running this repo?

mschuster91 5y ago

Compare the sha commit hashes of the top commit against the hashes from the old "true" repo. If they match, the repo (and the history) has not been changed. Subsequent commits can be manually audited.
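
Why the tip hash covers the whole history, as a rough sketch: a commit id is a hash over the commit's content, and that content includes the parent's id, so changing anything earlier changes every later id. This assumes a classic SHA-1 repository; the author line and the parent ids below are made up, and the sketch says nothing about how well SHA-1 resists deliberate collisions.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 object id as git computes it: hash of '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parent: str, message: str) -> bytes:
    """Build a simplified commit object; the author and committer are made up."""
    who = "Example <example@example.invalid> 0 +0000"
    return (
        f"tree {tree}\nparent {parent}\n"
        f"author {who}\ncommitter {who}\n\n{message}\n"
    ).encode()


EMPTY_TREE = git_object_id("tree", b"")
# Same tree and message, but a different (hypothetical) parent commit.
honest = git_object_id("commit", commit_body(EMPTY_TREE, "a" * 40, "tip"))
tampered = git_object_id("commit", commit_body(EMPTY_TREE, "b" * 40, "tip"))
print(honest != tampered)
```

In practice you would just compare the output of `git rev-parse HEAD` in the mirror against a hash you recorded from the original repo.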

bArray 5y ago

> Compare the sha commit hashes of the top commit against

> the hashes from the old "true" repo.

Isn't it just SHA1? I think it's generally accepted it's not secure...

Also it would be quite easy to build a clone repo that looks the part with all the correct hashes, the git "database" structure is quite simple.
