Facebook LLAMA is being openly distributed via torrents

(github.com)

909 points · micro_charm · 3y ago · 693 comments

237 comments · 58 top-level

Tiberium3y ago· 35 in thread

It seems that the leak originated from 4chan [1]. Two people in the same thread had access to the weights and verified that their hashes match [2][3] to make sure that the model isn't watermarked. However, the leaker made a mistake of adding the original download script which had his unique download URL to the torrent [4], so Meta can easily find them if they want to.

[1]: https://boards.4channel.org/g/thread/91848262#p91850335

[2]: https://boards.4channel.org/g/thread/91848262#p91849717

[3]: https://boards.4channel.org/g/thread/91848262#p91849855

[4]: https://boards.4channel.org/g/thread/91848262#p91850503

narrator3y ago

It's funny that part of the 4chan excitement over this is that they think they'll get back the AI girlfriend experience of when character.ai was hooked up to uncensored GPT-3. All that has been thoroughly shut down by character.ai and Replika and they just want their girlfriends back.

dmix3y ago

The Replika subreddit became one of the weirdest places on the internet when their model got capped for adult content.

https://www.reddit.com/r/replika/

Hundreds of men (and yes women) full on acting like they lost a spouse and posting constantly about it for weeks. AI is going to create some unusual social situations the general public isn't ready to grasp. And we're only in the early alpha stages.

abandonliberty3y ago

I'm curious if the blocking of adult content has to do with moralism, commercial interests, or something deeper.

An eager-to-please conversational partner who can generate endless content seems quite dangerous and addictive, especially when it crosses over into romantic areas. There are already posts of people spending entire days interacting with LLMs, using them as their therapist, romantic partner, etc.

Combined with findings like social engineering through prompt injection on Bing [1], the potential for systems that can manipulate people is clear.

While some of us may think that the LLMs appear ultimately limited in their capabilities, there's a ton of specific applications where they're more than sufficient, including customer service chat bots and telephone scams that target vulnerable people. It's only a matter of time until scammers stop using international call centers and switch over to something powered by these technologies.

[1]: https://news.ycombinator.com/item?id=34976886

Stagnant3y ago

Correct me if I'm wrong but doesn't character.ai use their own model and isn't associated with OpenAI? At least I can't find any information that would claim so.

Anecdotally, as a roleplaying chat experience, char.ai seems to perform way better than anything else publicly available (doesn't get repetitive, very long memory). It also feels different to GPT3 on how it is affected by prompts.

I've just assumed that char.ai is doing its own thing as it was founded by two engineers who worked on google's LaMDA.

IAmNotACellist3y ago

Oh, they will. And they'll exceed it.

Look at what fueled SD's ultimate K.O. of DALL-E 2: extremely high-quality custom-tailored porn images, one sentence away. The top models on civitai are all about it.

im3w1l3y ago

I think it's funny that out of all the scifis I know, Chobits of all things is looking to be the most accurate.

xg153y ago

...and of course it's fucking 4chan. Somehow I'm neither surprised they actually got hold of the model - nor that they did so as part of the quest to build their very own virtual anime robot sex slave - I mean "girlfriend" - harem.

It's all somehow par for the course but I'm still wondering when exactly we switched to the satire version of reality.

machiaweliczny3y ago

Porn and games move world forward :)

jimbob453y ago

I’m sure the CAI filter will magically stop filtering as much now that they have actual competition.

slowmotiony3y ago

I'd want an uncensored GPT-3 too and I don't want an AI girlfriend - I just find that chatgpt has too much moral censorship to be fun to use. Want to ask about a health condition? Nope, forbidden. Have a question related to IT security? That's a big no-no. Anything remotely sexual even in educational context? No can do. Yesterday I finished watching a TV show about French intelligence and asked it to recommend some good books about espionage - it told me I shouldn't be reading such things because it's dangerous.

I ended up deleting my account. I won't allow some chatbot made by a couple of 20-year-old Silicon Valley billionaires to teach me about ethics and morality.

mwill3y ago

Off topic, but I clicked around /g/, which I haven't done in probably more than a decade, and a thread caught my eye about learning to code. The replies were overwhelmingly of the position that it is useless, and you will be replaced by AI before you can get a job if you start learning now.

I think that's nonsense, and 4chan is bent towards pessimism but it's still surprising to me.

anigbrowl3y ago

/g/ is ridiculously overdramatic (and often offensive, though much less so than the political boards where the nazis fester), but regularly interesting. Agree that the pessimism here is misplaced, but not by much. The main change I see is not that AI will render coding or coders superfluous, but that it will massively shift the economics in favor of solo developers and small teams that don't have access to significant capital.

CamperBob23y ago

Yes and no. If you expressed interest in learning to program and were handed a book on x86 assembly language, most people would call that a waste of time. Even if you succeed at learning x86 as your first language, the knowledge will not be especially useful when employers are looking for fluency in modern C++ or Rust or whatever. It never hurts to have a solid grasp of the low-level fundamentals, of course, but it's not the name of the game. Not anymore.

The way I think of it is, all current programming languages are now assembly languages. Coding will not go away -- not by any means -- but the job will be utterly unrecognizable in ten to fifteen years.

And it's about fucking time.

I just picked up a new 13900k / RTX4090 box the other day at the local white-box builder. I was telling my partner how cool it was that it could do almost a trillion calculations per second on the CPU, and maybe 40x that on the graphics card. "How does that compare to the big mainframes from the late 60s?" she asked. "About ten million times faster. But I still program the same way those guys did, using almost the same language and tools. How weird is that?"

boole18543y ago

4chan has been in full doomer mode for years. It didn't used to be, from what I remember, though I was never an active denizen.

I'd love to understand the sociology behind the change in vibe that happened there.

causi3y ago

Just a warning to readers, I would not recommend clicking 4chan links while at work.

jpeter3y ago

Fortune favors the brave

m4jor3y ago

magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA

weberer3y ago

/g/ is one of the SFW boards

abxytg3y ago

if you don't have the leeway to say "I was looking at the 4chan thread where Meta's LLM was leaked" you shouldn't even be on hacker news tbh. get back to work!

archon14103y ago

It has only just occurred to me that 4chan's technology board is /g/ because it's tech-naw-la-G

lolpython3y ago

> The board letter /g/ stands for gijutsu (技術), the Japanese word for technology

https://wiki.installgentoo.com/wiki//g/#:~:text=%2Fg%2F%20is....

lukeplato3y ago

It would be interesting if there was a WikiLeaks-type of organization that facilitates safely leaking large models from big corporations.

Not sure how that would play out for accelerationism and existential risk, but I certainly don't trust the current powers that be.

Name_Chawps3y ago

Open sourcing is widely recognized to be a bad thing when it comes to AI existential risk. (For the same reason you don't want simple instructions for how to build bio weapons posted to the internet.)

Modern AI is pretty harmless though, so it doesn't matter yet.

Cipater3y ago

Why do 4chan users go out of their way to be so offensive in their posts?

gpm3y ago

Because we took the set of internet users, and sorted everyone who wants to be intentionally offensive into 4chan. Which means there's not only a high density of people who like being intentionally offensive there, but that being intentionally offensive is socially rewarded, so over time 4chan users grow to want to be more and more intentionally offensive.

IAmNotACellist3y ago

I think because you can't be anywhere else on the Internet anymore. It's like the system's pressure relief valve. A blaring steam whistle that's only getting worse and worse the more the Internet squeezes elsewhere.

weberer3y ago

Because the bump system combined with the finite number of threads incentivizes threads that get the highest number of replies per second. And the best way to increase replies per second is to start an internet fight.

selfmodruntime3y ago

Because it's really the only place left to go if you want to be offensive. First forums, then platforms censored offensive people out of their niche places. Even Cloudflare participates. 4chan remains the only privately owned large forum.

tekla3y ago

It keeps people out that are unable to separate the internet from real life.

_blz23y ago

because it is effective in keeping a certain type of people out

orangepurple3y ago

The same reason Penicillium molds produce β-lactam antibiotics. There doesn't have to be an intelligent reason, just a survival trick.

garbagecoder3y ago

It's to preserve the users' mental health.

yamrzou3y ago

Archived: https://archive.is/lGrH8

oh_sigh3y ago

What kind of recourse would meta have here? Sue him for breach of contract?

femi-lab3y ago

They almost surely anticipated that this would happen at some point (though perhaps not so soon). They would look like major ass holes for dragging some post doc or whatever through the courts to make a point; would not be good for brand at all.

But it does give them cover for whatever people end up doing with it - they can claim they did all they could to support research while promoting safety.

pavlov3y ago· 32 in thread

It’s interesting that these models are both massively expensive to produce and self-contained to a degree that you can distribute the end product in a torrent.

This has not been the case for most commercial software for the past 20 years, during the cloud era. If you could steal a dump of random Facebook source code, it would be 99% useless because it’s so closely tied to the infrastructure. There’s almost nothing you could usefully run on your own PC or server VM.

But these ML models are like neutron stars of computation density. You can’t really peek inside to see what’s going on either. An unknown stolen model’s properties would need to be discovered by experimentation.

seydor3y ago

> massively expensive to produce and self-contained to a degree that you can distribute the end product in a torrent.

So, like movies or software

koheripbal3y ago

Or a copy of Windows or Office source code.

candiddevmike3y ago

Or a car

phyalow3y ago

Or Microsoft Office...

krisoft3y ago

> It’s interesting that these models are both massively expensive to produce and self-contained to a degree that you can distribute the end product in a torrent.

I was trying to come to grips with how many resources are concentrated in one of these models. Somehow I came to the conclusion that it costs more to train one of these models than to buy a jet airliner, and that it is about the same order of money as commissioning and building a skyscraper in Manhattan. Is that approximately correct?

nodja3y ago

For anyone curious, it took 2048 A100 GPUs to train LLaMa, each GPU costs roughly $15k, facebook probably gets some sort of discount.

That's about $30 million in GPUs alone if you want to train at that scale. Also IIRC it took 23 days to train the biggest model. Someone else can do the power consumption cost calculations.
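As a rough sketch of the arithmetic in this comment, the snippet below multiplies out the hardware figure and estimates the electricity used. The GPU count, unit price, and 23-day duration are the comment's own numbers; the 400 W per-GPU draw and the $0.10/kWh rate are assumptions chosen purely for illustration, and cooling, CPUs, and networking are ignored.

```python
NUM_GPUS = 2048             # from the comment
PRICE_PER_GPU_USD = 15_000  # from the comment
TRAINING_DAYS = 23          # from the comment (quoted there from memory)
WATTS_PER_GPU = 400         # ASSUMPTION: approximate per-GPU power draw
USD_PER_KWH = 0.10          # ASSUMPTION: illustrative electricity price


def hardware_cost_usd(num_gpus, price_per_gpu):
    """Total purchase price of the GPUs, ignoring discounts."""
    return num_gpus * price_per_gpu


def energy_kwh(num_gpus, days, watts_per_gpu):
    """Energy drawn by the GPUs alone over the whole run, in kWh."""
    gpu_hours = num_gpus * days * 24
    return gpu_hours * watts_per_gpu / 1000


hardware = hardware_cost_usd(NUM_GPUS, PRICE_PER_GPU_USD)
energy = energy_kwh(NUM_GPUS, TRAINING_DAYS, WATTS_PER_GPU)

print(f"GPU hardware: ${hardware:,}")
print(f"GPU energy:   {energy:,.0f} kWh, about ${energy * USD_PER_KWH:,.0f}")
```

Under these assumptions the electricity for a single run comes to tens of thousands of dollars, which is small next to the purchase price of the hardware.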

sebzim45003y ago

No you are probably overestimating the cost by 1-2 orders of magnitude. GPT-3 probably cost under $5 million, and this model is smaller and there have been algorithmic improvements to training transformers since then.

sva_3y ago

The crazy thing is that all these models are just one local minimum, out of a staggering (unknown?!) number of such points on the plane.

tedeh3y ago

GPT-3 cost a few million dollars in compute to train from what I know.

version_five3y ago

Finding a sha256 hash with N leading zeros is basically arbitrarily computationally expensive but could be written on a piece of paper. I don't see training an ML model as an egregious example of concentrating compute power
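To make the comparison concrete, here is a minimal sketch of the kind of search this comment describes: brute-forcing a nonce until the SHA-256 digest starts with a given number of zero hex digits. The function name and the message are made up for illustration. The result is a single small integer, yet the expected work grows by a factor of 16 with each extra zero digit.

```python
import hashlib
from itertools import count


def find_nonce(message: bytes, zero_hex_digits: int) -> int:
    """Return the first nonce whose SHA-256 digest has the requested zero prefix."""
    target = "0" * zero_hex_digits
    for nonce in count():
        digest = hashlib.sha256(message + str(nonce).encode()).hexdigest()
        if digest.startswith(target):
            return nonce


# Three zero hex digits need about 16**3 = 4096 attempts on average.
nonce = find_nonce(b"example message", 3)
print(nonce, hashlib.sha256(b"example message" + str(nonce).encode()).hexdigest())
```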

greggsy3y ago

It’s not the power they’re referring to, it’s the density of information and effort that went into its creation.

brookst3y ago

Your SHA256 hash won’t be able to summarize text, write poems, or make up plots for books.

The crazy thing about these models is that the compute power going into them is at least somewhat reversible.

btbuildem3y ago

It also does require formidable infra to just run these giant models. I wonder if now-useless crypto mining farms could be repurposed.

adam_arthur3y ago

Which is why it will be very difficult to monetize them with similar margins to SaaS era businesses.

The bar to competition is far lower, as already evidenced by the plethora of AI products being put forward. It's a race to the bottom on pricing.

Vrondi3y ago

Whole new vistas open up to possible retaliation for piracy. Imagine how a bootlegged AI could have been set up to not just steal your info but manipulate you into ruining your life as revenge for bootlegging it...

m3kw93y ago

The config itself would be enough to lock anyone out

starkd3y ago

Is the current model capable of updating itself? Or will the user be fed answers from a static model that never learns anything new?

mcbuilder3y ago

I don't know about this model, but usually with these ML models you download the static weights, and nothing is stopping you from fine-tuning them to your needs/new information.

It's not automatic, would require some ML Engineering, but nothing is stopping you if you have the Pytorch graph and weights.

great_psy3y ago

You now have the ability to teach it new things. Given you have the compute resources

shultays3y ago

I mean the answer to the life is 42 but it took 7.5 million years for an advanced alien tech computer with the size of a building to calculate that

Calculating things takes time, and that time is unrelated to output size. There are NP problems that simply output true or false yet require more computational power than the universe can support.

spencerchubb3y ago

I fail to see how this is different from other software in that regard. If you have parameters but not the network architecture, then it's not very useful.

renonce3y ago

You do need to guess things like activation functions, number of attention heads, order of attention layers, etc. Often the parameter names reveal something about these.

heap_perms3y ago

> are like neutron stars of computation density

I really like that expression.

cvccvroomvroom3y ago

How are they produced?

There's an ass ton of hardware that might otherwise be idle.

tudorw3y ago

is there an ai to fuzz 'alien' models and map their 'reactions' yet?

IceHegel3y ago

This is a great comment.

joseph_grobbles3y ago

(My posts are dead by default, but for the good showdead people if someone knows and can answer in a sibling reply)

Are models like this copyrightable? It seems like this falls under the realm of "fact", which can't be copyrighted.

shagie3y ago

Under Feist Publications, Inc., v. Rural Telephone Service Co. ... it gets tricky.

From Wikipedia:

> The ruling of the court was written by Justice Sandra Day O'Connor. It examined the purpose of copyright and explained the standard of copyrightability as based on originality.

> The case centered on two well-established principles in United States copyright law: that facts are not copyrightable, and that compilations of facts can be.

> "There is an undeniable tension between these two propositions", O'Connor wrote in her opinion. "Many compilations consist of nothing but raw data—i.e. wholly factual information not accompanied by any original expression. On what basis may one claim a copyright upon such work? Common sense tells us that 100 uncopyrightable facts do not magically change their status when gathered together in one place. … The key to resolving the tension lies in understanding why facts are not copyrightable: The ″Sine qua non of copyright is originality."

> ...

> The standard for creativity is extremely low. It need not be novel; it need only possess a "spark" or "minimal degree" of creativity to be protected by copyright.

> In regard to collections of facts, O'Connor wrote that copyright can apply only to the creative aspects of collection: the creative choice of what data to include or exclude, the order and style in which the information is presented, etc.—not to the information itself. If Feist were to take the directory and rearrange it, it would destroy the copyright owned in the data. "Notwithstanding a valid copyright, a subsequent compiler remains free to use the facts contained in another's publication to aid in preparing a competing work, so long as the competing work does not feature the same selection and arrangement", she wrote.

> The court held that Rural's directory was nothing more than an alphabetic list of all subscribers to its service, which it was required to compile under law, and that no creative expression was involved. That Rural spent considerable time and money collecting the data was irrelevant to copyright law, and Rural's copyright claim was dismissed.

---

And so, my (I am not a lawyer) take on this is that the numbers of the model are not copyrightable. The selection of the source material is... kind of. This gets into a "a recipe is not copyrightable, yet a recipe book is"

The model may, however, be a trade secret. ( https://en.wikipedia.org/wiki/Trade_secret )

londons_explore3y ago

If you were to steal a chunk of source code or a binary from Meta/Google, you could probably get it running within a few weeks of effort.

Sure, the binary probably depends on a lot of internal proprietary infrastructure, but also most of that infrastructure is easy to write a mock implementation of, as long as you are happy for it to be in-ram, not multi-homed and don't need it to scale to billions of users.

Most of the binaries have a standalone mode for running on a developers PC with few/no dependencies anyway.

asah3y ago

-1: as an ex-googler, I can say it was hard enough for Google itself to get its code to run, given gonzo infrastructure assumptions, proprietary libraries/languages, etc.

taldo3y ago

Good luck even getting a google3-based Hello World to compile. I don't remember the exact numbers, but just #including the most basic libs resulted in a O(100M) binary.

And anything more complex than that would probably have dependencies on so many fat client libs, so much infrastructure, and so many external services, that you'll need months-years to even make sense of them, let alone mock them up.

hiddencost3y ago

This is just wrong.

ot3y ago· 12 in thread

In case it's not clear what's happening here (and from the comments it doesn't seem like it is), someone (not Meta) leaked the models and had the brilliant idea of advertising the magnet link through a GitHub pull request. The part about saving bandwidth is a joke. Meta employees may have not noticed or are still figuring out how to react, so the PR is still up.

(Disclaimer: I work at Meta, but have no relationship with the team that owns the models and have no internal information on this)

espadrine3y ago

> Meta employees may have not noticed or are still figuring out how to react

Given that the cat is out of the bag, if I were them, I would say that it is now publicly downloadable under the terms listed in the form. It is great PR, which if this was unintentional, is a positive outcome out of a bad situation.

dmix3y ago

Facebook fumbling its way into being the better open source AI than OpenAI would be amusing.

koheripbal3y ago

How likely is it that there is a larger model that they haven't discussed?

IanCal3y ago

It's not even clear someone has leaked the models. A random person has put a download link on a PR, it could be anything.

sebzim45003y ago

The folder structure definitely looks like model weights, I didn't download or run it though so for all I know it only generates the words to "Never Gonna Give You Up".

rnosov3y ago

An HN user (with >3k karma) seems to confirm the leak. Take it for what it's worth.

ot3y ago

Yes you're absolutely right. I went by another comment that seemed to confirm the contents, but that could be trolling too.

renonce3y ago

It's fairly easy to obtain the weights. I've had two of my friends downloading these weights and sharing them with me, so it's probably not surprising that the weights got leaked.

dx0343y ago

Was every Meta employee able to download them or why did so many people have access?

ithkuil3y ago

> and have no internal information on this

But you just said it was not Meta. Is that based on internal information? ;-)

ot3y ago

As I explained in another comment, if the author was a Meta employee, they wouldn't have had to sign the CLA. It's all in the PR.

sorokod3y ago

> someone (not Meta)

Did you mean to write "someone (possibly Meta)" ?

eigenvalue3y ago· 10 in thread

I'm not surprised-- I recently suggested that someone might try to pull an Aaron Swartz with the LLAMA weights (i.e., release them in an uncontrolled way similar to how Aaron attempted to release the JSTOR database). It's quite misleading for FB to claim that they are being so open, but then hoard the weights and only release it to a few academics. If the paper is to be believed, this is a major development, allowing you to get close to GPT3 performance on a single GPU (at least for inference on the smallest model). Clearly some renegade academic feels the same way.

catchnear43213y ago

> It's quite misleading for FB to claim that they are being so open, but then hoard the weights and only release it to a few academics.

I mean at least they didn’t pick a name that heavily implied they were, are, and always will be open. Then do the opposite. You know, like OpenAI?

So now we got some weights I guess.

yunwal3y ago

> It's quite misleading for FB to claim that they are being so open, but then hoard the weights and only release it to a few academics.

Facebook almost certainly knew leaks would happen. My guess is keeping the model "contained" was a legal shield more than anything else, to protect themselves from liability in the case someone misuses the model.

vintermann3y ago

It was already the most open language model in its class, given that the code for training and inference was available and it only used public data for training.

For Google's and OpenAI's offerings, have fun reimplementing them from descriptions in the paper (including small crucial details that they may have left out), training for a month, and then wondering if the implementation or the training data is the reason your model isn't as good as theirs.

ImprobableTruth3y ago

The training code is not available.

sebzim45003y ago

>Clearly some renegade academic feels the same way.

Or someone pretending to be a renegade academic. It's not like there is a KYC process.

Nimitz143y ago

Probably done with meta's blessing. Pretty obvious the scientists wanted to release it and legal stopped them.

cubefox3y ago

"It's quite misleading for FB to claim that they are being so open, but then hoard the weights and only release it to a few academics."

I don't think it's misleading. Even saying that Facebook "hoards" the weights when they are more open than any other major AI company is baffling.

make33y ago

it's really easy to get access to the weights, I got access to them as a master's student with just a single very minor publication in the domain, and basically all my colleagues who applied did too.

there is no Aaron Swartz thing here, a huge number of people will have them, someone was going to leak for sure

kragen3y ago

if aaron had attempted to release the jstor database he would have succeeded

eigenvalue3y ago

Didn't he get caught before he fully downloaded everything? My understanding was that's what he was doing with that laptop stuffed in a supply closet on the MIT campus.

kif3y ago· 9 in thread

I wonder what the memory requirements would be to run such a large model. I'd love to be able to run this model, alas my MacBook can barely run toy models.

q1w23y ago

You would need over 65GB of RAM. There are consumer GPUs that have 48GB of RAM, and can be tethered together with NVLink. I wonder if that would work.

coolspot3y ago

Or you can rent per-hour from vast.ai or lambdalabs for like couple dollars per hour.

astrange3y ago

A Mac Studio should be able to do it since it has unified memory.

px433y ago

Hell, I'd love to be able to buy a $30k server to run these models. I think to run BLOOM required something more along the lines of a $200k server.

londons_explore3y ago

With code modifications, it should be possible to run this with a very modest machine as long as you're happy for performance to suck. Transformer models typically need to read all the weights per 'word' output, so if your model is 20GB and you have not enough ram or vram, but have an SSD that reads 1GB/sec, expect 3 words per minute output speed.

However, code changes are necessary to achieve that, although they won't be crazy complex.
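The estimate in this comment can be checked with a few lines. This is only a sketch of the comment's own reasoning, assuming every output token requires one full read of the weights from disk and that nothing is cached in RAM; the function name is made up for illustration.

```python
def tokens_per_minute(model_size_gb: float, read_speed_gb_per_s: float) -> float:
    """Upper bound on output speed when all weights are re-read for each token."""
    seconds_per_token = model_size_gb / read_speed_gb_per_s
    return 60 / seconds_per_token


# The comment's example: a 20GB model streamed from a 1GB/sec SSD.
print(tokens_per_minute(20, 1))  # 3.0
```

Any weights that do fit in RAM or VRAM would not need to be re-read, so a real implementation could do better than this bound.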

VadimPR3y ago

True, and that's why there is a project that is using volunteered, distributed GPUs to run BLOOM/BLOOMZ: https://github.com/bigscience-workshop/petals, http://chat.petals.ml.

DeathArrow3y ago

No need to spend $30k, use Azure or AWS.

permo-w3y ago

you can - slowly - run Bloom 3b and 7b1 on the free (trial) tiers of Google Cloud Compute if you use the low_cpu_mem_usage parameter of from_pretrained

make33y ago

you can rent a vm on aws to run it

ok1234563y ago· 8 in thread

Maybe this is an intentional leak to damage OpenAI.

A supposedly better model by some accounts that strikes right at the heart of their business plan of selling access for $250k/year. One month of access to their service could buy a machine capable of running this leaked model.

Facebook nerfs a potential upstart competitor to keep current big-tech cartel stable.

Maybe this is a bit conspiratorial, but we live in the age big-tech and big-conspiracy.

sebzim45003y ago

IMO it's way more likely that some random guy on 4chan leaked it than it being some vast conspiracy.

tmalsburg23y ago

Why leak it instead of just publishing it along with a press release about openness and democratizing AI and so on?

ok1234563y ago

because then you don't need to explain to shareholders why you're giving away something that could potentially be worth a few hundred million dollars.

cosmojg3y ago

See: https://pbs.twimg.com/media/FqMiv31aEAAjd-3?format=jpg&name=...

tinyspacewizard3y ago

Not a conspiracy at all. See also IE, Android, Kubernetes...

GuB-423y ago

I am not aware of Android and Kubernetes being leaks; they were open source from the start. For Android, openness was a big marketing point. I am not aware of IE leaks, and if there were leaks, hackers searching for exploits would probably be the most interested, and that would be a bad thing for Microsoft.

The problem with leaks is that they don't come with a license, you don't have the right to use them for any legitimate purpose. No one who could afford a 250k/year license would touch that leak as it could get them in big trouble.

ok1234563y ago

Any links about IE, Android and Kubernetes? I'm not up on these being ops.

slig3y ago

I missed the one about K8S, do you have any resources?

aent3y ago· 7 in thread

For anyone wondering, it includes 4 models: 7/13/30/65 billion parameters. The smallest one is 14GB, the largest one is 131GB, and all four together are 235GB.
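These sizes are roughly what 16-bit weights would predict, at 2 bytes per parameter. The sketch below assumes fp16 storage and uses the parameter counts reported in the LLaMA paper (6.7B, 13.0B, 32.5B, 65.2B), quoted from memory here, so treat them as an assumption rather than a measurement of the files.

```python
BYTES_PER_PARAM_FP16 = 2

# ASSUMPTION: parameter counts in billions, as recalled from the LLaMA paper.
PARAMS_BILLIONS = {"7B": 6.7, "13B": 13.0, "30B": 32.5, "65B": 65.2}


def checkpoint_size_gb(params_billions: float, bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Approximate weight size in GB (10**9 bytes), ignoring file-format overhead."""
    return params_billions * bytes_per_param


sizes = {name: checkpoint_size_gb(p) for name, p in PARAMS_BILLIONS.items()}
total_gb = sum(sizes.values())

for name, gb in sizes.items():
    print(f"{name}: ~{gb:.1f} GB")
print(f"total: ~{total_gb:.1f} GB")
```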

q1w23y ago

I wonder how many people are scrambling to set this up on their startup infra.

6x24GB VRAM on 6 GPUs linked with NVSwitch is a little pricey, but totally doable.

arthurcolle3y ago

I got it running using Colab Pro+ (immediately got a V100 40GB VRAM GPU) - the 7B model works with batch size of 8 and a max seq len of 1024

exo-pla-net3y ago

How pricey would you estimate?

mlboss3y ago

Is it possible to run the smallest one on a consumer gpu with 24gb ram ?

MacsHeadroom3y ago

You can do even better! You can run the second smallest one (better than GPT-3 175B) on 24GB of VRAM, i.e. LLaMA-13B. https://github.com/oobabooga/text-generation-webui/issues/14...

Tepix3y ago

Running it is easy but you'll probably want to finetune it, too

rihegher3y ago

I would be surprised if you can't. The smallest weight file is 14gb apparently

ddtaylor3y ago· 6 in thread

FWIW this information was already freely available via DHT scrapers like btdig [1]. I think everyone at Facebook knows that torrents aren't secret, and the Google form is basically a legal tool to shield them from liability while making litigation against anyone misusing the model easier.

[1]: https://btdig.com/b8287ebfa04f879b048d4d4404108cf3e8014352/l...

riedel3y ago

The fun question is whether an ML model is copyrightable at all. Probably not, as it is produced by an algorithm (which is even GPL'ed). So the only tools would have been watermarking and NDA-type clauses. However, a Google form does not seem like the best way to do that in the first place, and it is close to impossible to identify the leaker (unless they are as stupid as it seems). Or am I missing anything? One backdoor would be if they included copyrighted material in the training data and could show how it can be extracted from the model. Maybe the whole stunt was about trying out how the legal system works in those cases :)

yieldcrv3y ago

commercial derivative works have always been legal when you did not agree to other terms.

one person broke their agreement with Meta, they're the only person that has a problem and the only person who gets to find out if the agreement was applicable at all.

if you released a chat bot that could be prompted to regurgitate some copyrighted information, so what? it just proves that you didn't need the $30 million in funding yet to train your own because you are using an existing model. So either use the funding for that or don't sell shares or a product based on that pretext. Nobody else has a problem.

Anything I missed? Now I wouldn't reshare the model, but aside from use and commercial use of its output? Not everyone gets their way, that's not controversial.

hnfong3y ago

photos are copyrightable by the person taking the photo only because they decided where and when to press a button. the rest are algorithms and hardware.

I believe the AI models would also be copyrightable as such, subject to arguments that the underlying data was protected and thus it was subject to prior copyrights instead

winterqt3y ago

Note that this is the leaked copy, not the original -- see 'llama.sh'.

londons_explore3y ago

btdig is blocked in the UK and many other countries. Use a US VPN for access.

sebzim45003y ago

I'm in the UK and can view that link without a VPN.

underlines3y ago· 5 in thread

- how much VRAM is needed to run each model parameter size?

- is there any inference optimization we can use, similar to StableDiffusion, to bring down the VRAM requirements?

I only know about these:

- use 8bit precision

- https://github.com/bigscience-workshop/petals

- https://github.com/FMInference/FlexGen

- https://github.com/microsoft/DeepSpeed

Anything that could bring this to a 10GB 3080 or 24GB 3090 without 60s/it per token?

TaylorAlexander3y ago

The checkpoint for the 7B parameter model is 13.5GB, so maybe? Larger models are multiple chunks at 13.6GB each or 16.3GB each. I am hoping I will be able to run it on my 16GB of VRAM, but I don't know how much overhead is needed. Maybe people on reddit will do their tricks and squeeze the models onto smaller cards.

EDIT: There seems to be a lot of overhead. Here someone struggles to fit the 7B parameter model (13.5GB checkpoint) on a 24GB card: https://github.com/facebookresearch/llama/issues/55

EDIT2: actually using the tip in that link I got it to run on a P5000 with 16GB of ram! It just barely fits so I had to log out of gnome and run it terminal only.

TaylorAlexander3y ago

Following up. After rebooting in to GUI that was enough to get it to fit, I guess xorg just accumulated some cruft in my last boot. So I can run it alongside gnome.

nvidia-smi reports this model is using 15475MiB after changing the max batch size from 32 to 8 (see link in above post)
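A rough sketch of why lowering the max batch size can free so much memory. It assumes the inference code preallocates a key/value cache per layer sized by max batch size and max sequence length, that cache entries are 16-bit, that the 7B model has 32 layers with hidden size 4096 (the shape given in the paper), and that the max sequence length is 512. All of these are assumptions for illustration, not measurements:

```python
def kv_cache_bytes(n_layers, dim, max_batch_size, max_seq_len, bytes_per_value=2):
    """Size of a preallocated key/value cache: one K and one V tensor per layer."""
    return 2 * n_layers * max_batch_size * max_seq_len * dim * bytes_per_value


GIB = 1024 ** 3

# Assumed 7B shape: 32 layers, hidden size 4096, 512-token context, fp16 cache.
for batch in (32, 8):
    size = kv_cache_bytes(n_layers=32, dim=4096, max_batch_size=batch, max_seq_len=512)
    print(f"max_batch_size={batch}: {size / GIB:.1f} GiB of cache")
```

Under these assumptions the cache shrinks from 8 GiB to 2 GiB, which on top of a 13.5GB checkpoint is roughly in line with the usage reported above.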

As others have stated someone may have injected unknown code in to the pickled checkpoint, so I recommend running this in docker. I use this command to run the docker image after getting nvidia docker stuff configured.

    docker run --runtime=nvidia -it --mount type=bind,source=/MY_LLAMA_SOURCE_PATH,target=/llama --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04

Then install the necessary dependencies in that container (obv could make a dockerfile), stick your model as well as tokenizer files (from the root dir in download) in to some directory (here models/LLaMA-7B) and run this:

    torchrun example.py --ckpt_dir ./models/LLaMA-7B --tokenizer_path ./models/LLaMA-7B/tokenizer.model

enjoy! the 7B parameter model is not amazing from my initial testing. I am very curious if larger models can be run on 1 GPU by sequentially loading checkpoints. I don't know how all this stuff is organized.

Example output below.

Prompt:

Please respond to this question: As a large language model, what are three things that you find most important?

Output:

To get students to analyze what they are doing in their learning

To get students to analyze what they are doing in their learning so that they can find the best practices

To get students to analyze what they are doing in their learning so that they can find the best practices and then to make changes to those practices so that they can improve their learning

I don't think this question really made much sense because the sentence in the question is incomplete

To get students to analyze what they are doing in their learning so that they can find the best practices and then to make changes to those practices so that they can improve their learning

I don't think this question really made much sense because the sentence in the question is incomplete.

I think this question is in fact a good question, even though the initial sentence is incomplete, and I do think it makes sense.

I think this question is in fact a good question, even though the initial sentence is incomplete, and I do think it makes sense. (I think it is a good question but I am not sure it makes sense).

joker993y ago

If I may tack on a question as someone with zero clue of ML: when, if ever, will someone like me be able to run this on a Mac Studio with a M1 Ultra and 128GB of ram?

MacsHeadroom3y ago

You can run 7B (equal to GPT-3 175B), 13B (better than GPT-3 175B), or 30B (better than anything else publicly available) but probably not 65B with that much RAM on an M1.

That would be using the CPU, as the M1 GPU is not yet supported.

terafo3y ago

As far as I can tell you can do it right now, at least for small 13B model, not sure about bigger models.

kaszanka3y ago· 5 in thread

Here is the magnet link for posterity: magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA
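For anyone whose client rejects the link: the `btih` value here is the base32 form of the 20-byte info hash, while many tools and DHT search sites expect 40 hex characters. A minimal sketch of the conversion, using only the standard library (the helper name `parse_magnet` is made up for this example):

```python
import base64
from urllib.parse import parse_qs


def parse_magnet(uri):
    """Return (info_hash_hex, display_name) from a BitTorrent v1 magnet URI."""
    scheme, _, query = uri.partition("?")
    if scheme != "magnet:":
        raise ValueError("not a magnet URI")
    params = parse_qs(query)
    xt = params["xt"][0]
    prefix = "urn:btih:"
    if not xt.startswith(prefix):
        raise ValueError("no BitTorrent info hash in xt")
    encoded = xt[len(prefix):]
    if len(encoded) == 32:      # base32-encoded 20-byte SHA-1
        raw = base64.b32decode(encoded.upper())
    elif len(encoded) == 40:    # already hex
        raw = bytes.fromhex(encoded)
    else:
        raise ValueError("unexpected info-hash length")
    name = params.get("dn", [None])[0]
    return raw.hex(), name


info_hash, name = parse_magnet(
    "magnet:?xt=urn:btih:ZXXDAUWYLRUXXBHUYEMS6Q5CE5WA3LVA&dn=LLaMA"
)
print(info_hash, name)
```

A magnet link with no tracker parameters, like this one, relies on DHT peer discovery, so it can take a while before a client finds any peers.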

psychphysic3y ago

Thanks not working for me...

Not that I could run it if I downloaded it.

q1w23y ago

Great, now how do I run it? Do I need a GPU with over 65GB RAM?

version_five3y ago

Try this, it's for running llms that won't fit in the gpu: https://github.com/FMInference/FlexGen

rnosov3y ago

Generally, you'll need to multiply the model size by two to get the required amount of video RAM. There are 4 sizes, so you might get away with an even smaller GPU for, say, the 13B model.

bioemerl3y ago

Nope, more like 111gb

sebzim45003y ago· 5 in thread

Is there anything stopping anyone from using this for commercial purposes? I know that when you fill in the google form you need to agree to noncommercial use, but someone downloading this will never have agreed to that licence agreement.

injidup3y ago

I don't know. Is there anything stopping you using the latest Miley Cyrus album for commercial purposes if you downloaded it via torrent and never agreed to any licencing terms?

RobotToaster3y ago

IANAL, but I imagine it's a legal grey area if the weights can be copyrighted? Works produced by purely mechanical means don't normally meet the threshold of originality.

counttheforks3y ago

That's what Facebook and OpenAI are doing. They consumed tons of copyrighted content without permission and are now using it for commercial purposes. So using their model seems fair game.

unhammer3y ago

Is there anything stopping Meta (or OpenAI etc) from using The Whole Web for commercial purposes in their LLMs?

EamonnMR3y ago

Considering ML's tenuous relationship with IP, I can't help but find this situation amusing.

version_five3y ago· 4 in thread

Are there any official checksums available? I'm happy to see this, even if it's an unsanctioned stunt, because I think it's really pathetic of meta to want to gatekeep their "open" model. But ML models generally can execute arbitrary code, I'd want to make sure it's the real version at least.

zb33y ago

    But ML models generally can execute arbitrary code

Is it the case if we're only talking about weights? I thought the rest is actually "open".

px433y ago

My understanding is that weights are normally stored as pickled python blobs, which means arbitrary code execution as they are unpickled.
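A minimal, harmless illustration of that point using only the standard library. The `Payload` class is a stand-in for a malicious object and merely calls `print`; the deny-all unpickler follows the "restricting globals" pattern from the Python documentation. A real checkpoint needs some globals to rebuild its tensors, so an actual loader would need an allowlist rather than this blanket refusal (that part is an assumption, not shown here):

```python
import io
import pickle


class Payload:
    """Stand-in for a malicious object; here it only calls print()."""

    def __reduce__(self):
        # On load, pickle calls print("code ran during unpickling").
        return (print, ("code ran during unpickling",))


blob = pickle.dumps(Payload())
pickle.loads(blob)  # runs the callable chosen by whoever built the blob


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuse to resolve any global, so no callable can be looked up."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global {module}.{name}")


def safe_loads(data):
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


try:
    safe_loads(blob)
except pickle.UnpicklingError as exc:
    print("rejected:", exc)

# Plain containers and numbers need no globals, so they still load.
print(safe_loads(pickle.dumps({"layer.weight": [0.1, 0.2]})))
```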

TaylorAlexander3y ago

I am running it in docker to be safe, which works just fine.

Red_Leaves_Flyy3y ago

Docker escapes exist and if this was released by spooks then including sandbox escapes is par. Unlikely for sure but your confidence is naïve.

Beaver1173y ago· 4 in thread

Funny. iirc some of the big tech (I think it was Google?) use torrents internally to deploy very large images to servers. Piracy is not the only use case!

regularfry3y ago

It was used for years to distribute World of Warcraft updates. No idea if it still is.

jpgvm3y ago

Ironically, it was Facebook that used torrents for binary distribution (no idea if it's still the case; that was a very long time ago).

ithkuil3y ago

It's not just for very large images. It's also useful for moderately large images/packages being deployed to many many many servers.

EamonnMR3y ago

Used it to download linux distro images back when the size of an install CD was huge.

Good times.

fancyfredbot3y ago· 4 in thread

opt-175B weights are already openly available as I understand. Hugging-face also has openly available weights for a 176B parameter LLM called Bloom. Is LLAMA offering something over and above these?

px433y ago

Yeah, their recent papers show the smaller LLAMA models outperforming the major LLMs today, and they also have bigger models. This isn't just an alternative, it's a multi order of magnitude optimization.

https://aibusiness.com/meta/meta-s-llama-language-model-outp...

spi3y ago

opt-175B doesn't exist; the largest one is opt-66B. And, at least in the tests I've run (not with the biggest one, but only up to a dozen billion parameters), all the opt models severely underperform with respect to even much smaller models. To the point that the launch of OPT (before BLOOM) was literally advertised as "the biggest OpenSource language model released to date", because they couldn't push on much else.

BLOOM does indeed go up to 176B parameters, and is certainly better than OPT. However, at least in my specific tests, it's still significantly inferior to OpenAI models, and actually on par with a few smaller models. There's also a "newer" fine-tuned model, called BLOOMZ, but at least in my tests it's even worse. Of course, that depends a lot on what you ask the model to do...

If LLAMA can indeed match OpenAI products, and do so with much fewer parameters, then it would be really great, and I'd really like to test it. However, even if the weights are now in the wild, using them would be clearly against the user agreement, and there's no way I'm going to do that in my work time :-) so let's hope Meta will come to sense and release them with a more friendly set of terms...

dougmwne3y ago

Yes, LLaMA is state of the art in several domains. The model was trained on a much larger data set than most models, which is why it scores higher than other models with similar numbers of parameters. This represents millions of dollars in compute time alone for the training.

This should lead to quite a lot of innovation and it’s inevitable that someone will get these working slowly on your average MacBook.

rihegher3y ago

according to Facebook, Llama beats GPT-3 on multiple benchmarks with smaller models that can be fine-tuned on a single A100 GPU

EDIT: correcting the type of GPU

KierPrev3y ago· 4 in thread

Why is Meta open sourcing its AI through torrent?

Or am I understanding it all wrong

gorbypark3y ago

It seems like the model has been leaked (not by Meta) and is being distributed via a torrent. Someone has created a PR to the repo as a joke, suggesting that instead of filling out a form and waiting to be granted access (which is the official way to get access to the model), that you could just download it via the torrent.

IceWreck3y ago

They're giving it to universities for free. Someone got access and then made a PR with a link to the torrent.

rnosov3y ago

it's a pull request from ChristopherKing42. He is unlikely to be associated with Meta.

drbscl3y ago

Sending via HTTP will incur bandwidth costs. Torrents massively reduce this cost in the long run by making it P2P.

Edit: maybe in this case it's a leak though

pimterry3y ago· 3 in thread

I give it a week before we see tools for subtly watermarking your secret LLM's weights, so you can trace leaks like this later.

Tiberium3y ago

The original 4chan thread seems to indicate that the leaker verified that his hashes matched with another person who had access to the weights, to make sure that the weights aren't watermarked [0]

0: https://boards.4channel.org/g/thread/91848262#p91849855

eigenvalue3y ago

Could already have happened in these weights. Reminds me of when the movie studios started projecting random dot patterns during movies to try to catch which theaters were leading to bootlegs. Their approach was essentially defeated by pirates sourcing multiple versions and combining them. In this case, I suspect you could add a small normally distributed random number to some random subset of the weights and it would have very little impact on performance but would corrupt any watermark beyond recognition.

londons_explore3y ago

Watermarking the weights is trivial.

Watermarking the output is also possible, but more complex and with a statistical success rate Vs performance tradeoff.
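To make the "trivial" case concrete, here is a deliberately naive sketch: hide a recipient ID in the lowest mantissa bit of the first few float32 weights. This is an illustration only and not a claim about what any lab actually does; the toy weights and the ID bits are made up, and a mark like this would not survive fine-tuning, quantization, or added noise:

```python
import struct


def float32_bits(x):
    """Bit pattern of x rounded to float32, as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_float32(bits):
    """Inverse of float32_bits."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def embed_mark(weights, mark_bits):
    """Overwrite the lowest mantissa bit of the first len(mark_bits) weights."""
    marked = list(weights)
    for i, bit in enumerate(mark_bits):
        bits = float32_bits(marked[i])
        marked[i] = bits_float32((bits & ~1) | bit)
    return marked


def read_mark(weights, n_bits):
    """Recover the embedded bits from the first n_bits weights."""
    return [float32_bits(w) & 1 for w in weights[:n_bits]]


weights = [0.1, -0.25, 3.5, 0.0078, 1.0]  # toy stand-in for a weight tensor
recipient_id = [1, 0, 1, 1]               # hypothetical per-recipient ID bits
marked = embed_mark(weights, recipient_id)

print(read_mark(marked, len(recipient_id)))
print(max(abs(a - b) for a, b in zip(weights, marked)))
```

Any scheme of this kind changes the file contents, so two recipients comparing file hashes would notice that their copies differ.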

ComplexSystems3y ago· 3 in thread

The smallest model (7B) is supposed to outperform GPT-3.

Does anyone have any idea what hardware is needed to run this?

throwaway18513y ago

No, the 13B model outperforms GPT-3. Judging from the metrics published in the paper, it does look like the 7B model is not far off from GPT-3 however.

JimmyRuska3y ago

Supposedly double the model size so 14gb. RTX 4090 might be able to handle it. You can use lambdalabs to rent a server gpu for one of the larger models.

coolspot3y ago

7B would require at least 7GB VRAM in 8 bit precision, 14GB in 16 bit precision, and 28GB in 32 bit precision.
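The arithmetic behind estimates like this is parameter count times bytes per parameter. A small sketch, using the nominal parameter counts from the model names (the true counts differ slightly) and counting weights only, with no allowance for activations, caches, or framework overhead:

```python
def weight_bytes(n_params, bits_per_param):
    """Memory needed just to hold the weights at a given precision."""
    return n_params * bits_per_param // 8


GB = 10 ** 9

# Nominal sizes taken from the model names; treat them as approximations.
MODELS = {
    "7B": 7_000_000_000,
    "13B": 13_000_000_000,
    "30B": 30_000_000_000,
    "65B": 65_000_000_000,
}

for name, n_params in MODELS.items():
    row = ", ".join(
        f"{bits}-bit: {weight_bytes(n_params, bits) / GB:.0f} GB"
        for bits in (8, 16, 32)
    )
    print(f"{name}: {row}")
```

The 16-bit figure for 7B comes out at about 14 GB, which is close to the 13.5GB checkpoint size reported elsewhere in the thread.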

linearalgebra453y ago· 3 in thread

Hypothetically, what would the consequences be if I ran this on my university's computing cluster?

htrp3y ago

Either you get a nice invitation to collaborate on research with one of your uni's professors..... or you get sent to academic/disciplinary review and probably suspended for the semester.

hn_205912493y ago

Seems a valid use of resources if you have a way to vaguely associate it to some academic side-project, just don't start monetizing the output and beware the wrath of stressed out PhDs if you use too much capacity.

elcomet3y ago

None

KaiserPro3y ago· 3 in thread

old school opensource, which is a bit surprising from meta. I wonder how they managed to square that with legal. Someone must have been very good friends with Zuck.

LilyFrenchPants3y ago

> old school opensource, which is a bit surprising from meta

Aren't you a cheeky lad? Meta turned out lots of open-source database systems:

* RocksDB

* Hive

* Presto

* Cassandra

* Velox

LFP

papruapap3y ago

is it?[0]

The worst offender is AMZ; all the rest of big tech is pretty open-source friendly.

0:https://opensource.fb.com/projects/

KaiserPro3y ago

I should be more precise:

Getting anything that could produce, look like, or smell anything like misinformation out of meta is very hard (for good reason!)

My friends have had repeated push back for various papers because they are ML based and could be in the same room as something that could possibly be used by miscreants.

And here we have a LLM that can spit out all sorts of things that are misinformation like.

If their department tried to launch something like Galactica they would have been slapped down and told to think again about what they were doing in life.

Aissen3y ago· 2 in thread

It's nice that it's downloadable without filling out a form (even though that should have been the default); a leak was bound to happen. The license is quite restrictive anyway: see RESTRICTIONS on https://forms.gle/jk851eBVbX1m5TAv5

sebzim45003y ago

If someone just decides to use the torrent and ignore those restrictions, it might finally establish precedent for whether you can copyright model weights.

dougmwne3y ago

But even if you could copyright them, once you do some fine-tuning, they are not the same model weights!

unethical_ban3y ago· 2 in thread

Let's say I wanted to use this for... whatever. How do I do it? I bookmarked some "AI for beginners" youtube videos.

No, I'm not trolling. The jargon and the ideas around LLMs are completely foreign to me. I have no idea how they work.

turmeric_root3y ago

clone this and point the script(s) to your downloaded model files: https://github.com/facebookresearch/llama/

electrosphere3y ago

I would like to know too.

rnosov3y ago· 2 in thread

Just to make it clear, does this torrent include model weights?

generalizations3y ago

It contains weights for all four model sizes, apparently. This definitely saves on bandwidth costs. :)

WithinReason3y ago

Folder structure for the 2 smaller models looks like this:

    LLAMA
    │   tokenizer.model
    │   tokenizer_checklist.chk
    │
    ├───13B
    │       checklist.chk
    │       consolidated.00.pth
    │       consolidated.01.pth
    │       params.json
    │
    └───7B
            checklist.chk
            consolidated.00.pth
            params.json
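A sketch of how the `checklist.chk` files could be used to check a download. It assumes they are in `md5sum` format (one `<md5>  <filename>` pair per line, paths relative to the checklist's folder), which is an assumption about the file format. A match only shows the files agree with the checklist shipped alongside them in the same torrent; it says nothing about whether they match what Meta distributed.

```python
import hashlib
import sys
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-GB checkpoints need little memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checklist(checklist_path):
    """Check every '<md5>  <filename>' line; return {filename: matches}."""
    checklist = Path(checklist_path)
    results = {}
    for line in checklist.read_text().splitlines():
        if not line.strip():
            continue
        expected, filename = line.split(maxsplit=1)
        filename = filename.lstrip("*")  # md5sum marks binary mode with '*'
        actual = md5_of(checklist.parent / filename)
        results[filename] = actual == expected.lower()
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical path): python verify.py LLAMA/7B/checklist.chk
    for name, ok in verify_checklist(sys.argv[1]).items():
        print("OK      " if ok else "MISMATCH", name)
```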

Laaas3y ago· 2 in thread

What's the point of the form if it's freely accessible? This might be revolutionary in the LLM field, as Stable Diffusion was to DALL-E.

fwlr3y ago

The user who submitted the pull request is not part of Meta or Facebook Research, and the users who signed off on reviewing the changes don’t appear to be either. I highly doubt Meta will approve the pull request. The models are being distributed by torrent by someone with access to the models, not by Meta themselves as far as we know. They likely still intend to distribute via the form. This is just someone publicizing the torrent link by being cheeky on GitHub.

(As they didn’t reply to my request for the model - I specified it was for personal use and my use case was “I think it would be fun to run it on my own hardware” - I appreciate this little stunt a great deal!)

rnosov3y ago

The linked page is just a pull request, the actual repository readme doesn't mention torrent option at all.

LoveMortuus3y ago· 2 in thread

~220 GB :O

That's quite big!

londons_explore3y ago

Needs ~200GB of graphics ram to run... Not many people will get this running!

9887473y ago

What do you mean "big"? fits on the average laptop :)

madmod3y ago· 1 in thread

Would there be some way to “launder” the model to make it plausibly viable for commercial use? Train a new model with the weights of this model with some kind of noise added to make it hard to tell what it is based on?

ImprobableTruth3y ago

Distillation would be the ideal way (especially because it also has efficiency gains), but as far as I know distillation for LLMs is kinda unproven.

Honestly though, even if you just finetune it, which you will want anyway for any serious commercial application, it's essentially impossible to determine the origin.

xg153y ago· 1 in thread

How horrible! Is there a torrent link so I can be sure to never accidentally download it?

AustinDev3y ago

See github link in OP :p

alfalfasprout3y ago· 1 in thread

Warning: do not use this for commercial purposes. While the weights may be available now, it's a lawsuit waiting to happen if you try to use this at work.

See the original license: "a. Subject to your compliance with the Documentation and Sections 2, 3, and 5, Meta grants you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty free and limited license under Meta’s copyright interests to reproduce, distribute, and create derivative works of the Software solely for your non-commercial research purposes. The foregoing license is personal to you, and you may not assign or sublicense this License or any other rights or obligations under this License without Meta’s prior written consent; any such assignment or sublicense will be void and will automatically and immediately terminate this License."

flangola73y ago

And where did I sign my name to that agreement?

Felminor3y ago· 1 in thread

Was to be expected.

Anyhow, I do remember a post from someone stating this would never happen, but it's just a web form and a request to describe what type of research you do.

Of course it will be leaked

popcorncowboy3y ago

Yeah, Meta must have had a plan for "when this gets leaked" because they put up only the flimsiest of foils. As per other comments the most likely is simply that they could shield themselves (and plausibly litigate with grounds) while ensuring that the model escapes into the wild to wreak its chaos against MS (OAI) and big G. This way they can see what's what from the safety of their shielded bubble and make a more informed call about changing the license to something more permissive if it looks like the strategic wins against their enemies would be worthwhile. Win win win. (Except for the leaker, that was an unfortunate own goal, they're going down).

EamonnMR3y ago· 1 in thread

Gonna be interesting to see if Facebook tries to tell people they can't use this because it's stolen (when it was presumably built using data taken without permission.)

ed3y ago

Unlike many LLMs, this was trained using public training sets (cited in their paper), which lets anyone with the $$$ independently generate the weights.

speedylight3y ago· 1 in thread

Legally speaking is it a good idea to download these models this way?

EamonnMR3y ago

My compliance brain says no, but the fact that models get trained with data obtained without explicit permission says that finders keepers would be the relevant case law.

havkom3y ago· 1 in thread

Is this warez?

marginalia_nu3y ago

Sure I'll download

TeamMysticAvengers-meta-llm-x-cars-movie-model-x-angelina-jolie-naked-xxx-2023.zip.exe.torrent

WithinReason3y ago

Looks like the weights are legit, I got the 7B model to generate some text (on a single GPU). Using the 1st prompt from the script it generated this:

[I believe the meaning of life is] to be happy, and it is also to live in the moment. I think that is the most important thing. I'm not really a party girl. I'm not a girl's girl. I have a really small group of close girlfriends and that's all I need. I believe in equal rights for everyone. I'm not a rebel. I don't really rebel against anything. I'm a very traditional girl, very loyal. I'm a mum's girl and I'm a dad's girl. People have a right to know what's going on. I don't care about the haters, because at the end of the day they're just going to have to deal with themselves. I've been getting more and more into fashion since I was about 16. I know I'm a little different, but so what? I think that's good. I don't think you should be like everyone else. It's my birthday, and I'll cry if I want to. I've always been a huge fan of fashion, and I've always liked to dress up

Another one:

[Building a website can be done in 10 simple steps:] 1. Defining Goals 2. Your Branding and Web Presence 3. Defining Your Marketing Strategy 4. Creating Your Website 5. Your Website Design 6. Your Website Development 7. Your Website Launch 8. Your Website’s Content 9. Your Website’s Conversion Rate 10. Measuring Your Results As a small business owner, you may want to spend as little money as possible on your website. But if you want to see a positive ROI, you will need to spend some money. Defining goals is critical when building a website. You should know what you want to accomplish with your website. You need to know what you want your website to achieve. You need to know who you want to convert to a customer. You need to know how you want to reach your goals. You need to know what the timeframe is for your website goals. You need to know what you want to get out of your website. When building a website, you need to clearly define your goals. Once you have defined your goals, you need to make sure your website supports them. If you want to reach your goals, you

4bpp3y ago

Since the point seems to be lost on some of the early commenters, this appears to be a cheeky PR by someone unaffiliated with Facebook, suggesting that they put a magnet link to (what seems to be) a leak of the model weights along with the previously existing invitation to apply to receive them on their own page.

gpm3y ago

Recent comment in this discussion thread of the PR

> looks like some people have been complaining about the link. it will need more seeders before we can merge into main

from someone claiming to be

> Research Scientist at Facebook AI Research. Working on [...]

and who has previously merged pull requests for a repo under https://github.com/facebookresearch

(I'm going to leave their name out of this... because it feels like that comment might come back to bite them)

RobotToaster3y ago

For those who didn't check the github discussion, I don't think this pull request came from a Facebook employee, lol.

VadimPR3y ago

Does this mean that with big enough compute capacity - say, Petals https://github.com/bigscience-workshop/petals which distributes the model over the internet over GPUs - we can run LLAMA?

controversial973y ago

The torrent is 224GB total, a load of 13 to 16GB .pth files

Retr0id3y ago

There seem to be a lot of confused commenters here. This is the content of an as-yet-unmerged pull request, and presumably not something that Facebook approves of.

lopkeny12ko3y ago

If the model is open source, who cares? This is good for the community; no need to go through Meta's opaque approval process.

throw140820203y ago

In case anyone was wondering, the torrent contains 219.01 GiB. More specifically, the 65B-parameter model is 121GB, the 30B-parameter model is 60.59GB, and so on.

mdaniel3y ago

I was expecting it to be a newly created GitHub account, but no, seems they're willing to roll the dice on whatever the outcome is from this

rvnx3y ago

The most logical thing would be for Archive.org to distribute these weights

wunderland3y ago

In case it was unclear, the person who submitted the pull request does not work for Facebook and is teasing them here.

Manjuuu3y ago

That guy does not seem to have anything to do with Facebook... interesting.

BeFlatXIII3y ago

Good. Information deserves to be free.

politician3y ago

LLMs invalidate the concept of copyright to such a degree that I find it impossible to see this torrent as theft.

CapShoyo123y ago

I'm excited, but having trouble running Llama on my local machine, has anyone managed this?

binarymax3y ago

Has anyone managed to download this yet using the magnet link? Is it well seeded?

transitivebs3y ago

Seeding...

fiat_fandango3y ago

I wonder if anyone is legitimately concerned that mirrored downloads might contain malicious payloads?

m3kw93y ago

Till someone puts up a site to test it

hsuduebc23y ago

Can you point me a way to run this locally please?

aghack3y ago

Can this be finetuned?

happycube3y ago

Who here didn't see this leak coming?

Madmallard3y ago

There's pretty much no point in downloading this right? It cannot be run with any fidelity on any consumer end gpu

zoranzv3y ago

Good app

progbloging3y ago

cool

onetokeoverthe3y ago

Good thing ive had decades of real relationships and sex.

Knew the net would probably squash print and privacy the first minute i logged into aol.

Who knew it would breed a generation of robot loving losers?
