Llama 3.1 - https://news.ycombinator.com/item?id=41046540 - July 2024 (114 comments)
$279mm in 1957 dollars is about $3.2bn today [2]. A public cluster of GPUs provided for free to American universities, companies and non-profits might not be a bad idea.
[1] https://en.m.wikipedia.org/wiki/Heavy_Press_Program
[2] https://data.bls.gov/cgi-bin/cpicalc.pl?cost1=279&year1=1957...
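For the curious, the adjustment is just a CPI ratio; a quick sketch (the index values are rough assumptions from memory, not pulled from the BLS calculator linked above):

    # Rough CPI adjustment; both index values are approximate assumptions.
    cpi_1957 = 28.1    # approx. annual average CPI-U for 1957
    cpi_2024 = 314.0   # approx. CPI-U for mid-2024
    cost_1957_m = 279  # program cost, millions of 1957 dollars
    print(f"${cost_1957_m * cpi_2024 / cpi_1957 / 1000:.1f}bn today")  # ~$3.1bn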
(To connect universities to the different supercomputing centers, the NSF funded the NSFnet network in the 80s, which was basically the backbone of the Internet in the 80s and early 90s. The supercomputing funding has really, really paid off for the USA)
Not sure why a publicly accessible GPU cluster would be a better solution than the current system of research grants.
Sure, academia could build LLMs, and there is at least one large-scale project for that: https://gpt-nl.com/ On the other hand, this kind of model still needs to demonstrate specific scientific value that goes beyond using a chatbot for generating ideas and summarizing documents.
So I fully agree that the research budget cuts in the past decades have been catastrophic, and probably have contributed to all the disasters the world is currently facing. But I think that funding prestigious super-projects is not the best way to spend funds.
Until we get cheaper cards that stand the test of time, building a public cluster is just a waste of money. There are far better ways to spend $1b in research dollars.
AI is a fad, the brick and mortar of the future is open source tools.
The USA and Europe are already doing that on a grand scale, in different forms, both at the national and international levels.
I work at an HPC center which provides servers nationally and collaborates at the international level.
[1] https://www.technologyreview.com/2024/05/13/1092322/why-amer...
How much capability would $3.2bn in terms of AI computing power provide, including the operational and power costs of the cluster?
Certainly, you could build a "$3.2bn GPU cluster", but it would be dark (you couldn't afford to power it).
So, how much learning time would $3.2bn provide? 1 year? 10 years?
Just curious about hand-wavy guesses. I have no idea of the scope of these clusters.
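For a hand-wavy answer, here's a sizing sketch where every figure is an assumption, not a quote:

    # Hand-wavy sizing of a $3.2bn cluster; all inputs are assumptions.
    budget = 3.2e9
    gpu_cost = 30_000    # assumed all-in cost per H100-class GPU
    overhead = 0.5       # assumed extra 50% for networking/hosts/datacenter
    gpu_kw = 1.0         # assumed per-GPU draw incl. cooling
    usd_per_kwh = 0.08   # assumed industrial electricity price

    n_gpus = budget / (gpu_cost * (1 + overhead))
    power_per_year = n_gpus * gpu_kw * 24 * 365 * usd_per_kwh
    print(f"~{n_gpus:,.0f} GPUs, ~${power_per_year / 1e6:.0f}M/yr in power")

So the capex alone buys on the order of 70K H100-class GPUs, with power adding very roughly $50M/year; how many "years of learning" that amounts to depends entirely on the models you want to train.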
They probably won't be using it now, because the phone in your pocket is likely more powerful. Moore's law did end, but data center hardware is still evolving orders of magnitude faster than forging presses.
If anything, allocate compute to citizens.
I find the language around "open source AI" to be confusing. With "open source" there's usually "source" to open, right? As in, there is human legible code that can be read and modified by the user? If so, then how can current ML models be open source? They're very large matrices that are, for the most part, inscrutable to the user. They seem akin to binaries, which, yes, can be modified by the user, but are extremely obscured to the user, and require enormous effort to understand and effectively modify.
"Open source" code is not just code that isn't executed remotely over an API, and it seems like maybe its being conflated with that here?
There is still a lot of modifying you can do with a set of weights, and they make great foundations for new stuff, but yeah we may never see a competitive model that's 100% buildable at home.
Edit: mkolodny points out that the model code is shared (under llama license at least), which is really all you need to run training https://github.com/meta-llama/llama3/blob/main/llama/model.p...
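To make that concrete: the published architecture code alone is enough to instantiate a randomly initialized, trainable model; it's the trained weights and data you can't reproduce. A sketch using the transformers reimplementation of the architecture (toy config values, not Llama 3.1's real hyperparameters):

    # Sketch: the architecture alone is enough to build a trainable model.
    # Config values are toy numbers, not Llama 3.1's actual hyperparameters.
    from transformers import LlamaConfig, LlamaForCausalLM

    config = LlamaConfig(hidden_size=512, num_hidden_layers=4,
                         num_attention_heads=8, intermediate_size=1376,
                         vocab_size=32000)
    model = LlamaForCausalLM(config)  # random weights, ready for training
    print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M params")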
I believe this is the current draft: https://opensource.org/deepdive/drafts/the-open-source-ai-de...
People are framing this as if there were an open-source hierarchy, with "actual" open source requiring all training code to be shared. This is not obvious to me, as I'm not asking people who share open-source libraries to also share the tools they used to develop them. I'm also not asking them to share all the design documents/architecture discussions behind the software. It's sufficient that I can take the end result and reshape it in any way I desire.
This is coming from an LLM practitioner that finetunes models for a living; and this constant debate about open-source vs open-weights seems like a huge distraction vs the impact open-sourcing something like Llama has... this is truly a Linux-like moment. (at a much smaller scale of course, for now at least)
The source of a language model is the text it was trained on. Llama models are not open source (contrary to their claims), they are open weight.
There is still a lot you can do with weights, like fine-tuning, and that is arguably more useful anyway, since retraining the entire model would cost millions in compute.
- If we start with the closed training set, that is closed and stolen, so call it Stolen Source.
- What is distributed is a bunch of float arrays. The Llama architecture is published, but not the training or inference code. Without code there is no open source. You might as well call a compiler book open source because it tells you how to build a compiler.
Pure marketing, but predictably many people follow their corporate overlords and eagerly adopt the co-opted terms.
Reminder again that FB is not releasing this out of altruism, but because they have an existing profitable business model that does not depend on generated chats. They probably do use it internally for tracking and building profiles, but that is the same as using Linux internally, so they release the weights to destroy the competition.
Isn't price dumping an antitrust issue?
If everyone open sources their AI code, Meta can snatch the bits that help them without much fear of helping their direct competitors.
"Finding an agreement on what constitutes Open Source AI is the most important challenge facing the free software (also known as open source) movement. European regulation already started referring to "free and open source AI", large economic actors like Meta are calling their systems "open source" despite the fact that their license contain restrictions on fields-of-use (among other things) and the landscape is evolving so quickly that if we don't keep up, we'll be irrelevant."
[1] https://fosdem.org/2024/schedule/event/fosdem-2024-2805-movi... defining-open-source-ai/
I'm not sure if facebook has done that
FB's strategy is that they are fine being just a user, and fine with ruining competitors' businesses with good-enough free alternatives while collecting awards as saviors of whatever.
Does the training data require permission from the copyright holder to use? Are the weights really open source or more like compiled assembly?
The actual point that matters is that these models are available for most people to use for a lot of stuff, and this is way way better than what competitors like OpenAI offer.
.. the thing is, we have not dealt with LLMs much; it's hard to say what can be considered an open source LLM just yet, so we use that as a metaphor for now
This post is an ad and trying to paint these things as something they aren't.
- No more vendor lock-in
- Instead of just wrapping proprietary API endpoints, developers can now integrate AI deeply into their products in a very cost-effective and performant way
- Price race to the bottom with near-instant LLM responses at very low prices are on the horizon
As a founder, it feels like a very exciting time to build a startup as your product automatically becomes better, cheaper, and more scalable with every major AI advancement. This leads to a powerful flywheel effect: https://www.kadoa.com/blog/ai-flywheel
Maybe a big price war while the market majors fight it out for positioning, but they still need to make money off their investments, so someone is going to have to raise prices at some point, and you'll be locked into their system if you build on it.
Including adtech models, which are predominantly cloud-based.
And so the winners, in the end, will be the models that have mechanisms for curating out such misapplied weighting, and the organizations and individuals who make accurate adjustments to those models: the ones where truth has been most carefully honed.
This is not altruism, although it's still great for devs and startups. All of FB's GPU investment is primarily for new AI products: "friends", recommendations, and selling ads.
https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/
* they need LLMs that they can control for features on their platforms (Fb/Instagram, but I can see many use cases on VR too)
* they cannot sell it. They have no cloud services to offer.
So they would spend this money anyway, but to compensate for some of the losses they decided to use it to fix their PR by keeping developers happy.
Given the mountain of GPUs they bought at precisely the right moment I don't think that's entirely accurate
> A complement is a product that you usually buy together with another product. Gas and cars are complements. Computer hardware is a classic complement of computer operating systems. And babysitters are a complement of dinner at fine restaurants. In a small town, when the local five star restaurant has a two-for-one Valentine’s day special, the local babysitters double their rates. (Actually, the nine-year-olds get roped into early service.)
> All else being equal, demand for a product increases when the prices of its complements decrease.
Smartphones are a complement of Instagram. VR headsets are a complement of the metaverse. AI could be a component of a social network, but it's not a complement.
Someone can correct me here, but AFAIK we don't even know which datasets were used to train these models, so why should we even use "open" to describe Llama? This is more similar to freeware than to an open-source project.
[1] https://www.ftc.gov/policy/advocacy-research/tech-at-ftc/202...
In fairness to Llama, the source code itself (though not the training data) is available to access, although not really under a license that many would consider open source.
This means they need content that will grab attention, and creating open source models that allow anyone to create any content on their own becomes good for Meta. The users of the models can post it to their Instagram/FB/Threads account.
Releasing an open model also releases Meta from the burden of having to police the content the model generates, once the open source community fine-tunes the models.
Overall, this is a good business move for Meta. The post doesn't really talk about the true benefit, instead moralizing about open source, but it is a sound business move nonetheless.
1. Is there such a thing as 'attention grabbing AI content'? Most AI content I see is the opposite of 'attention grabbing'. The Kindle store is flooded with this garbage and none of it is particularly 'attention grabbing'.
2. Why would creation of such content, even if it was truly attention grabbing, benefit Meta in particular?
3. How would proliferation of AI content lead to more ad spend in the economy? Ad budgets won't increase because of AI content.
To me this is a typical Zuckerberg play. Attach Meta's name to whatever is trendy at the moment, like the (now forgotten) metaverse, cryptocoins, and a bunch of other failed stuff that was trendy for a second. Meta is NOT a Gen AI company (or a metaverse company, or a crypto company), however much he is scamming (more like colluding with) the market into believing it. A mere distraction from slowing user growth on ALL of Meta's apps.
ppl seem to have just forgotten this https://en.wikipedia.org/wiki/Diem_(digital_currency)
More important are the products that Meta will be able to make if the industry standardizes on Llama. They would have a front seat: not just access to the latest unreleased models, but also setting the direction of progress and what next-gen LLMs optimize for. If you're Twitter or Snap or TikTok, or otherwise compete with Meta on product, then good luck trying to keep up.
That is why they hopped on the Attention is All You Need train
Then all other visual AI content will be banned. If that is where legislation is heading.
But I have strong doubts they (or any other company) actually believe what they are saying.
Here is the reality:
- Facebook is spending untold billions on GPU hardware.
- Facebook is arguing in favor of open sourcing the models, that they spent billions of dollars to generate, for free...?
It follows that companies with much smaller resources (money) will not be able to match what Facebook is doing. Seems like an attempt to kill off the competition (specifically, smaller organizations) before they can take root.
Through this lens, Meta's actions make more sense to me. Why invest billions in VR/AR? The answer is simple: don't get locked out of the next platform; maybe you can own the next one. Why invest in LLMs? Again, don't get locked out. Google and OpenAI/Microsoft are far larger and ahead of Meta right now, and Meta genuinely believes the best way to make sure they have an LLM they control is to make everyone else have an LLM they can control. That way community efforts are unified around their standard.
There is still, just about, a strong ethos (especially in the research teams) to chuck loads of stuff over the wall into open source (PyTorch, Detectron, SAM, Aria, etc.),
but it's seen internally as a two-part strategy:
1) strong recruitment tool (come work with us, we've done cool things, and you'll be able to write papers)
2) seeding the research community with a common toolset.
Meta wants to make sure they commoditize their complements: they don’t want a world where OpenAI captures all the value of content generation, they want the cost of producing the best content to be as close to free as possible.
Bravo! While I don't agree with Zuck's views and actions on many fronts, on this occasion I think he and the AI folks at Meta deserve our praise and gratitude. With this release, they have brought the cost of pretraining a frontier 400B+ parameter model to ZERO for pretty much everyone -- well, everyone except Meta's key competitors.[a] THANK YOU ZUCK.
Meanwhile, the business-minded people at Meta surely won't mind if the release of these frontier models to the public happens to completely mess up the AI plans of competitors like OpenAI/Microsoft, Google, Anthropic, etc. Come to think of it, the negative impact on such competitors was likely a key motivation for releasing the new models.
---
[a] The license is not open to the handful of companies worldwide which have more than 700M users.
For now, Meta seems to release Llama models in ways that don't significantly lock people into their infrastructure. If that ever stops being the case, you should fork rather than trust their judgment. I say this knowing full well that most of the internet is on AWS or GCP, most brick-and-mortar businesses use Windows, and carrying a proprietary smartphone is essentially required to participate in many aspects of the modern economy. All of this is a mistake. You can't resist all lock-in; the players involved effectively run the world. You should still try where you can, and we should still be happy when tech companies either slip up or make the momentary strategic decision to make this easier.
Also, the underdog always touts Open Source and standards, so it’s good to remain skeptical when/if tables turn.
We interviewed Thomas, who led Llama 2 and 3 post-training, in case you want to hear from someone closer to the ground on the models: https://www.latent.space/p/llama-3
"Commoditize Your Complement" is often cited here: https://gwern.net/complement
It's a proprietary dump of data you can't replicate or verify.
What were the sources? What datasets was it trained on? What were the training parameters? And so on and so on.
It is still far from zero.
Is it possible to run this with ollama?
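For the 8B and 70B variants, yes, hardware permitting (the 405B is another matter). A minimal sketch against Ollama's local REST API, assuming the llama3.1 tag has been pulled and the server runs on its default port:

    # Minimal sketch: query a locally pulled Llama 3.1 via Ollama's REST API.
    # Assumes `ollama pull llama3.1` was run; port 11434 is Ollama's default.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.1", "prompt": "Say hello in one sentence.",
              "stream": False},
    )
    print(resp.json()["response"])

The 8B runs on a decent consumer GPU; the 70B wants aggressive quantization or multiple cards.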
Nope. Not one bit. Supporting F/OSS when it suits you in one area and then being totally dismissive of it in every other area should not be lauded. How about open sourcing some of FB's VR efforts?
Step 1. Chick-Fil-A releases a grass-fed beef burger to spite other fast-food joints, calls it "the vegan burger"
Step 2. A couple of outraged vegans show up in the comments, pointing out that beef, even grass-fed beef, isn't vegan
Step 3. Fast food enthusiasts push back: it's unreasonable to want companies to abide by this restrictive definition of "vegan". Clearly this burger is a gamechanger and the definition needs to adapt to the times.
Step 4. Goto Step 2 in an infinite loop
That's the difference between open source and free software.
I.e., the more important thing - the more "free" thing - is the licensing now.
E.g., I play around with different image diffusion models like Stable Diffusion and specific fine-tuned variations for ControlNet or LoRA that I plug into ComfyUI.
But I can't use it at work because of the licensing. I have to use InvokeAI instead of ComfyUI if I want to be careful, and can use only very specific image diffusion models without the latest and greatest fine-tuning. As others have said, the weights themselves are rather inscrutable. So we're building on more abstract shapes now.
But the key open thing is making sure (1) the tools to modify the weights are open and permissive (ComfyUI, related scripts or parts of both the training and deployment) and (2) the underlying weights of the base models and the tools to recreate them have MIT or other generous licensing. As well as the fine-tuned variants for specific tasks.
It's not going to be the naive construction in the future, where you take a base model and, as company A, produce company A's fine-tuned model and are done.
It's going to be a tree of fine-tuned models as a node-based editor like ComfyUI already shows and that whole tree has to be open if we're to keep the same hacker spirit where anyone can tinker with it and also at some point make money off of it. Or go free software the whole way (i.e., LGPL or equivalent the whole tree of tools).
In that sense unfortunately Llama has a ways to go to be truly open: https://news.ycombinator.com/item?id=36816395
In terms of inference and interface (since you mentioned comfy) there are many truly open source options such as vLLM (though there isn't a single really performant open source solution for inference yet).
Ok, first of all, has this really worked? AI moderators still can't catch the mass of obvious spam/bots on all their platforms, Threads included. Second, AI detection doesn't work, and with how much better the systems are getting, it probably never will, unless you keep the best models for yourself, and it is clear from the rest of the note that that is not Zuck's intention.
> As long as everyone has access to similar generations of models – which open source promotes – then governments and institutions with more compute resources will be able to check bad actors with less compute.
This just doesn't make sense. How are you going to prevent AI spam, AI deepfakes from causing harm with more compute? What are you gonna do with more compute about nonconsensual deepfakes? People are already using AI to bypass identity verification on your social media networks, and pump out loads of spam.
I don't think that's true. I don't think even the best privately held models will be able to detect AI text reliably enough for that to be worthwhile.
I still agree with his general take - bad actors will get these models or make them themselves, you can't stop it. But the logic about compute power is odd.
FB was notorious for censorship. Anyway, what is with the "actions/actors" terminology? This is straightforward totalitarian language.
This also has the important effect of neutralizing the critique of US Government AI regulation because it will democratize "frontier" models and make enforcement nearly impossible. Thank you, Zuck, this is an important and historic move.
It also opens up the market to a lot more entry in the area of "ancillary services to support the effective use of frontier models" (including safety-oriented concerns), which should really be the larger market segment.
Plus there's still the spectre of SB-1047 hanging around.
Is the vision here to treat LLM-based AI as a "public good", akin to a utility provider in a civilized country (taxpayer funded, govt maintained, non-for-profit)?
I think we could arguably call this "open source" when all the infra blueprints, scripts, and configs are freely available for anyone to try to duplicate the state of the art (resource and grokking requirements notwithstanding).
Meta could change the license on later Llama versions to kill your business, and you'd have no options, as you don't know how they trained it and don't have the budget to retrain it yourself.
It's not much more free than binary software.
The whole thing is interesting, but this part strikes me as potentially anticompetitive reasoning. I wonder what the lines are that they have to avoid crossing here?
"Commoditize your complements" is an accepted strategy. And while pricing below cost to harm competitors is often illegal, the reality is that the marginal cost of software is zero.
Which open-source license has such restrictions and clauses?
C'mon folks, they're opening up for free to 99.99% of potential users what cost hundreds of millions of dollars, if not in the ballpark of a billion.
Let's appreciate that, instead of focusing on semantics for a while.
The HPC domain (data- and compute-intensive applications that typically need vector, parallel, or other such architectures) has been around for the longest time, but confined to academic/government tasks.
LLMs, with their famous "matrix multiply" at their very core, are basically demolishing an ossified frontier where a few commercial entities (Intel, Microsoft, Apple, Google, Samsung, etc.) have defined for decades what computing looks like for most people.
Assuming that the genie is out of the bottle, the question is: what is the shape of end-user devices that are optimally designed to run compute-intensive open source algorithms? The "AI PC" is already a marketing gimmick, but could it be that Linux desktops and smartphones will suddenly be "AI natives"?
For sure it's a transformational period, and the landscape at T+10 yrs could be drastically different...
I think it's interesting to think about this question of open source, benefits, risk, and even competition, without all of the baggage that Meta brings.
I agree with the FTC, that the benefits of open-weight models are significant for competition. The challenge is in distinguishing between good competition and bad competition.
Some kind of competition can harm consumers and critical public goods, including democracy itself. For example, competing for people's scarce attention or for their food buying, with increasingly optimized and addictive innovations. Or competition to build the most powerful biological weapons.
Other kinds of competition can massively accelerate valuable innovation. The FTC must navigate a tricky balance here — leaning into competition that serves consumers and the broader public, while being careful about what kind of competition it is accelerating that could cause significant risk and harm.
It's also obviously not just "big tech" that cares about the risks behind open-weight foundation models. Many people have written about these risks even before it became a subject of major tech investment. (In other words, A16Z's framing is often rather misleading.) There are many non-big tech actors who are very concerned about current and potential negative impacts of open-weight foundation models.
One approach which can provide the best of both worlds, is for cases where there are significant potential risks, to ensure that there is at least some period of time where weights are not provided openly, in order to learn a bit about the potential implications of new models.
Longer-term, there may be a line where models are too risky to share openly, and it may be unclear what that line is. In that case, it's important that we have governance systems for such decisions that are not just profit-driven, and which can help us continue to get the best of all worlds. (Plug: my organization, the AI & Democracy Foundation; https://ai-dem.org/; is working to develop such systems and hiring.)
i am not down with this concept of the chattering class deciding what are good markets and what are bad, unless it is due to broad-based and obvious moral judgements.
But this is really positive stuff and it’s nice to view my time there through the lens of such a change for the better.
Keep up the good work on this folks.
Time to start thinking about opening up a little on the training data.
Dead internet theory is very much happening in real time, and I dread what's about to come since the world has collectively decided to lose their minds with this AI crap. And people on this site are unironically excited about this garbage that is indistinguishable from spam getting more and more popular. What a fucking joke
I'm excited about the former since AI has massively improved my productivity as a programmer to a point where I can't imagine going back. Everything is not black or white and people can be excited about one part of something and hate another at the same time.
* * *
On "collective losing of minds", you might appreciate this quote from 1841 (!) by Charles MacKay. I quoted it in the past[1] here, but is worth re-posting:"In reading the history of nations, we find that, like individuals, they have their whims and their peculiarities; their seasons of excitement and recklessness, when they care not what they do. We find that whole communities suddenly fix their minds upon one object, and go mad in its pursuit; that millions of people become simultaneously impressed with one delusion, and run after it, till their attention is caught by some new folly more captivating than the first [...]
"Men, it has been well said, think in herds; it will be seen that they go mad in herds, while they only recover their senses slowly, and one by one."
— from MacKay's book, 'Extraordinary Popular Delusions and the Madness of Crowds'
Personally I have tons of creative ideas which I think would be interesting and engaging but for which I lack the resources to bring into this world, so I'm hoping that in the long term AI tools can help bridge this gap. I'm really hopeful for a world where people from all over the world can share their creative stories, rather than being mostly limited to a few rich people in Hollywood.
Unfortunately I do expect this to end up being the minority of content, especially as we continue being flooded by increasing amounts of trash. But maybe that's just opening up the opportunity for someone to develop new content curation tools. If anything, even before the rise of AI stuff there were mountains of content, and we saw with the rise of TikTok that a good recommendation algorithm still leaves room for new platforms.
Maybe they’ll go outside.
The AI model complements the platform, and the platform is the money maker. They hold the belief that open sourcing their tools benefits their platform in the long run, which is why they're doing it. And in doing so, they aren't under the control of any competitors.
I would say it's more like a grocery store providing free parking, a bus stop, self-checkout, online menu, and free delivery.
Not the usual nation-state rhetoric, but something that justifies the claim that closed source leads to a better user experience and fewer security and privacy issues.
An ecosystem that benefits vendors, customers, and the makers of closed source?
Are there historical analogies other than Microsoft Windows or Apple iPhone / iOS?
But they still have 70 thousand people (a small country) doing _something_. What are they doing? Updating Facebook UI? Not really, the UI hasn't been updated, and you don't need 70 thousand people to do that. Stuff like React and Llama? Good, I guess, we'll see how they make use of Llama in a couple of years. Spellcheck for posts maybe?
This is a very important concern in Health Care because of HIPAA compliance. You can't just send your data over the wire to someone's proprietary API. You would at least need to de-identify your data. This can be a tricky task, especially with unstructured text.
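To illustrate why it's tricky: naive pattern-based scrubbing, sketched below with made-up patterns, only catches structured identifiers; names, addresses, and rare conditions buried in free text need NER models and human review.

    # Toy de-identification sketch: regex-only, illustrative, NOT HIPAA-grade.
    # Real pipelines layer NER models, dictionaries, and human review on top.
    import re

    PATTERNS = {
        "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
        "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "MRN":   re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),
    }

    def scrub(text: str) -> str:
        for label, pattern in PATTERNS.items():
            text = pattern.sub(f"[{label}]", text)
        return text

    note = "Pt. John Smith, MRN: 483921, called from 555-867-5309 re: meds."
    print(scrub(note))  # the name still leaks through; that's the hard part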
---
Some observations:
* The model is much better at trajectory correcting and putting out a chain of tangential thoughts than other frontier models like Sonnet or GPT-4o. Usually, these models are limited to outputting "one thought", no matter how verbose that thought might be.
* I remember in Dec of 2022 telling famous "tier 1" VCs that frontier models would eventually be like databases: extremely hard to build, but the best ones will eventually be open and win as it's too important to too many large players. I remember the confidence in their ridicule at the time but it seems increasingly more likely that this will be true.
Okay then Mark. Replace "modern AI models" with "social media" and repeat this statement with a straight face.
It's a bit buggy but it is fun.
Disclaimer: I am the author of L2E
On a more serious note, I don't really buy his arguments about safety. First, widespread AI does not reduce unintentional harm but increases it, because accident rates compound. Second, the chance of success for threat actors will increase, because of the asymmetric advantage of gaining access to all open information while hiding their own. But there is no reversing it at this point; I'll enjoy it while it lasts. AGI will come sooner or later anyway.
1. Software: this is all Pytorch/HF, so completely open-source. This is total parity between what corporates have and what the public has.
2. Model weights: Meta and a few other orgs release open models - as opposed to OpenAI's closed models. So, ok, we have something to work with.
3. Data: to actually do anything useful you need tons of data. This is beyond the reach of the ordinary man, setting aside the legality issues.
4. Hardware: GPUs, which are extremely expensive. Not just that, even if you have the top dollars, you have to go stand in a queue and wait for O(months), since mega-corporates have gotten there before you.
For inference, you need 1, 2, and 4. For training (or fine-tuning), you need all of these. With newer and larger models like the latest Llama, 4 is truly beyond the reach of ordinary entities.
This is NOTHING like open source, where a random guy can edit/recompile/deploy software on a commodity computer. With LLMs, once Data and Hardware are in the equation, the playing field is completely stacked. This thread has a bunch of people discussing the nuances of 1 and 2, but this bike-shedding only hides the basic point: control of LLMs is for mega-corps, not for individuals.
Open-source code in the past was fantastic because the West had a monopoly on CPUs and computers. Sharing and contributing were amazing, while it was ensured that tyrants couldn't use this tech to harm people, simply because they didn't have the hardware to run it.
But now, things are different. China is advancing in chip technology, and Russia is using open-source AI to harm people at scale today, with auto-targeting drones being just the start. The Red Sea conflict, etc.
And somehow, Zuckerberg keeps finding ways to mess up people's lives, despite having the best intentions.
Right now you can build a semi-autonomous drone with AI to kill people for ~$500-700. The western world will still use safe and secure commercial models, while the new axis of evil will use models based on Meta's, or any other open source model, to do whatever harm they can imagine, with not a hint of control.
This particular model? Fine-tune it to help develop a nuclear bomb, using all the research that a government at that level can gather, at scale. Killer drone swarms, etc. Once the knowledge is public, these models can be the base that gives expert-level knowledge to anyone who wants it, uncensored. Especially if you are a government that wants to destroy a peaceful order for whatever reason.
https://www.vox.com/future-perfect/24151437/ai-israel-gaza-w...
https://www.972mag.com/mass-assassination-factory-israel-cal...
https://www.theguardian.com/world/2024/apr/03/israel-gaza-ai...
Open weights (and open inference code) is NOT open source, but just some weak open washing marketing.
The model that comes closest to being TRULY open is AI2’s OLMo. See their blog post on their approach:
https://blog.allenai.org/hello-olmo-a-truly-open-llm-43f7e73...
I think the only thing they’re not open about is how they’ve curated/censored their “Dolma” training data set, as I don’t think they explicitly share each decision made or the original uncensored dataset:
https://blog.allenai.org/dolma-3-trillion-tokens-open-llm-co...
By the way, OSI is working on defining open source for AI. They post weekly updates to their blog. Example:
https://opensource.org/blog/open-source-ai-definition-weekly...
You’re missing a then to your if. What happens if it’s “truly” open per your definition versus not?
I imagine its main use would be to train other models by distilling them down with LoRA/quantization, etc. (assuming we have a tokenizer), or by using them to generate training data for smaller models directly, as sketched below.
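A rough sketch of the synthetic-data route, assuming a local Ollama-style endpoint (the model tag, endpoint, and prompts are all placeholders):

    # Sketch: sample completions from a big model to build a small-model
    # training set. Model tag and endpoint are assumptions/placeholders.
    import json, requests

    prompts = ["Explain HTTP caching briefly.", "What is a B-tree?"]
    with open("distill_data.jsonl", "w") as f:
        for p in prompts:
            r = requests.post("http://localhost:11434/api/generate",
                              json={"model": "llama3.1:70b", "prompt": p,
                                    "stream": False})
            f.write(json.dumps({"prompt": p,
                                "completion": r.json()["response"]}) + "\n")
    # A smaller model can then be fine-tuned (e.g. with LoRA) on this file.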
But, I do think there is always a way to share without disclosing too many specifics, like this[1] lecture from this year's spring course at Stanford. You can always say, for example:
- The most common technique for filtering was using voting LLMs (without disclosing said LLMs or the quantity of data).
- We built on top of a filtering technique for removing poor code using ____ by ____ authors (without disclosing exactly how you filtered, but saying that you had to filter).
- We mixed a certain proportion of this data with that data to make it better (without saying what proportion).
[1] https://www.youtube.com/watch?v=jm2hyJLFfN8&list=PLoROMvodv4...
I was thinking today about Musk, Zuckerberg and Altman. Each claims that the next version of their big LLMs will be the best.
For some reason it reminded me of one apocryphal cause of WW1, which was that the kings of Europe were locked in a kind of ego driven contest. It made me think about the Nation State as a technology. In some sense, the kings were employing the new technology which was clearly going to be the basis for the future political order. And they were pitting their own implementation of this new technology against the other kings.
I feel we are seeing a similar clash of kings playing out. The claims that this is all just business or some larger claim about the good of humanity seem secondary to the ego stakes of the major players. And when it was about who built the biggest rocket, it felt less dangerous.
It breaks my heart just a little bit. I feel sympathy, in some sense, for the AIs we will create, especially if they do reach the level of AGI. As another tortured analogy, it is like a bunch of competitive parents forcing their children into adversarial relationships to satisfy the parents' egos.
however, the "open-source" narrative is being pushed a bit too much like descriptive ML models were called "AI", or applied statistics "data science". with reinforced examples such as this, we start to lose the original meaning of the term.
the current approach of startups or small players "open-sourcing" their platforms and tools as a means to promote network effect works but is harmful in the long run.
you will find examples of terraform and red hat happening, and a very segmented market. if you want the true spirit of open-source, there must be a way to replicate the weights through access to training data and code. whether one could afford millions of GPU hours or not, real innovation would come from remixing the internals, and not just fine-tuning existing stuff.
i understand that this is not realistically going to ever happen, but don't perform deceptive marketing at the same time.
*I reserve the right to remove this praise if they abuse this open source model position in the future.
With the new model, I am seeing a lot of claims about how open source it is and how it can be built upon. Is it now completely open source, or similar to their last models?
Gradient descent works on these models just like the prior ones.
What people are complaining about (totally unreasonably in my view) is obviously Meta is not "open sourcing" all the training data, so nobody can retrain the model from scratch themselves. This argument to me is just silly. The whole point of these models is they distil pretraining on massive data sets you wouldn't have access to otherwise. If you insist on them releasing the data set, they will have to cut it down to 0.1% of the size and you will be getting what you had access to already in the first place.
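For concreteness on the "gradient descent works" point, the usual parameter-efficient route looks roughly like this; the libraries are the standard Hugging Face stack, and the model id and hyperparameters are illustrative assumptions:

    # Rough LoRA fine-tuning sketch with transformers + peft.
    # Model id is the assumed hub name (gated access applies); the
    # hyperparameters are placeholders, not tuned values.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model_id = "meta-llama/Meta-Llama-3.1-8B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    lora = LoraConfig(r=16, lora_alpha=32,
                      target_modules=["q_proj", "v_proj"],
                      lora_dropout=0.05, task_type="CAUSAL_LM")
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()  # a tiny fraction of the full model
    # From here, a standard Trainer loop applies gradient descent as usual.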
My impression is that AI, if done correctly, will be the new way to build APIs over large data sets and information. It can't write code unless you want to dump billions of dollars into a solution with millions of dollars of operational costs, and as it stands it loses context too quickly to do advanced human tasks. BUT it is great at assembling data and information. You know what else is great at assembling data and information? APIs.
Think of it this way: if we can make it faster, and it trains on a data lake for a company, it could be used to return information faster than a nested micro-service architecture that is just a spiderweb of dependencies.
Because AI loses context, simple API requests could actually be more efficient.
Also, are there any "IP" rights attached at all to a bunch of numbers coming out of a formula that someone else calculated for you? (edit: after all, a "model" is just a matrix of numbers coming out of running a training algorithm that is not owned by Meta over training data that is not owned by Meta.)
Meta imposes a notification duty AND a request for another license (no mention of the details of these) for applications of their model with a large number of users. This is against the spirit of open source. (In practical terms it is not a showstopper, since you can easily switch models, although they all have subtly different behaviours and quality levels.)
> Third, a key difference between Meta and closed model providers is that selling access to AI models isn’t our business model. That means openly releasing Llama doesn’t undercut our revenue, sustainability, or ability to invest in research like it does for closed providers. (This is one reason several closed providers consistently lobby governments against open source.)
Maybe this is a strategic play to hurt other AI companies that depend on this business model?
Private repos are not being reproduced by any modern AI. Their source code is safe, although AI arguably lowers the bar to compete with them.
Having run many red teams recently as I build out promptfoo's red teaming featureset [0], I've noticed the Llama models punch above their weight in terms of accuracy when it comes to safety. People hate excessive guardrails and Llama seems to thread the needle.
Very bullish on open source.
Does anyone have details on exactly what this means or where/how this metric gets derived?
We mostly don’t all want or need the hardware to run these AIs ourselves, all the time. But, when we do, we need lots of it for a little while.
This is what Holochain was born to do. We can rent massive capacity when we need it, or earn money renting ours when we don’t.
All running cryptographically trusted software at Internet scale, without the knowledge or authorization of commercial or government “do-gooders”.
Exciting times!
Still huge props to them for doing what they do.
Mostly unrelated to the correctness of the article, but this feels like a bad argument. AFAIK, Anthropic/OpenAI/Google are not having issues with their weights being leaked (are they?). Why is it that Meta's model weights are?
Llama 3.1 Official Launch
By giving away higher and higher quality models, they undermine the potential return on investment for startups who seek money to train their own. Thus investment in foundation model building stops and they control the ecosystem.
- Open training data (this is very big)
- Open training algorithms (does it include infrastructure code?)
- Open weights (result of previous two)
- Open runtime algorithm

- We need to control our own destiny and not get locked into a closed vendor.
- We need to protect our data.
- We want to invest in the ecosystem that's going to be the standard for the long term.
Thank you Meta for being the bright light of ethical guidance for us all.
We don't get the data or training code. The small runtime framework is open source, but that's of little use, as it's largely fixed in implementation by the weights. Yes, we can fine-tune, but that is akin to modifying video games: we can do it, but there's only so much you can do within reasonable effort, and no one would call most video games 'open source'.*
It's freeware, and Meta's strategy is much more akin to the strategy Microsoft used with Internet Explorer to capture the web browser market. No one was saying God bless Microsoft for trying to capture the browser market with IE. There's nothing wrong with Meta's strategy; just don't call it open source.
*Weights are data, and so is the video/audio output of a video game. If we gave away that video game output for free, we wouldn't call the video game open source, as the myriad freeware games essentially demonstrate.
Can't wait to see how the landscape will look in 2027 and beyond.
The actual problem is running these models. Very few companies can afford the hardware to run these models privately. If you run them in the cloud, then I don't see any potential financial gain for any company to fine-tune these huge models just to catch up with OpenAI or Anthropic, when you can probably get a much better deal by fine-tuning the closed-source models.
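The raw memory arithmetic backs this up; the bytes-per-parameter figures are the standard ones, and the 80GB card is an assumption:

    # Back-of-envelope VRAM for weights alone (no KV cache or activations).
    import math

    params_b = {"8B": 8, "70B": 70, "405B": 405}
    bytes_per_param = {"fp16": 2, "int8": 1, "int4": 0.5}
    gpu_gb = 80  # assumed H100/A100-class card

    for name, b in params_b.items():
        for prec, byt in bytes_per_param.items():
            gb = b * byt
            gpus = max(1, math.ceil(gb / gpu_gb))
            print(f"{name} @ {prec}: ~{gb:.0f} GB (~{gpus} GPU(s))")

Even at 4-bit, the 405B's weights alone are ~200GB, so multi-GPU serving is unavoidable.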
Also this point:
> We need to protect our data. Many organizations handle sensitive data that they need to secure and can’t send to closed models over cloud APIs.
First, it's ironic that Meta is talking about privacy. Second, most companies will run these models in the cloud anyway. You can run OpenAI via Azure Enterprise and Anthropic on AWS Bedrock.
    Llama 3 Training System
    Total: 19.2 exaFLOPS
                      |
        +-------------+-------------+
        |                           |
    Cluster 1                   Cluster 2
    9.6 exaFLOPS                9.6 exaFLOPS
        |                           |
    +---+------+                +---+------+
    |          |                |          |
    12K GPUs   12K GPUs         12K GPUs   12K GPUs
    |          |                |          |
    [####]     [####]           [####]     [####]
    400+       400+             400+       400+
    TFLOPS/GPU TFLOPS/GPU       TFLOPS/GPU TFLOPS/GPU

If he really wants to replicate Linux's success against proprietary Unices, he needs to release Llama with some kind of GPL equivalent, one that forces everyone to play the open source game.
They provide their model, with weights and code, as "source available", and it looks like they allow for commercial use until a 700M monthly active user cap is surpassed. They also don't allow you to train other AI models with their model:
""" ... v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Meta Llama 3 or derivative works thereof). ... """
Has anyone tried that?
I hate how the moment when it's too late will be, by design, behind closed doors.
> This is one reason several closed providers consistently lobby governments against open source.
Is this substantially true? I've noticed a tendency of those who support the general arguments in this post to conflate the beliefs of people concerned about AI existential risk, some of whom work at the leading AI labs, with the position of the labs themselves. In most cases I've seen, the AI labs (especially OpenAI) have lobbied against any additional regulation on AI, including with SB1047[1] and the EU AI Act[2]. Can anyone provide an example of this in the context of actual legislation?
> On this front, open source should be significantly safer since the systems are more transparent and can be widely scrutinized. Historically, open source software has been more secure for this reason.
This may be true if we could actually understand what is happening in neural networks, or train them to consistently avoid unwanted behaviors. As things are, the public weights are simply inscrutable black boxes, and the existence of jailbreaks and other strange LLM behaviors shows that we don't understand how our training processes create models' emergent behaviors. The capabilities of these models and their influence are growing faster than our understanding of them, and than our ability to steer them to behave precisely how we want, and that will only get harder as the models get more powerful.
> At this point, the balance of power will be critical to AI safety. I think it will be better to live in a world where AI is widely deployed so that larger actors can check the power of smaller bad actors.
This paragraph ignores the concept of offense/defense balance. It's much easier to cause a pandemic than to stop one, and cyberattacks, while not as bad as pandemics, seem to also favor the attacker (this one is contingent on how much AI tools can improve our ability to write secure code). At the extreme, it would clearly be bad if everyone had access to an antimatter weapon large enough to destroy the Earth; at some level of capability, we have to limit the commands an advanced AI will follow from an arbitrary person.
That said, I'm unsure if limiting public weights at this time would be good regulation. They do seem to have some benefits in increasing research around alignment/interpretability, and I don't know if I buy the argument that public weights are significantly more dangerous from a "misaligned ASI" perspective than many competing closed companies. I also don't buy the view of some in the leading labs that we'll likely have "human level" systems by the end of the decade; it seems possible but unlikely. But I worry that Zuckerberg's vision of the future does not adequately guard against downside risks, and is not compatible with the way the technology will actually develop.
[1] https://thebulletin.org/2024/06/california-ai-bill-becomes-a...
Only the big players can afford to push go, and FB would love to see OpenAI’s code so they can point it to their proprietary user data.
So about all the bots and sock puppets on social media..
Claude is supposed to be better, but it is also even more locked down than ChatGPT.
Word will let me write a manifest for a new Nazi party, but Claude is so locked down that it won't find a cartoon in a picture and Gemini... well.
If AIs are not to harm society, they need to enable us to think in new ways.
And you can't even try it without an FB/IG account.
Zuck will never change.
Why do people keep mislabeling this as Open Source? The whole point of calling something Open Source is that the "magic sauce" of how to build it is publicly available, so I could build it myself if I had the means. But without the training data publicly available, could I train Llama 3.1 if I had the means? No wonder Zuckerberg doesn't start with defining what Open Source actually means, as the blogpost would then have lost all meaning from the get-go.
Just call it "Open Model" or something. As it stands right now, the meaning of Open Source is being diluted by all these companies pretending to do one thing while actually doing something else.
I initially got very excited seeing the title and the domain, but was hopelessly sad after reading through the article and realizing they're still trying to pass their artifacts off as Open Source projects.
This is hard to disagree with.
Not that anyone would go buy 100,000 H100s to train their own Llama, but words matter. Definitions matter.
https://raw.githubusercontent.com/meta-llama/llama-models/ma...
> 2. Additional Commercial Terms. If, on the Llama 3.1 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.
The definition of free software (and open source, for that mater), is well-established. The same definition applies to all programs, whether they are "AI" or not. In any case, if a program was built by training against a dataset, the whole dataset is part of the source code.
Llama is distributed in binary form, and it was built based on a secret dataset. Referring to it as "open source" is not ignorance, it's malice.
1. Meta pushed engineering wages higher across the industry.
2. They promote high performing engineers very quickly. There are engineers making 7 figures there with just a few years experience.
3. They have open sourced the most important frameworks: React and Pytorch
This company is a guiding light forcing the hand of other large corporations. Mark Zuckerberg is a hero, and has done a fantastic job
I don't see open source being able to compete with the cutting-edge proprietary models. There's just not enough money. GPT-5 will take an estimated $1.2 billion to train. MS and OpenAI are already talking about building a $100 billion training data center.
How can you compete with that if your plan is to give away the training result for free?
Because they sold the resultant code and systems built on it for money... this is the gold miner saying that all shovels and jeans should be free.
Am I happy Facebook open sources some of their code? Sure, I think it's good for everyone. Do I think they're talking out of both sides of their mouth? Absolutely.
Let me know when Facebook opens up the entirety of their Ad and Tracking platforms and we can start talking about how it's silly for companies to keep software closed.
I can say with 100% confidence if Facebook were selling their AI advances instead of selling the output it produces, they wouldn't be advocating for everyone else to open source their stacks.
* You can't use them for any purpose. For example, the license prohibits using these models to train other models.
* You can't meaningfully modify them, given there is almost no information available about the training data, how they were trained, or how the training data was processed.
As such, the model itself is not available under an open source license and the AI does not comply with the "open source AI" definition by OSI.
It's an utter disgrace for Meta to write such a blogpost patting themselves on the back while lying about how open these models are.