AlphaFold 3 Code

(github.com)

137 points · MurizS · 1y ago · 23 comments

sigmar · 1y ago

Correct me if I'm wrong, but we have no recent and explicit US gov't guidance on whether these model weights are copyrightable. The Copyright Office has said AI-generated outputs are not copyrightable [1], but hasn't weighed in on weights (?). Kind of seems like that should change?

This wasn't a relevant question for AlphaFold 2, as its weights were released under a CC BY 4.0 license.

These model weights (and many other ML weights) are clearly very useful in commercial settings, but Google thinks it can scare people into not using them with the wording of its license. Are they right?

[1] https://libanswers.baylor.edu/faq/409539

fasa99 · 1y ago

It's deeper than just weights because the topic is biological.

In the old days of genomics there were massive patent wars. First, the Human Genome Project itself. Craig Venter got massive funding to sequence the human genome with the understanding he'd patent all the genes. So there was a space race of sorts where the public sector, led by Francis Collins (later head of the NIH), sought to beat him. It came out a tie (or that's what they called it). Bill Clinton brought them both on a stage and said "great job! also genes aren't patentable!"

Then a whole stink arose around Myriad Genetics, which patented a BRCA test. That's a big-time gene as far as cancer goes (see: Angelina Jolie). Then in 2013 the Supreme Court ruled that naturally occurring genes cannot be patented.

So what is AlphaFold 3? Is it a ground truth of which protein interacts with what? In that case it seems not patentable. Or is it a method, or algorithm, to estimate protein interactions? That's more of a grey area. Idk. If Google wanted to monetize it properly they'd probably keep it as an internal black project and cook up pharma collabs and such. But they've made it public(ish).

Still a long way to go, or at least some more steps. If we say Protein A interacts with Protein B, we then have to ask whether they're expressed in the same cell, which itself is not enough! Most bio measurements are taken in big batches of millions of cells, and it has to be the same cell at the same time. So if our batch is a million cells w/ protein A plus a million cells w/ protein B, then it looks like both are "on" in our batch of 2 million cells. But the truth is more nuanced. And even then there are other considerations, such as post-translational modifications and which cellular compartment these proteins reside in.
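To make the batch problem concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the `bulk_signal` and `coexpressing_fraction` helpers and the 2,000-cell batch are hypothetical, and real assays are far messier. It only shows how a batch-level readout can report both proteins as "on" while no single cell has both.

```python
def bulk_signal(cells):
    """Bulk-style readout: is each protein detected anywhere in the batch?"""
    return {
        "A": any(cell["A"] for cell in cells),
        "B": any(cell["B"] for cell in cells),
    }


def coexpressing_fraction(cells):
    """Single-cell view: fraction of cells with A and B on at the same time."""
    both = sum(1 for cell in cells if cell["A"] and cell["B"])
    return both / len(cells)


# Hypothetical batch: half the cells express only A, the other half only B.
batch = [{"A": True, "B": False}] * 1000 + [{"A": False, "B": True}] * 1000

print(bulk_signal(batch))            # {'A': True, 'B': True}
print(coexpressing_fraction(batch))  # 0.0
```

The bulk readout says both are present, yet the per-cell fraction is zero, so a predicted A-B interaction would have nowhere to actually happen in this batch.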

sebzim4500 · 1y ago

The weights are only provided on demand, presumably only after you've seen a bunch of agreements. I don't think copyright matters.

Hizonner · 1y ago

That has basically zero practical effect. If Google hands out the weights to thousands of people, and even one of them leaks them to somebody who hasn't "seen a bunch of agreements", then Google's only protection against further redistribution is copyright. Which doesn't exist.

Yes, they can come after the leaker. If they can identify the leaker, which they probably can't. But even crucifying the leaker won't put the genie back into the bottle.

sigmar · 1y ago

Yeah, Google can control initial distribution. But in the long term, I believe Google's ability to enforce those Terms of Use (re: no commercial use) is entirely dependent on whether they have IP ownership of the weights.

scotty79 · 1y ago

Can the results of a mathematical computation without any artist's control be copyrightable?

wrsh07 · 1y ago

While not directly analogous, people often think "it's just math" or "it's just numbers" and therefore claim their use is ok. I would encourage those people to read about illegal numbers: https://en.m.wikipedia.org/wiki/Illegal_number
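For anyone wondering how a file can be "a number" at all, here is a minimal Python sketch. The `payload` bytes are just a stand-in for any file's contents (a weights file included), and the helper names are made up for this example; it relies only on the built-in `int.from_bytes` and `int.to_bytes`.

```python
def bytes_to_number(data: bytes) -> int:
    """Read a byte string as one big-endian integer."""
    return int.from_bytes(data, byteorder="big")


def number_to_bytes(number: int, length: int) -> bytes:
    """Turn the integer back into the original byte string of known length."""
    return number.to_bytes(length, byteorder="big")


# Stand-in for the contents of any file.
payload = b"weights"

number = bytes_to_number(payload)
restored = number_to_bytes(number, len(payload))

print(restored == payload)  # True
```

The round trip is lossless, which is the whole point: the integer carries exactly the same information as the file, so calling something "just a number" doesn't change what it is.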

techjamie · 1y ago

It hasn't been weighed in on, but as someone with no legal credentials, it wouldn't surprise me if the ultimate answer on models being copyrightable is "No."

Ultimately, the working parts of a given model are completely unknowable to even the smartest humans once you get to doing anything past bare basics. We know the shape of the model, the number of layers, and what inputs/outputs correlate to, but not really anything else. It's the product of a machine trying things randomly until something works, then the best model produced is selected for production.

Not altogether different, from a high-level perspective, from generating an image or a piece of text using a model. You're introducing a random factor and a number of steps, and the machine uses this unknowable model to produce something a person can understand.

I do think the law should update and grant some protections to people who produce models, because losing all protection would mean the death of open model releases, and then we'd be even more seriously staring down the barrel of corpos controlling the entirety of the technology, more so than we are now. At least open models provide some semblance of control for end users.

vlovich123 · 1y ago

What inference framework is this using to accelerate the math? I only found references to NumPy and JAX, but this doesn’t seem to use TensorFlow?

sakras · 1y ago

Yep, it just uses JAX to do the inference, no TensorFlow needed.

djoldman · 1y ago

It seems you can get the parameters (weights). They are subject to a license agreement:

> 3. Use Restrictions

> You must not use any of the AlphaFold 3 Assets:

> 1. for the restricted uses set forth in the AlphaFold 3 Model Parameters Prohibited Use Policy; or

> 2. in violation of applicable laws and regulations.

AlphaFold 3 Model Parameters Prohibited Use Policy states:

> You must not access or use nor allow others to access or use the AlphaFold 3 Assets:

> On behalf of a commercial organization or in connection with any commercial activities, including research on behalf of commercial organizations.

7373737373 · 1y ago

I wonder if AlphaProof will ever be released
