Decentralized Artificial Intelligence

(chaos-engineering.dev)

87 points · liqudity · 2y ago · 43 comments

32 comments · 15 top-level

lappa · 2y ago · 6 in thread

This seems half-baked and there are numerous faulty assumptions in this article. For example, Bitcoin miners cannot compute gradients. Their ASICs can only calculate double SHA-256.

Additionally, the premise of sending gradients of models trained on private data while retaining privacy seems problematic. While you likely can't reverse it to calculate the batch's contents, it is leaky.

Further, gradient calculations are not a good proof of work. A good proof of work is difficult to calculate and easy to verify (i.e. an extremely low hash value).
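
To make the asymmetry concrete, here is a minimal sketch of a Hashcash-style puzzle using double SHA-256. The header bytes, the 8-byte nonce encoding, and the function names are assumptions made for illustration and are not Bitcoin's actual block format. Finding a nonce takes about 2**difficulty_bits hash attempts on average, while checking one takes a single double hash.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, the only function the ASICs mentioned above compute."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def meets_target(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Verification: one double hash and one integer comparison."""
    digest = double_sha256(header + nonce.to_bytes(8, "little"))
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


def mine(header: bytes, difficulty_bits: int) -> int:
    """Search: try nonces in order until one produces a low enough hash."""
    nonce = 0
    while not meets_target(header, nonce, difficulty_bits):
        nonce += 1
    return nonce


header = b"toy block header"  # hypothetical payload, not a real block
nonce = mine(header, difficulty_bits=16)
print(nonce, meets_target(header, nonce, 16))
```

A gradient has no comparable shortcut: without extra machinery, checking it means recomputing it.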

The core premise, using blockchains to store terabytes of models and datasets, doesn't make any sense whatsoever.

The problems highlighted in this article however are valid, and it would be great to see something like IPFS for datasets and models.

jerpint · 2y ago

This was my reaction too. How do you “prove” my gradient is valid?

Well, perhaps one way is to have another LLM look at the data you are submitting and predict p(useful|not useful), creating an incentive for users to generate authentic data.

synctext · 2y ago

Indeed very half-baked.

This must be quite an insult to the inventor of fully decentralised AI. In 2012 Prof. Jelasity invented gradient mining with SGD (https://arxiv.org/pdf/1109.1396). It works reasonably well; you probably also need a ledger like Trustchain for accountability and Sybil protection. Consensus is too costly. This problem is unsolved and has been actively worked on in academia for decades.

riku_iki · 2y ago

> This was my reaction too. How do you “prove” my gradient is valid?

You can occasionally, at random, ask 4 different nodes to calculate the same gradient and see who is cheating.
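
A rough sketch of that spot check, assuming an honest majority and a gradient that is deterministic up to rounding. The toy 1-D model, the node names, and the `audit` helper are all hypothetical. Rounding before comparison is a crude way to tolerate small floating-point differences between machines, and values near a rounding boundary could still disagree.

```python
from collections import Counter


def gradient(w, batch):
    """Gradient of mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def audit(reports, decimals=6):
    """Return (majority value, sorted ids of nodes that disagree with it)."""
    rounded = {node: round(g, decimals) for node, g in reports.items()}
    majority, _ = Counter(rounded.values()).most_common(1)[0]
    cheaters = sorted(n for n, g in rounded.items() if g != majority)
    return majority, cheaters


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
honest = gradient(0.5, batch)
# Node "d" skipped the work and reported a made-up value.
reports = {"a": honest, "b": honest, "c": honest, "d": 0.0}
print(audit(reports))
```

The cost is plain in the sketch: every audited step is computed four times, which is the objection raised further down the thread.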

m00dy · 2y ago

Ian (Goodfellow), is that you?

3cats-in-a-coat · 2y ago

The saying "blockchains are a solution in search of a problem" gets funnier every day.

tromp · 2y ago

> A good proof of work is difficult to calculate and easy to verify (i.e. an extremely low hash value).

That's a specific Proof-of-Work system known as Hashcash [1]. There are many others, such as Cuckoo Cycle, in which the solution is a fixed-length cycle in a random bipartite graph [2].

[1] https://en.wikipedia.org/wiki/Hashcash

[2] http://cryptorials.io/beyond-hashcash-proof-work-theres-mini...

arisAlexis · 2y ago · 5 in thread

Chaos engineering may sound cool and cyberpunky, but it's not what we need right now for AI safety. Unless you guys want to have your kids tortured by Moloch.

w4ffl35 · 2y ago

AI safety is a joke.

eddmakesio · 2y ago

A way for the humanities to get some of the CS STEM funding by writing sci-fi.

Edit: The number of AI safety sessions I’ve joined where speakers with no real AI experience talk about potentially bad futures, based on zero CS experience and little ‘evidence’ beyond existing sci-fi books and anecdotes, has left me very jaded on the subject as a ‘discipline’.

I believe it comes down to three groups:

AI researchers/organisations wanting to make their work sound very important/scary

Humanities researchers wanting STEM funding

AI organisations trying to bring in legislation to slow down competition

owlbite · 2y ago

AI safety for a lot of the movie-style concerns is pretty easy: don't connect it to the f*king nuclear launch system you moron. Ensure that any output on the launch decision path is checked by a human that actually engages their brain.

(Alas, if you wanted to avoid "killer robots" you're already too late: see that whole episode where a certain large military power decided that selecting cellphones to target for killing based on "metadata" could be done by a system not unlike the Facebook algorithm, with the only safeguard being some wet-behind-the-ears kid piloting the drone from halfway around the world with a "computer says die" attitude.)

jgalt212 · 2y ago

Indeed. It makes sociology look like a rigorous discipline.

arisAlexis · 2y ago

Most of the most prominent AI researchers, including 2 out of 3 Turing award winners plus Ilya Sutskever, are very, very sold on AI safety. Imagine being a rando on the internet calling their research a joke. Imagine the audacity and ego. Human ego is such a weird thing.

bastawhiz · 2y ago · 4 in thread

> a cryptographically secure, decentralized ledger is the only solution to making AI safer.

This article manages to say that a consensus algorithm is the answer to problems with AI without using the word "consensus" a single time. Probably because it doesn't even consider that's what blockchains are.

The article talks about using "proof of gradient" to do inference instead of crunching hashes. But this is nonsense, because inference takes inputs and produces a deterministic output. There's no mining. Checking the work takes the same resources as doing the work. Proof of work output can be checked with essentially a single hash.

As much as this would be wonderful in a universe where it's possible, it's simply not possible. The author throws out a bunch of buzz words for things that sound similar in AI and crypto and tries to make them sound like they are interchangeable. They're not.

m3kw9 · 2y ago

It’s standard for white papers to include equations and theories that take effort to prove wrong, and they usually are wrong (the point is to get lazy investors to buy in). Even from skimming a few parts, the analogy between finding the right hashes and calculating gradients does not make any sense, so I just stopped there.

jasonmorton · 2y ago

> Checking the work takes the same resources as doing the work.

> As much as this would be wonderful in a universe where it's possible, it's simply not possible.

We do live in that universe, under some currently believed assumptions. An NP-complete problem is an example of something where checking a solution is (thought to be) easier than finding one.
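
As a small illustration of that gap, take subset sum, a standard NP-complete problem. This is only a sketch with made-up numbers: checking a claimed solution is linear in its size, while the naive search below may have to try up to 2**n subsets. Whether a fundamentally faster search exists is exactly the open question behind "thought to be".

```python
from itertools import combinations


def verify(numbers, target, certificate):
    """Check a claimed solution: distinct, in-range indices that sum to target."""
    return (
        len(set(certificate)) == len(certificate)
        and all(0 <= i < len(numbers) for i in certificate)
        and sum(numbers[i] for i in certificate) == target
    )


def search(numbers, target):
    """Brute force: try every subset of indices, smallest subsets first."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in indices) == target:
                return indices
    return None


numbers = [3, 34, 4, 12, 5, 2]
certificate = search(numbers, 9)
print(certificate, verify(numbers, 9, certificate))
```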

Zero-knowledge proofs make it such that checking that a computation (such as inference) has been done correctly is easier than doing the computation (even keeping some parts private). A great reference is here: https://people.cs.georgetown.edu/jthaler/ProofsArgsAndZK.pdf

visarga · 2y ago

You can run the same gradient step on multiple nodes and only credit them when they agree.

bastawhiz · 2y ago

That's pointless work because there's no incentive to do the "mining" (because nobody "wins"). If you implement a system to choose a "winner", then you didn't need the computations in the first place because you've invented proof of stake.

injeolmi_love · 2y ago · 1 in thread

If the problem is version control, why not use git? It’s also easy to fork, and it’s much cheaper than a blockchain.

carom · 2y ago

Git does not work well for binary blobs. Consider how you review a pull request.
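
One common workaround for large binary artifacts, in the spirit of the content addressing mentioned upthread, is to version a small manifest of chunk hashes instead of the blob itself. The sketch below is hypothetical: the chunk size of 4 bytes and the helper names are chosen only to keep the example readable, and real tools differ in the details. It at least lets a reviewer see which chunks changed, though not whether the change is any good.

```python
import hashlib


def manifest(blob: bytes, chunk_size: int = 4):
    """Split a blob into fixed-size chunks and return their SHA-256 hex digests."""
    return [
        hashlib.sha256(blob[i:i + chunk_size]).hexdigest()
        for i in range(0, len(blob), chunk_size)
    ]


def changed_chunks(old, new):
    """Indices whose digest differs, plus chunks present in only one version."""
    common = min(len(old), len(new))
    diff = [i for i in range(common) if old[i] != new[i]]
    return diff + list(range(common, max(len(old), len(new))))


v1 = bytes(range(16))              # stand-in for serialized weights
v2 = v1[:9] + b"\xff" + v1[10:]    # one byte changed, inside the third chunk
print(changed_chunks(manifest(v1), manifest(v2)))
```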

2358452 · 2y ago · 1 in thread

Do you want Terminator? Because this is how you get Terminator. A blockchain-based distributed giant AI could go rogue and it would be very difficult to stop it.

jazzyjackson · 2y ago

Wouldn't it be reliant on the nodes of the blockchain continuing to run it?

And if the crypto-economy is doing well enough that node-runners are economically incentivized to continue running, then is the rogue AI even that harmful?

paulsutter · 2y ago

Blockchains are designed to avoid double spending in a low trust environment.

Double spending isn't one of the problems in AI. This article is absolute nonsense

> a cryptographically secure, decentralized ledger is the only solution to making AI safer

jasonmorton · 2y ago

One can cryptographically prove the correct inference of a small AI model now with https://github.com/zkonduit/ezkl (our open-source package).

api · 2y ago

Very important topic, but this is a bad article whose authors have their heads too far into cryptocurrency which IMHO is a mostly failed approach to decentralization.

The decentralized AI that is working is just sharing models. Splitting the inference across the net for cooperative execution is possible.

The thing that really needs to be solved is distributed training. We need to be able to train base models in a manner more like Folding@Home so we’re not captive to organizations with big expensive GPU farms to make any innovation.
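
The arithmetic that makes this plausible in principle is simple: for a loss that is a mean over examples, the size-weighted average of per-shard gradients equals the full-batch gradient. Below is a toy sketch with a 1-D model and two hypothetical volunteers. It deliberately ignores what makes the real problem hard, namely bandwidth, stragglers, and untrusted workers.

```python
def gradient(w, shard):
    """Gradient of mean squared error for the 1-D model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def averaged_gradient(w, shards):
    """Weight each worker's gradient by its shard size, then combine."""
    total = sum(len(s) for s in shards)
    return sum(gradient(w, s) * len(s) for s in shards) / total


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [data[:1], data[1:]]      # two volunteers holding unequal shards
w = 0.0
for _ in range(200):
    # Each step only needs one number per worker, not the raw data.
    w -= 0.01 * averaged_gradient(w, shards)
print(round(w, 3))
```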

damc4 · 2y ago

In order to solve inequality as a result of AI, you don't need to decentralize AI; you can just decentralize the governance of AI (while the AI itself remains centralized). And of course the people who work with that AI would be subject to the law, so if they did something other than what people democratically decided, they would be penalized.

And that is safer, because if everyone has access to the model, then people can use it in unsafe ways. It would also be much more difficult to enforce the laws when you have to control what everyone in the world does, rather than just one or a few entities (companies, organizations, governments, whatever...). Unless the article means decentralizing it in such a way that different nodes store different parts of the model, instead of all nodes having the entire model (in that case, it's fine).

kordlessagain · 2y ago

It's valuable to examine the challenges in machine learning without assuming decentralization as a solution:

> High Cost and Resource Requirements

For training and local inferencing use, quantization may help. Problem becomes local via quantization vs. remote full tensor use. Solution may involve distributed inferencing. Techniques like model distillation can help create smaller, more efficient models for inferencing.
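
For readers unfamiliar with quantization, here is a minimal sketch of symmetric int8 quantization of a weight list. The weights are invented and the scheme is the simplest one available (a single scale per tensor), so treat it as an illustration of the size versus accuracy trade-off rather than what any particular runtime does.

```python
def quantize(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q, scale):
    """Map the stored integers back to approximate float weights."""
    return [v * scale for v in q]


weights = [0.12, -0.5, 0.33, 1.27, -1.0]   # made-up values
q, scale = quantize(weights)
restored = dequantize(q, scale)
# Each weight now fits in one byte; the error is at most half a quantization step.
print(q, max(abs(a - b) for a, b in zip(weights, restored)))
```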

> Data Privacy

For training, some private datasets may be needed. For local inferencing use, determining what needs to be inferenced locally vs. what needs to be run remotely may be useful. Problem becomes privacy scope mapped onto a marketplace to mitigate high cost and resource requirements. Techniques like model explainability (versioning) and robustness testing can help build trust in AI systems.

Complying with data privacy regulations and ensuring that AI systems adhere to legal and ethical standards can be a challenge, especially in international contexts.

> Incentives

Instead of assuming the solution when considering the problem, we assume there is an incentive to either simply train a model or use one. Problem becomes financial rewards, data access agreements, or even altruistic motivations.

> Stale Data and Reproducibility

Both the code and datasets for training the model need to be updated. Inferencing needs RAG, so the augmented reference data needs to be updated as well. Anything updated might need some type of revision control, especially if that data (or code) results in poor output. Labeling data and knowledge transfer are other problems that need revision control.

> Interoperability

We can assume a marketplace for a ML train/inference platform is needed. We have HuggingFace, for example. The problem here is likely based on the tendency for datasets to be private, such as in the case of Llama 2. Models contain the "essence" of the dataset, but we still need RAG to ground the responses.

The use of the Lightning Network combined with a proposed 402 response code is an interesting concept for addressing some of these challenges: https://github.com/lightninglabs/aperture

It could provide a decentralized and efficient way to facilitate payments for dataset access, training, and inferencing to incentivize data sharing and model usage.
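
A very rough sketch of the pay-per-request idea, not Aperture's actual protocol or API. It relies on two facts: HTTP 402 means "Payment Required", and a Lightning payment hash is the SHA-256 of a preimage that the payer learns only once the payment settles. The header name `X-Payment-Preimage` and both helper functions are hypothetical, and the Lightning node itself is replaced by a stub that simply generates the preimage locally.

```python
import hashlib
import os


def new_invoice():
    """Stub for a Lightning node: return (preimage, payment_hash)."""
    preimage = os.urandom(32)
    return preimage, hashlib.sha256(preimage).hexdigest()


def handle_request(headers, payment_hash):
    """Serve the resource only if the caller proves payment by revealing the preimage."""
    proof = headers.get("X-Payment-Preimage")  # hypothetical header name
    if proof is not None:
        try:
            if hashlib.sha256(bytes.fromhex(proof)).hexdigest() == payment_hash:
                return 200, "dataset bytes"
        except ValueError:
            pass  # malformed hex counts as no proof
    return 402, "Payment Required: pay invoice " + payment_hash


preimage, payment_hash = new_invoice()
print(handle_request({}, payment_hash)[0])
print(handle_request({"X-Payment-Preimage": preimage.hex()}, payment_hash)[0])
```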

nologic01 · 2y ago

> All of this may sound a little ridiculous but it’s not.

Actually it is. But in a perverse way it isn't.

"AI" is currently being developed and commercialized in what is essentially an old-fashioned way, with a whiff of theft of intellectual property of the training material never too far.

Decentralized AI is a distinct destination but the technologies that might realize it need not have anything remotely related to blockchain (a solution that was conceived for a different - perceived - problem, the centralization of money creation)

amelius · 2y ago

> The academic literature is ridden with examples of state of the art (SOTA) models that weren’t reproducible

Probably too late, but Elsevier could fix its bad reputation if they required relevant academic papers to submit reproducible code and subjected them to automated testing.

byteware · 2y ago

Although zero-knowledge proofs can compress the "proof of gradient" so that it takes much less time to verify than to recompute, it takes orders of magnitude more work to generate the proof. Moreover, can we merge multiple "improvements"? If not, then upon the selection of the "winner", are all the others lost?

CatWChainsaw · 2y ago

One buzzword just wasn't enough, was it...

tacone · 2y ago

Next up: the AIs of the world should unionize /s
