Boltzmann Encoded Adversarial Machines

(arxiv.org)

205 points · davelope911 · 8y ago · 23 comments

cs702 · 8y ago

Very interesting.

At a high level (ignoring many details) the main idea is to replace generator networks in GANs with Restricted Boltzmann Machines, or RBMs, which are easier to train (more stable). The authors call this kind of architecture "Boltzmann Encoded Adversarial Machines," or BEAM for short.
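For anyone who, like me, has not touched an RBM in a while: a minimal sketch of how a binary RBM generates a sample by block Gibbs sampling. This is the textbook Bernoulli-Bernoulli RBM, not the paper's code; the weights `W`, the biases, and all function names are made up for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_hidden(v, W, b_h, rng):
    # p(h_j = 1 | v) = sigmoid(b_h[j] + sum_i v[i] * W[i][j])
    probs = [sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(b_h))]
    return [1 if rng.random() < p else 0 for p in probs]

def sample_visible(h, W, b_v, rng):
    # p(v_i = 1 | h) = sigmoid(b_v[i] + sum_j W[i][j] * h[j])
    probs = [sigmoid(b_v[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b_v))]
    return [1 if rng.random() < p else 0 for p in probs]

def gibbs_sample(W, b_v, b_h, steps, rng):
    # Start from a random visible vector and alternate h | v and v | h.
    v = [rng.randint(0, 1) for _ in b_v]
    for _ in range(steps):
        h = sample_hidden(v, W, b_h, rng)
        v = sample_visible(h, W, b_v, rng)
    return v

# Hypothetical toy parameters: 3 visible units, 2 hidden units.
W = [[2.0, -1.0], [2.0, -1.0], [-1.0, 2.0]]
b_v = [0.0, 0.0, 0.0]
b_h = [0.0, 0.0]

rng = random.Random(0)
sample = gibbs_sample(W, b_v, b_h, steps=50, rng=rng)
```

The point is only that "generation" here means running a Markov chain on the model itself, rather than pushing noise through a feedforward generator.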

The experiments provide persuasive evidence that BEAMs outperform GANs. Figure 3, in particular, I find very persuasive -- it compares the ability of different architectures to learn to generate low-dimensional mixtures of Gaussians, with BEAMs very clearly outperforming GANs. The results in higher-dimensional applications such as image generation also suggest that BEAMs outperform GANs, but the improvement is somewhat more subjective due to the nature of high-dimensional data. Obviously, these results need to be replicated by others.

It looks promising to me. That said, it's been years since I've touched an RBM -- I only have a vague recollection of how they work and how they're trained, layer by layer, as proposed by Hinton in 2006 or so. Time to re-read old papers!

drams · 8y ago

To clarify: in the case of a BEAM, both the generator and all but the top layer of the discriminator are replaced with an RBM. The adversary in this case operates on features encoded by the RBM, not raw data samples. Secondly, the RBM is trained with a combined loss involving log-likelihood and the adversarial term.
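Written out, a combined loss of this kind might look like the following. The weighting $\gamma$ and the symbol names are assumptions for illustration, not notation taken from this thread; see the paper for the exact form.

```latex
\mathcal{C}
  = \gamma \, \mathcal{C}_{\mathrm{adv}}
  + (1 - \gamma) \, \mathcal{C}_{\mathrm{NLL}},
\qquad 0 \le \gamma \le 1
```

Here $\mathcal{C}_{\mathrm{NLL}}$ stands for the negative log-likelihood term and $\mathcal{C}_{\mathrm{adv}}$ for the adversarial term computed on the RBM-encoded features; $\gamma = 0$ would recover plain likelihood training.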

cs702 · 8y ago

Yes. For simplicity's and brevity's sake, I ignored many important details in my summary.

cycrutchfield · 8y ago

I am not entirely convinced. In particular, the results shown in Fig 7 remind me of the BEGAN paper which was similarly hyped. But I'll defer further judgment until I read through it more and maybe run some experiments.

DanWaterworth · 8y ago

There's a good reason that the pictures look similar. Both architectures produce somewhat blurry images.

The problem, in BEGAN's case, is that when your idea of similarity is based on mean squared error, high-frequency details are just not important. [1] You can see this by doing PCA on natural image patches. BEGAN uses an autoencoder trained on MSE.
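A toy way to see the MSE point without any image data: build a 1-D signal whose frequency components fall off as 1/k (a rough stand-in for the spectrum of natural images; that falloff is an assumption of this sketch), then compare the MSE cost of throwing away the top half of the frequencies with the cost of throwing away just the lowest one. All names below are illustrative.

```python
import math

N = 64   # samples in the toy 1-D "image row"
K = 16   # number of frequency components, all below Nyquist (N / 2)

def component(k, n):
    # Sinusoid of frequency k with amplitude 1 / k.
    return math.sin(2.0 * math.pi * k * n / N) / k

def signal(freqs):
    freqs = list(freqs)
    return [sum(component(k, n) for k in freqs) for n in range(N)]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

full = signal(range(1, K + 1))
no_high = signal(range(1, K // 2 + 1))   # drop frequencies 9..16 ("blurred")
no_low = signal(range(2, K + 1))         # drop only frequency 1

mse_drop_high = mse(full, no_high)
mse_drop_low = mse(full, no_low)
print(mse_drop_high, mse_drop_low)
```

Dropping eight high-frequency components costs far less MSE than dropping the single lowest one, so a model trained on MSE has little incentive to get fine detail right.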

RBMs produce blurry images because the architecture is not good at representing multiplicative interactions. You just get splodges of colour.

[1] http://danielwaterworth.com/posts/what's-wrong-with-autoenco...

w-m · 8y ago

It's amazing to see deep learning blast through all the benchmarks, for example in computer vision, over the last couple of years. At the same time something starts to feel off about having all these single-use asymmetric feedforward networks solving their own little task. Being trained in one direction, then used in the other, then thrown away. Maybe being chained together for a more complex task, but that seems to be about it for the average (real-world application) use case of deep learning nets.

I'm sure there's plenty of interesting work being done in ML to improve on this situation and come up with new architectures. Yet I was moderately surprised when I rediscovered Boltzmann machines recently, and found not much work seemed to be going on there at all (very little at NIPS 2017 for example?).

This BEAM seems intriguing; here's hoping it opens the door to a better understanding and modeling of our world.

nafizh · 8y ago

RBMs went out of fashion after 2010-2011, as other architectures worked better than them in almost all of the tasks in vision.

rm_-rf_slash · 8y ago

I have had a similar thought recently. The office I work at has a large e-recycling bin for old computers. I have recovered quite a few desktops, laptops, and monitors, as well as a bunch of tidbits like adapters and RAM.

A lot of the RAM, for instance, is DDR2 and usually a measly 1 GB apiece. The sticks take up the exact same amount of space as RAM with 4 GB apiece or more. I don’t know entirely why I still have them. Now that I’m doing physical computing/IoT development, I’m seeing how pointless it is to have a bunch of desktops/laptops when I can get much more done - conveniently, I might add - with a teeny tiny RedBear microcontroller.

I think an inherent feature of technology is having to get used to the idea that things age and die much faster than other products. Whether that’s physical hardware or trained neural networks, there comes a point when we just have to let go.

radarsat1 · 8y ago

I found this one pretty interesting in that regard. The basic gist is that learning the function that projects onto the boundary of the dataset is useful for a variety of (linear inverse) problems.

One Network to Solve Them All --- Solving Linear Inverse Problems using Deep Projection Models https://arxiv.org/abs/1703.09912

rememberlenny · 8y ago

Can someone explain the basic implications of this against current GANs and also provide a practical ML application?

drams · 8y ago

I can try. (I am a coauthor of this paper.) First off, Unlearn.ai is a startup working to build new tools that make precision medicine a reality. We needed to be able to build generative models which:

1. model multimodal data easily (consider medical datasets with categorical, binary, and continuous data, with various bounds etc., all mixed together);
2. can answer counterfactual questions about data (for example, if I downregulate a gene, how does this affect the rest of the gene expression?);
3. handle time-series data (give me a likely progression of this person's cognitive scores given their current scores and other indicators).

RBMs are natural candidates for models which handle these kinds of issues quite well:

1. Although people have done work trying to get GANs to work well with multimodal data, it's pretty kludgy.
2. GANs do not provide a means of inference (in contrast to VAEs, which can satisfy this demand).
3. We have built a solid extension of RBMs to temporal models which work quite well.

However, as explained in this paper, stock RBMs have significant training issues. This paper attempts to improve the situation.

tlarkworthy · 8y ago

RBMs have a native probabilistic output (the output is a distribution you can slice), but vanilla neural networks don't (the output is a vector). Is that right?
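One way to make "a distribution you can slice" concrete, as a sketch only: for an RBM small enough to enumerate, you can compute the full distribution over visible units from the standard energy function and then condition on one unit being clamped. The toy weights and the function names are hypothetical, and a real model is far too large for brute-force enumeration, so in practice this is done by sampling.

```python
import itertools
import math

def energy(v, h, W, b_v, b_h):
    # Standard binary RBM energy: E(v, h) = -b_v.v - b_h.h - v^T W h
    e = -sum(b_v[i] * v[i] for i in range(len(v)))
    e -= sum(b_h[j] * h[j] for j in range(len(h)))
    e -= sum(v[i] * W[i][j] * h[j]
             for i in range(len(v)) for j in range(len(h)))
    return e

def visible_distribution(W, b_v, b_h):
    # p(v) is proportional to the sum over h of exp(-E(v, h)).
    n_v, n_h = len(b_v), len(b_h)
    weights = {}
    for v in itertools.product([0, 1], repeat=n_v):
        weights[v] = sum(math.exp(-energy(v, h, W, b_v, b_h))
                         for h in itertools.product([0, 1], repeat=n_h))
    z = sum(weights.values())  # partition function
    return {v: w / z for v, w in weights.items()}

def condition(dist, index, value):
    # "Slice" the distribution: keep states with v[index] == value, renormalize.
    kept = {v: p for v, p in dist.items() if v[index] == value}
    total = sum(kept.values())
    return {v: p / total for v, p in kept.items()}

# Hypothetical toy parameters: 3 visible units, 2 hidden units.
W = [[2.0, -1.0], [2.0, -1.0], [-1.0, 2.0]]
b_v = [0.0, 0.0, 0.0]
b_h = [0.0, 0.0]

p_v = visible_distribution(W, b_v, b_h)
p_given_v0_on = condition(p_v, index=0, value=1)
```

A plain feedforward network gives you one output vector per input; here the model itself defines probabilities over every configuration, which is what makes conditional and counterfactual queries natural.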

1 more reply

babak_ap · 8y ago

Is there an open-source reference implementation available (on GitHub or similar)?

TheAnig · 8y ago

I too was interested in this

drams · 8y ago

See the comment above.

MarkMMullin · 8y ago

I'm wondering if the work on adversarial systems, this one being quite interesting, can help us with our giant bugaboo of "OMG, it's overfitted :-(". Right now we model, train, test, fail, and start all over again, and usually fiddle with the hyperparameters to boot. What would happen if we turned training into a two-phase approach, with a BEAM/GAN/whatnot used on each cycle to measure how 'brittle' the backprop is? The idea would be to round down the spikes in the learned model by penalizing the backprop when it is too narrow. Training would take longer, but we'd throw away fewer sets, I'd think.

bra-ket · 8y ago

can this be applied to sequence learning?

johnfactorial · 8y ago

Just what the AI/ML crowd needs in the midst of burgeoning fear of AI: a new technology with "Adversarial Machines" right there in the name.
