> I want to emphasize that historically, from the very first moment somebody thought of computers, there has been a notion of: “Oh, can the computer talk to me, can it learn to love?” And somebody, some yahoo, will be like, “Oh absolutely!” And then a bunch of people will put money into it, and then they'll be disappointed.
Reminds me of a pre-transistor computing quote from Charles Babbage, about some overeager British politicians:
> On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
I remember hearing from some old salt in the oil business that geologists are the wrong people to ask about peak oil. They always underestimated future discoveries. The ones that tended to get it right were financiers and investors.
The idea is that geologists have their noses down in the details of practical, useful knowledge that they have or can get. Financiers don't really know anything, just that wells have been found in the past. They just model things like exploration money, the rate and quality of new finds, oil prices, production costs...
There could be something similar here. The real technology people see mostly problems. All the stuff that would need to be solved, that they have no idea how to solve. The fact that we don't even know what intelligence is. The frauds making audacious claims.
Outsiders see drones, self-driving cars, spam filters, Google search, chess, face recognition, translation, chatbots^. They see that voice recognition now works. I reckon medical diagnosis might do something soon. In any case, it seems that one way or another, these add up to something. ...just as a hunch.
Obviously I don't know the answer and this whole comment is based on an anecdote that may not even be true. Still, I don't discount the possibility that the unwashed masses are right.
^just kidding
The term green lumber refers to a story by authors Jim Paul and Brendan Moynihan in their book What I Learned Losing A Million Dollars, where a trader made a fortune trading lumber he thought was literally "green" rather than fresh cut.[26] "This gets at the idea that a supposed understanding of an investment rationale, a narrative or a theoretical model is unhelpful in practical trading."[27]
The protagonist makes a big discovery. He remarks that a fellow named Joe Siegel, one of the most successful traders in a commodity called "green lumber," actually thought that it was lumber painted green (rather than freshly cut lumber, called green because it had not been dried). And he made it his profession to trade the stuff! Meanwhile the narrator was into grand intellectual theories and narratives of what caused the price of commodities to move, and went bust. It is not just that the successful expert on lumber was ignorant of central matters like the designation "green." He also knew things about lumber that nonexperts think are unimportant. People we call ignorant might not be ignorant. The fact is that predicting the order flow in lumber and the usual narrative had little to do with the details one would assume from the outside are important. People who do things in the field are not subjected to a set exam; they are selected in the most nonnarrative manner—nice arguments don’t make much difference.[25]
1. The Fifth Generation Project (https://en.wikipedia.org/wiki/Fifth_generation_computer) was a 1980s effort, officially ending in 1992, not 'late 1990s' (during the Dot-com bubble?!).
2. The Lisp bubble didn't pop because of a failed DoD piloting project; it popped because of the first AI Winter + commodity SPARC/x86 pressure + recession (https://en.wikipedia.org/wiki/Lisp_machine) (and I don't recall DARPA instituting any policy like 'no AI', just stopping subsidizing Symbolics and later Connection Machine).
3. The Club of Rome report couldn't've killed its modeling language, because it only really acquired its present ill repute by the 1990s; the implementation language Modelica (https://en.wikipedia.org/wiki/Modelica) didn't die (last release: April 2017) and is still in industrial use, which is more than almost all languages from the 1960s-1970s can say, and even the World3 model (https://en.wikipedia.org/wiki/World3) analyzed in the report continued development for decades.
4. The Oxford paper (https://www.fhi.ox.ac.uk/wp-content/uploads/The-Future-of-Em...) doesn't make precise forecasts for when any automation may happen (merely saying "associated occupations are potentially automatable over some unspecified number of years, perhaps a decade or two").
5. The GPU server comparison is really weird, as computers have almost always cost more than humans, and only relatively recently do any computers' hourly costs fall below minimum wage.
6. The Dartmouth description is wrong; the conference merely proposed (http://www-formal.stanford.edu/jmc/history/dartmouth/dartmou...) that meaningful progress could be made by 10 researchers, not grad students ("We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College...We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.")
Also, come on dude, Keras isn't hard to use - it's not even comparable to Tensorflow. But at least he didn't tell the tank story.
Despite all that, it's a great antidote to the overhype that I see most days.
Luckily math has developed methods such as error-detecting/error-correcting codes (to insure against small typos/transmission errors) and constructive results on continuity and robustness of functions (i.e. we can prove that if the error in the input data is less than some concretely computable delta, the solution will have an error less than epsilon; or we can ensure that the error in the solution is less than some computable epsilon if we can ensure that the error in the input data "is not too large", i.e. bounded by some computable delta), etc.
In this sense I don't consider the question as that absurd.
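To make the error-correcting-code point concrete, here's a minimal sketch (my own toy example, nothing from the thread) using a 3x repetition code: any single flipped bit per triple is recovered by majority vote, which is exactly the "insure against small transmission errors" idea.

```python
# Toy 3x repetition code: each bit is sent three times and decoded
# by majority vote, so any single transmission error per triple is corrected.
def encode(bits):
    return [b for b in bits for _ in range(3)]

def decode(coded):
    return [1 if sum(coded[i:i + 3]) >= 2 else 0
            for i in range(0, len(coded), 3)]

msg = [1, 0, 1, 1]
sent = encode(msg)
sent[4] ^= 1                 # flip one bit in transit (a small "typo")
assert decode(sent) == msg   # the error is detected and corrected
```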
People using Babbage's machine would have entered raw information into that thing. No error-correcting code would correct the human-induced flaws in that. So the question was absurd at the time.
All these solutions are good for a noisy input, but have no use when the input is incorrect (i.e. doesn't match reality).
A colleague of mine called these "educated incapacities" - where we become acutely aware of impossibilities and lose sight of possibilities. Andrej Karpathy, in one of his interviews iirc, said something like "if you ask folks in nonlinear optimization, they'll tell you that DL is not possible".
It is useful to keep that innocence alive despite being educated, especially if the cost of trying something out doesn't involve radical health risks. That plus a balance with scholarship.
Knowledge, courage and the means to execute are all needed.
I sincerely doubt anyone who knows more than one sentence about deep learning would say that, since deep learning doesn't claim to find a global optimum.
There's a small cottage industry of papers (like [0]) that try to explain this.
Regarding deep NNs, one should be careful what one wishes for, because sometimes wishes come true. Ending up with the global optimum of that thing would likely be the last thing one wants.
The key to deep NNs is to do such a pathetic job of optimizing the loss that the generalization is good. A problem is that there are several different ways of doing a job poorly, and not all of them generalize well. When I have my engineer hat on, I would rather not have lots of indeterminism on my watch if I can afford it. Too dang hard to maintain correctness of.
On the other hand if one has a "with high probability" style result where the probabilities are high enough to be practically relevant, then we have something more workable.
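For what it's worth, the "deliberately poor job of optimizing" idea has a very mundane concrete form: early stopping. A rough sketch (my own toy in plain numpy, nothing tuned) of halting gradient descent when held-out error stops improving, long before reaching any optimum:

```python
# Rough sketch of early stopping: stop gradient descent on the training
# loss as soon as the held-out (validation) error stops improving.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
true_w = np.zeros(50)
true_w[:5] = 1.0
y = X @ true_w + rng.normal(scale=0.5, size=200)

X_tr, y_tr, X_val, y_val = X[:100], y[:100], X[100:], y[100:]

w = np.zeros(50)
best_val, best_w, stall = np.inf, w.copy(), 0
for step in range(5000):
    grad = X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)   # gradient of mean squared error
    w -= 0.05 * grad
    val = np.mean((X_val @ w - y_val) ** 2)
    if val < best_val - 1e-6:
        best_val, best_w, stall = val, w.copy(), 0
    else:
        stall += 1
        if stall >= 25:   # validation error has stopped improving: quit early
            break
print(step, round(best_val, 3))
```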
As I see more of the failure modes of deep learning, a lot of successes and mistakes made by humans start to become more understandable. Machines don't need to be perfect or avoid failures; like humans, they need to work most of the time and then be used in systems that are tolerant of their potential faults and mistakes.
The whole interview was an absolute joy to read.
Minimum wage (or thereabouts, $7.20) now gets you a whopping p2.8xlarge (8 GPUs, 32 vCPUs, 488 GB RAM), and the single-GPU machine p2.xlarge is now $0.90 per hour.
This is a crazy data point. What will minimum wage buy you five years from now?
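Quick back-of-envelope with the numbers quoted above (taking the US federal minimum wage as $7.25/hour; the instance prices are the on-demand figures from the comment):

```python
min_wage   = 7.25   # $/hour, US federal minimum wage
p2_8xlarge = 7.20   # $/hour, 8 GPUs (as quoted above)
p2_xlarge  = 0.90   # $/hour, 1 GPU  (as quoted above)

print(min_wage / p2_8xlarge)  # ~1.0: one hour worked buys ~one hour of the 8-GPU box
print(min_wage / p2_xlarge)   # ~8.1: or ~eight hours of the single-GPU box
```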
That's completely over-simplifying matters. Data scientists also drink soy lattes and ride children's push scooters.
As another person who's seen robots fall over again and again and has a sense of the scope of the difficulty of the problem, I'd say there's also the risk of the day-to-day failures making us lose sight of the forest for the trees, with availability bias working against us.
Also,
> the Y Combinator autistic Stanford guy thing
> the Aspy worldview
It's a bit worrying that the use of these terms has turned into a kind of slur, lumping a kind of imagined stunted worldview with a medical diagnosis. I'm not particularly pissed that this guy used these, more worried about what it indicates - that these have become so common as to infiltrate friendly informal conversations from seemingly intelligent people.
I wish I saw more than one or two of these a year.
It’s shocking to me how much technical people buy into this, how “this time it’s different” and AI isn’t “over-promising and substantially under-delivering” this time. Really odd to watch it come round again, when the reality is we’re more likely to see incremental progress, fueled partly by more compute and algorithmic advances, and partly by a lot of PR.
Also, we judge the difficulty of things by our own experience. It took us ~1 billion years to get to the point where we could communicate abstract ideas and play chess. These were once believed to be the challenging problems in AI.
It turned out that chess is easy; we're just relatively bad at it.
What's yet to be seen is if startups can profit from this advance, since it depends on massive data and compute.
Technically you could do a lot of the decision-making it'll be doing with human-made models and a lot of data, but the machine is cheaper and it's backed by consulting agencies.
RPA was the first indication. It's basically screen scraping and small bots, stuff that's been around for a long time; I mean, it's basically what people use to bot in video games. Yet it's become a multimillion-dollar industry over the course of a few years because it caught the right drift.
Like RPA, machine learning isn't just hype. It actually does some things with data really well, and when you couple that with the fact that ministers want this tech, well, that's all you need.
Anyway, overall great article, but this was the one thing that bothered me enough to comment.
> It’s really bad to use. There’s so much hype around it, but the number of people who are actually using it to build real things that make a difference is probably very low.
I wonder how many data scientists out there are actually developing Tensorflow models for a mission-critical project at work. I'm not. I have used Tensorflow successfully within my personal projects, but I've yet to need it for anything "real."
That service offering was marginal relative to the rest of the business, so it never became something our sales team pitched to customers very aggressively; in this particular case TensorFlow did not move the needle, so to speak.
It is probably quite far from standard usage, but TensorFlow can be used to write custom graphical-model inference, for example. To be practical, these algorithms can't be implemented in, say, pure Python.
The point is that TensorFlow gets you pretty close to assembly-level computation. The alternative is to write in, say, Cython, which is much more time-consuming and does not give you parallelization for free. Another alternative I guess would be Torch, but that is the same as TensorFlow the way I see it.
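As a hedged illustration of the "TensorFlow as a generic numerics engine" point (not the graphical-model inference the parent describes, just a minimal 1.x-style sketch of building and running a non-deep-learning computation graph):

```python
# Minimal TensorFlow 1.x-style sketch: no neural network, just a compute
# graph fitting a line by gradient descent, run on whatever device TF picks.
import numpy as np
import tensorflow as tf

x = np.random.rand(1000).astype(np.float32)
y = 3.0 * x + 2.0 + 0.1 * np.random.randn(1000).astype(np.float32)

w = tf.Variable(0.0)
b = tf.Variable(0.0)
loss = tf.reduce_mean(tf.square(w * x + b - y))          # mean squared error
train = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(200):
        sess.run(train)
    print(sess.run([w, b]))   # should land near (3.0, 2.0)
```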
> And even up to last year, there’s just massive bugs in the machine learning libraries that come bundled with Spark. It’s so bizarre, because you go to Caltrain, and there’s a giant banner showing a cool-looking data scientist peering at computers in some cool ways, advertising Spark, which is a platform that in my day job I know is just barely usable at best, or at worst, actively misleading.
By way of anecdote, Spark's MLlib used to contain an implementation of word2vec that failed when used on more than 2 billion words (some arcane integer overflow). So much for scale!
As for performance, in 2016, the break-even point where a Spark cluster started being competitive with a single-machine implementation was around 12 Spark machines (a bit of a hindrance to rapid iterative development, which is the cornerstone of R&D): https://radimrehurek.com/florence15.pdf
I think most people don't have big data (Amazon has an x1 with 4 TB of RAM, after all!) but there's no shame in that. I'll use a big machine for grid search or other embarrassingly parallelizable stuff, but I can confirm that Spark is usually a bad tool for actual ML unless you use one of their out-of-the-box algos. Even then, tuning the cluster on EMR with YARN is a pain, especially for pyspark. There's a gap, I think, between the inflated expectations of "I'm going to get general AI in 5 years and CHANGE THE WORLD" and "this K-means clustering will be a good way to explore our reviews", but somewhere in the middle there is actual value.
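A hedged sketch of that "one big machine for embarrassingly parallel stuff" workflow (my own example using scikit-learn's bundled digits data, not anything from the interview): hyperparameter search fans out across local cores, no cluster required.

```python
# Embarrassingly parallel hyperparameter search on a single machine:
# GridSearchCV fits one model per (parameter combo, CV fold) across all cores.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_digits(return_X_y=True)
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10, 20]},
    cv=5,
    n_jobs=-1,   # use every local core instead of a Spark cluster
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```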
(I also hate that "AI" is becoming the new hype train; I don't consider anything of what I do to be "AI", but you have people calling CNNs or even non-deep-learning models "AI".) This is only going to result in inflated expectations. DS practitioners have to communicate the value without hype, and also find a way to weed out charlatans.
I think much of the negativity towards DS from the programming community is because the Data Scientist is what the programmer used to be ~15 years ago. It's that nerdy thing for a select group of very smart people, whereas being a software developer/engineer/architect/whatever has become just another common job (at least outside of Silicon Valley).
Also, from my experience as the lone developer taking the first steps to implement machine learning techniques in my company - lots of developers also think DS/ML is a cool thing with value, but they simply, absolutely don't understand it (and don't want to put in the effort to learn). These techniques are not hard and not magic, but they require a completely different way to think about problems than "traditional" programming does. I've seen developers up and down the hierarchical ladder struggle with wrapping their heads around these concepts, and it's way easier to dismiss it all as "hype" instead of accepting the fact that these techniques will be a huge part of what software development will look like in the future.
But all those things people did in the '90s or even earlier. It was called "data warehousing" or "decision support" back then. The fundamental techniques - linear regression, logistic regression, k-means clustering - go back even earlier, to the OR community post-WW2. Banks have been doing credit scoring with these techniques for a loooong time. The manufacturing industry has been using these techniques for even longer. Engineering for even longer than that.
So you can see why people are quite cynical about the way old, established techniques are being presented as the hot new thing - and you can see why people who have been doing this stuff for 20+ years might be annoyed at 20-somethings who claim to have invented this new thing. What's wrong with someone calling themselves a "statistician" or an "applied mathematician"?
But this is by no means purely a DS thing; it seems no one is a programmer anymore either, they're all "senior certified enterprise solution architects" or some grandiose thing.
I would say data warehousing is more concerned with things like OLAP, Star Schema, ETL, etc. than what people are calling 'data science' right now. The same goes for 'decision support', since data warehousing grew out of decision support systems. The biggest overlap here is with 'data mining' algorithms like association rules and clustering.
> The fundamental techniques - linear regression, logistic regression, k-means clustering - go back even earlier, to the OR community post-WW2.
Here I think you've got a stronger argument. OR has a long, proud history of using applied math for business objectives. But again, I would say most of OR deals with different problems and different techniques - it's more about prescriptive analytics, constrained optimization, linear programming, simulations, etc. than the type of predictive modeling in most data science.
I see data science as a separate field even though it's stitched together from a bunch of others. It's certainly not entirely new, and certainly overhyped in some annoyingly-breathless news reports. I could say the same thing about CS - was it entirely "new" when it started as a discipline? Isn't CS "just" applied math?
To be fair, few of the "senior architects" I've worked with in big companies knew how to program very well.
There are two groups:
- People who are overly enthusiastic about neural nets
- People who are cynically calling every ML algorithm "AI", up to and including linear regression
and I'm more annoyed at the last one.
[1]: https://projecteuclid.org/download/pdf_1/euclid.aoms/1177704...
I personally think that the main reason why general AI may be very far away is that there is little incentive today for working on it. Specialized AI seems good enough to drive cars. Specialized AI should be good enough to put objects in boxes, cut vegetables, flip burgers and so on, and the economic impact of building that is much greater than the economic impact of making a robot that barely passes the Turing test and is otherwise fairly dumb or ethically unbounded.
This isn't really true, since this can be said of any ML model. ML is nothing new. Deep learning is new. It works because we have so much data that we can start to extract complex, nonlinear patterns.
Brilliant.
> Because the frightening thing is that even if you remove those specific variables, if the signal is there, you're going to find correlates with it all the time, and you either need to have a regulator that says, “You can use these variables, you can't use these variables,” or, I don't know, we need to change the law. As a data scientist I would prefer if that did not come out in the data. I think it's a question of how we deal with it. But I feel sensitive toward the machines, because we're telling them to optimize, and that's what they’re coming up with.
So is he saying that he is worried optimisation throws up results that are not what he would like to see?
You're looking to pick the fastest runners out of a group of people. You run an optimization algorithm to pick out the fastest in that group. Nothing about this optimization accounts for the fact that 1/3 of the people in the group were shot in the foot with a gun prior to your optimization. The data will show that they are poor runners without addressing the crime previously committed. In fact, many people would consider it a second act of crime.
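To spell out why the optimizer looks "correct" while still compounding the harm, here's a tiny simulation (my own numbers, purely illustrative): underlying ability is identical across groups, but one group's observed times carry an external penalty, and selecting on observed times excludes that group almost entirely.

```python
# Toy simulation of the runner analogy: equal ability everywhere, but one
# group's observed times include a penalty imposed before the "optimization".
import numpy as np

rng = np.random.default_rng(0)
n = 9000
ability = rng.normal(10.0, 1.0, n)                # true sprint times, same for all groups
injured = np.arange(n) % 3 == 0                   # a third of the group was "shot in the foot"
observed = ability + np.where(injured, 3.0, 0.0)  # the harm shows up only in the data

fastest = np.argsort(observed)[:1000]             # "optimize": take the 1000 best observed times
print(injured[fastest].mean())                    # ~0.0: the injured group is filtered out
```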
We know that no manager got fired for choosing Java.
There is a researcher's version of that: no researcher got fired for making a neural network more 'convoluted'. It helps if there exists one dataset where it does 0.3% better. Doesn't matter if that data set is (and has been since the late '90s) standard fare as a homework problem in a machine learning course.
That said we do understand these things a bit better than before. Some concrete math is indeed coming out.
I personally don’t like the phrase data scientist but I get it and I get why it’s science as opposed to engineering. I personally like the split between machine learning, BI, and data engineering.