Not necessarily, at least on the first point. Someone could be getting coached.
A few years ago, a coworker of mine hired a contractor onto his team and was convinced the person who actually showed up was not the person he had interviewed (over the phone). He also thought the guy who did show up was getting a lot of day-to-day help from somewhere. Since the guy was a contractor it wasn't a huge problem, because we could drop him quickly, but I would never have expected someone to do anything like that. However, it kind of makes sense as a scam: be a decent developer, get a stable of unhirable incompetents, and rotate them through companies while taking a cut of their salary.
Some of us believe instead in the advantage of being a polymath, (also) in order to be able to import wisdom from other contexts into the current work.
Also in terms of providing the proper ground to facilitate innovation.
That possibility, by the way, makes the interviewer's cautionary move generally useless.
The fact that he tries this in manufacturing makes the case stronger. In most manufacturing companies you do not have access to top ML talent.
You have Greg, who knows Python and recently visualized some production metrics.
If we could empower Greg with automated ML libraries that guide him through the data preparation steps, in combination with pre-built networks like AutoGluon, then manufacturing could become a huge beneficiary of the ML revolution.
OR is perfect when you can describe explicitly what the decision space is and what the restrictions are.
ML is a great fit when you want to identify and use patterns. Quality control with machine vision is a good application for ML. NLP for PDF documents is a huge field for manufacturing as well. Companies have so much data sitting in email attachments that they do not currently take advantage of.
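To make the OR side of that contrast concrete, here is a minimal sketch where the decision space and the constraints are written out explicitly and then solved exactly (by brute force, which is fine at this toy scale). The jobs, hours, and profits are made-up numbers, not from any real system.

```python
# Toy OR-style problem: pick which of 4 production jobs to run this
# shift, maximizing profit under a machine-hours budget. The decision
# space (run each job or not) and the constraint (hours budget) are
# stated explicitly -- exactly the setting where OR shines.
from itertools import product

jobs = {"A": (3, 40), "B": (2, 25), "C": (5, 70), "D": (4, 45)}  # (hours, profit)
HOURS_BUDGET = 8

def best_schedule():
    best, best_profit = None, -1
    # Enumerate the whole decision space: every subset of jobs.
    for choice in product([0, 1], repeat=len(jobs)):
        names = [n for n, c in zip(jobs, choice) if c]
        hours = sum(jobs[n][0] for n in names)
        profit = sum(jobs[n][1] for n in names)
        # Constraint: total hours must fit within the shift.
        if hours <= HOURS_BUDGET and profit > best_profit:
            best, best_profit = names, profit
    return best, best_profit

print(best_schedule())  # -> (['A', 'C'], 110)
```

For real problem sizes you would hand the same explicit formulation to an LP/MIP solver instead of enumerating, but the modeling step is identical.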
My thought is that Goldratt's "The Goal" / theory of constraints is a useful way of thinking about optimizing throughput in a computer system. http://www.qdpma.com/Arch_files/RWT_Nehalem-5.gif plus an instruction latency table is something like a well-modeled factory. (The Phoenix Project applies these principles to project management, which I think is a somewhat less useful analogy!)
I'm curious about applying existing tools to modeling things like: how will this multi-tiered application behave when it gets a thundering herd of requests? What if I tweak these timeouts, adjust this queue, make a particular system process requests on a last-in-first-out basis? Can I get a pretty visualization of what would happen?
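As a rough sketch of that kind of what-if question (not using any existing simulation tool), here is a toy single-server queue under sustained overload, comparing FIFO with LIFO processing. The arrival rate, service time, and timeout are invented numbers.

```python
# Toy queue simulation: requests arrive at 2/s into a server that
# handles 1/s; clients give up after a 5 s timeout. We count how many
# requests complete before their client times out, under FIFO vs LIFO.

def simulate(lifo, arrivals, service=1.0, timeout=5.0):
    waiting = []                     # arrival times of queued requests
    t, i, ok = 0.0, 0, 0
    while i < len(arrivals) or waiting:
        # Admit everything that has arrived by the current time.
        while i < len(arrivals) and arrivals[i] <= t:
            waiting.append(arrivals[i])
            i += 1
        if not waiting:
            t = arrivals[i]          # server idle: jump to next arrival
            continue
        arrived = waiting.pop() if lifo else waiting.pop(0)
        t += service                 # serve one request
        if t - arrived <= timeout:   # did the client still care?
            ok += 1
    return ok

herd = [i * 0.5 for i in range(40)]  # a sustained burst of 40 requests
print("FIFO:", simulate(False, herd))
print("LIFO:", simulate(True, herd))
```

Under overload, LIFO completes more requests within the timeout: fresh requests are served while they still have time left, and the stale ones (which would have timed out anyway) absorb the starvation. That is the intuition behind LIFO/adaptive queues in some production systems.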
so funny, because so accurate :)
Big data is fairly important to a lot of things. For example, I was listening to a talk on Tesla's use of deep-net models, where they mentioned that there were literally so many variations of stop signs that they needed to learn what was really in the "tail" of the distribution of stop-sign types to construct reliable AI.
Your brain already knows how to select the most important features of a sign: the shape, the size, and the color. You have also learned how to understand the text on the sign.
A newborn baby does not have that ability.
This is applied in ANNs as well. Transfer learning means taking a pre-trained neural network that has already learned to identify objects and training it to identify a new, usually smaller, set of objects, usually with a lot less training data. That is what Andrew is talking about in the article.
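A minimal sketch of that idea, with the "pretrained" part reduced to a fixed feature function standing in for frozen layers; only a small linear head is trained, on just three examples. Everything here (the feature map, the toy task, the hyperparameters) is invented for illustration.

```python
def pretrained_features(x):
    # Frozen "backbone": a stand-in for a network pretrained on a big
    # dataset and reused as-is. (Hypothetical feature map.)
    return [x, x * x, 1.0]

def train_head(data, lr=0.05, epochs=2000):
    # Only the small task-specific linear head gets trained.
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            feats = pretrained_features(x)
            err = sum(wi * fi for wi, fi in zip(w, feats)) - y
            w = [wi - lr * err * fi for wi, fi in zip(w, feats)]
    return w

# "Few-shot" task: fit y = x^2 + 1 from only three labeled examples.
data = [(0.0, 1.0), (1.0, 2.0), (2.0, 5.0)]
w = train_head(data)
pred = sum(wi * fi for wi, fi in zip(w, pretrained_features(3.0)))
print(round(pred, 2))  # close to 10 (= 3^2 + 1)
```

Because the backbone already encodes the useful features, three examples suffice for the head; learning the features themselves from three points would be hopeless.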
The NN state-of-the-art models of today are so different from the state of the art 12 or so years ago, which was SVMs.
FTFY.
Yet Tesla has been working on both the hardware and software for 10 years? Amazing progress, right?
For instance, an English speaker and a non-English speaker may listen to someone speaking English and while the auditory signals received by both are the same, the meaning of the speech will only be perceived by the English speaker. When we’re learning a new language, it’s this ‘knowledge’ aspect that we’re enhancing in our brain, however that is encoded.
This knowledge part is what allows us to see what’s not there but should be (e.g. the curious incident of the dog in the night) and when the data is inconsistent (e.g. all the nuclear close calls). I’m really not sure how this ‘knowledge’ part will be approached by the AI community but feel like we’re already close to having squeezed out as much as we can from just the data side of things.
Somewhat related, we have a saying in Korean – ‘you see as much as you know’.
It does in general, but what is elaborated, and how? Structuring patterns is not the same as "knowledge" (there are missing subsystems), and the data is not fed with ideal efficiency - compare with a regime in which, told a notion once, you acquire it (and this while CS is one of the disciplines focused on optimization, so efficiency would be a crucial point).
Does anyone know if this might be true mathematically speaking? Does the order of the data matter?
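For plain SGD, at least, yes: with a finite learning rate the order of the updates changes the final parameters. A toy sketch (one weight, two samples, made-up numbers):

```python
def sgd(samples, lr=0.5, w=0.0):
    # One pass of plain SGD on squared error for the model y = w * x.
    for x, y in samples:
        w -= lr * (w * x - y) * x
    return w

a = sgd([(1.0, 1.0), (2.0, 1.0)])  # one order
b = sgd([(2.0, 1.0), (1.0, 1.0)])  # same data, reversed
print(a, b)  # 0.5 1.0 -- same data, different final weight
```

In the limit of many epochs with a decaying learning rate the order washes out for convex problems, but for deep nets trained with a fixed budget, ordering effects are real (curriculum learning exploits exactly this).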
Animals solve this problem by having bodies and moving around. It is that we take the bent stick out of the water which allows us to impart a theory to the "data" we receive... a theory implicit in our actions.
Since we are causally active in the world, sequenced in time, and directly changing it -- our bodies enable us to resolve this problem. The motor system is the heart of intelligence, not the frontal lobe -- which is merely book-keeping and accounting for what our bodies are doing.
Did this make any of you a little queasy?
it's easy to get complacent and focus on building big datasets. in practice, looking at the data often reveals issues, sometimes in data quality and sometimes in the scope of what's in there (if you're missing key examples, it's simply not going to work).
most ml is actually data engineering.
I wonder if, assuming the data is of the highest quality with minimal noise, having more data will still matter for training or not. And if it matters, to what degree?
In general you want to add more variants of the data, but not so many that the network doesn't get trained by them. Typical practice is to find images whose inclusion causes high variation in final accuracy (under k-fold validation, i.e. removing or adding the image causes a big difference) and prefer more of those.
Now, why not simply add everything? Well, in general it takes too long to train.
How do you identify these images? It sounds like I'd need to build small models to see the variance but I'm hoping that there's a more scientific way?
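One simple, if brute-force, way to do what the parent describes is a leave-one-out check: retrain a cheap proxy model with and without each example and score the change in validation accuracy. The 1-NN "model" and the data below are hypothetical stand-ins, a sketch of the idea rather than the more scientific method asked about (influence functions and data-Shapley methods approximate this without full retraining).

```python
# Leave-one-out influence: examples whose removal hurts validation
# accuracy the most are the high-influence samples worth collecting
# more of. (Toy scalar data, hypothetical labels.)

def knn_predict(train, x):
    # 1-nearest-neighbor on scalar features, as a cheap proxy model.
    return min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(train, val):
    return sum(knn_predict(train, x) == y for x, y in val) / len(val)

def influence(train, val):
    base = accuracy(train, val)
    scores = {}
    for i in range(len(train)):
        rest = train[:i] + train[i + 1:]
        # Positive score: removing the example lowered accuracy.
        scores[i] = base - accuracy(rest, val)
    return scores

train = [(0.0, "ok"), (1.0, "ok"), (5.0, "defect")]
val = [(0.5, "ok"), (4.5, "defect"), (6.0, "defect")]
scores = influence(train, val)
print(scores)  # the lone "defect" example has the highest influence
```

At real scale you would swap the proxy for a small version of your actual model and batch the retraining, but the ranking logic is the same.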
Of course few-shot learning is important for models, but for Pathways, for example, it was already part of the evaluation.
At first glance it seems like the hassle of integrating such a product into an existing ML codebase/pipeline is larger than solving the problem by hand.
I also want cars that run on salt water.
I'm not saying that small-data AI is equally impossible, but simply saying "we should make this better thing" isn't enough.
Setting aside the references to his company, which has customers and a product that already works on these principles, the literature currently shows that this is very much possible if you dig into the correct niches. Beyond the SOTA in few-shot and meta-learning, it is possible to smartly choose the few samples for the network that yield the same results.
It has also been my primary focus for the past 5 years and the core of the company I founded.
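One common way to "smartly choose the few samples" is core-set selection. A minimal sketch using greedy k-center (farthest-point) selection on scalar features, with made-up data; real pipelines run the same greedy loop over embedding distances.

```python
# Greedy k-center core-set selection: pick a small subset that covers
# the dataset, instead of sampling uniformly at random.

def k_center_greedy(points, k):
    # Seed with the first point, then repeatedly add the point that is
    # farthest from everything selected so far.
    selected = [points[0]]
    while len(selected) < k:
        far = max(points, key=lambda p: min(abs(p - s) for s in selected))
        selected.append(far)
    return selected

# Three clusters of raw samples; 3 picks land one in each cluster.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 9.8, 10.0]
print(k_center_greedy(data, 3))  # -> [0.0, 10.0, 5.0]
```

Note how the selection spans all three clusters where random sampling of 3 points could easily miss one; that coverage is what lets a few samples stand in for many.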
And then someone takes a pretrained 500B model, fine-tunes it on your few examples, and gets a new SOTA.
Already in 2018, SenseTime reported that for face recognition, a clean dataset surpasses the accuracy of a 4x larger raw dataset.
Still, the article seemed to show a very conservative Ng on the algorithms side, with the focus on data management - so it's still ML.