How to explain gradient boosting

(explained.ai)

34 points · parrt · 7y ago · 16 comments

parrt (OP) · 7y ago

Gradient boosting machines (GBMs) are currently very popular and so it's a good idea for machine learning practitioners to understand how GBMs work. The problem is that understanding all of the mathematical machinery is tricky and, unfortunately, these details are needed to tune the hyper-parameters. (Tuning the hyper-parameters is required to get a decent GBM model unlike, say, Random Forests.) Our goal in this article is to explain the intuition behind gradient boosting, provide visualizations for model construction, explain the mathematics as simply as possible, and answer thorny questions such as why GBM is performing “gradient descent in function space.” We've split the discussion into three morsels and a FAQ for easier digestion. Written by Terence Parr and Jeremy Howard.
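As a rough illustration of the intuition described in the post (this is not code from the article), here is a minimal sketch of gradient boosting for squared-error regression on one feature. It assumes depth-1 trees ("stumps") as the weak learners; the names `fit_stump`, `gradient_boost`, `n_stages`, and `learning_rate`, and the toy data, are all made up for the example.

```python
from statistics import mean

def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None
    # Candidate thresholds: every distinct x except the smallest,
    # so both sides of the split are non-empty.
    for t in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < t]
        right = [r for x, r in zip(xs, residuals) if x >= t]
        lm, rm = mean(left), mean(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm

def gradient_boost(xs, ys, n_stages=50, learning_rate=0.1):
    """Additive model: start from the mean, then repeatedly fit the residuals."""
    f0 = mean(ys)
    stumps = []
    preds = [f0] * len(xs)
    for _ in range(n_stages):
        # What the current model still gets wrong on each training point.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Take a small step toward the targets.
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]

    def model(x):
        return f0 + learning_rate * sum(s(x) for s in stumps)
    return model

def mse(model, xs, ys):
    return mean([(y - model(x)) ** 2 for x, y in zip(xs, ys)])

# Toy data, invented for this example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3, 6.8, 8.1]

baseline_mse = mse(lambda x: mean(ys), xs, ys)   # predict the mean everywhere
model = gradient_boost(xs, ys)
boosted_mse = mse(model, xs, ys)
```

On the training data, `boosted_mse` ends up below `baseline_mse`, because each stage removes part of the remaining residual. Training error alone says nothing about generalization, which is where hyper-parameters such as the number of stages and the learning rate come in.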

nonbel · 7y ago

>"The problem is that understanding all of the mathematical machinery is tricky and, unfortunately, these details are needed to tune the hyper-parameters."

You don't need to understand anything about the math to run a random search, a grid search, Bayesian optimization, or whatever other search of the hyperparameter space.
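For readers unfamiliar with the kind of search mentioned here, the following is a minimal sketch of a random search, assuming a small discrete search space. The function `random_search`, the parameter names, and `toy_score` are hypothetical; in practice `toy_score` would be replaced by something like a cross-validated score of a GBM trained with the given parameters.

```python
import random

def random_search(score, space, n_trials=30, seed=0):
    """Sample random parameter combinations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        s = score(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score

# Illustrative search space; names and values are assumptions for this sketch.
space = {
    "learning_rate": [0.01, 0.03, 0.1, 0.3],
    "max_depth": [1, 2, 3, 4, 6],
    "n_estimators": [50, 100, 200, 400],
}

def toy_score(params):
    # Hypothetical stand-in for a validation score (higher is better).
    return -abs(params["learning_rate"] - 0.1) - 0.01 * abs(params["max_depth"] - 3)

best_params, best_score = random_search(toy_score, space)
```

The search treats the model as a black box: it only needs a score for each candidate, which is the point being made in the comment above.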

parrt (OP) · 7y ago

True, people use a grid search, but I am always very uncomfortable using things as black boxes. How does tree depth affect generality, for example? Effectively using a model means understanding your tools, in my view, but it is easy to get started w/o the math, as you say!


uptownfunk · 7y ago

Ahh, it’s because of people like you that I’ll always have a job. He’s right, folks. Please don’t try to understand the math behind it; actually, the less you know the better.


s-shellfish · 7y ago

Can you find multiple mathematical foundations to explain it from?

I feel like the lack of connection between the different pieces of math requires one to understand all of the math, which is very difficult to do.

Is there any way to explain gradient boosting via category theory?

parrt (OP) · 7y ago

I’m not sure about the connection to category theory. This is mostly an attempt to explain why this model works, that it is performing gradient descent in a particular space. We find that extremely challenging to explain to students. I would be interested to know if you feel the article helps in that regard. Thanks
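A short worked version of the "gradient descent in a particular space" claim, assuming squared-error loss (other losses give different gradients). The symbols $F$, $h_m$, and $\eta$ are notation chosen for this sketch and may not match the article's. With $L(y_i, F(x_i)) = \tfrac{1}{2}\,(y_i - F(x_i))^2$, the negative gradient with respect to the model's prediction at each training point is the residual:

$$-\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)} = y_i - F(x_i)$$

Each boosting stage fits a weak learner $h_m$ to those residuals and takes a step of size $\eta$ in that direction:

$$F_m(x) = F_{m-1}(x) + \eta\, h_m(x), \qquad h_m(x_i) \approx y_i - F_{m-1}(x_i)$$

So the "space" is the vector of predictions $(F(x_1), \dots, F(x_n))$, and training on residuals moves that vector along an approximation of the negative gradient of the loss.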
