- Deep learn everything!
vs.
- I took a look at the variable distributions and went with a forest model after transforming some of the data.
[1] http://pages.stern.nyu.edu/~wgreene/DiscreteChoice/Readings/...
Logistic regression in particular offers many diagnostics that show feature importance (or the lack of it) and many metrics to confirm model quality, and it is disappointing to see this post only do a high-level overview. Yes, it may be a Google trade secret, but there has to be give-and-take.
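To make "feature importance" and "metrics" concrete, here is a minimal sketch of the kind of diagnostics a logistic regression gives you. It is not Google's model or data: the dataset is synthetic, the two features are made up (one drives the label, one is pure noise), and every function name is hypothetical. It fits the model with plain gradient descent, then reports coefficients, odds ratios, and held-out log loss and accuracy.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, row):
    # Predicted probability that the label is 1.
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


def fit_logistic(X, y, lr=0.5, steps=300):
    # Plain batch gradient descent on the mean log loss.
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = predict(weights, bias, row) - label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def log_loss(weights, bias, X, y):
    # Mean negative log-likelihood; lower is better, ln(2) is coin-flip level.
    eps = 1e-12
    total = 0.0
    for row, label in zip(X, y):
        p = min(max(predict(weights, bias, row), eps), 1 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(X)


def accuracy(weights, bias, X, y):
    # Fraction of rows classified correctly at a 0.5 threshold.
    hits = sum(
        (predict(weights, bias, row) >= 0.5) == (label == 1)
        for row, label in zip(X, y)
    )
    return hits / len(X)


# Synthetic data (an assumption for illustration only):
# feature 0 drives the label, feature 1 is pure noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(800)]
y = [1 if rng.random() < sigmoid(2.0 * row[0]) else 0 for row in X]

# Simple train/test split: fit on the first 600 rows, evaluate on the rest.
X_train, y_train = X[:600], y[:600]
X_test, y_test = X[600:], y[600:]

weights, bias = fit_logistic(X_train, y_train)
train_loss = log_loss(weights, bias, X_train, y_train)
test_loss = log_loss(weights, bias, X_test, y_test)
test_accuracy = accuracy(weights, bias, X_test, y_test)

print("coefficients:", [round(w, 2) for w in weights])
print("odds ratios: ", [round(math.exp(w), 2) for w in weights])
print("train log loss:", round(train_loss, 3))
print("test log loss: ", round(test_loss, 3))
print("test accuracy: ", round(test_accuracy, 3))
```

The coefficient on the noise feature should come out near zero while the informative one is clearly larger, which is the "importance or lack thereof" signal. A real analysis would also want standard errors and calibration checks, which this sketch leaves out.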
> it is disappointing to see this post only do a high-level overview. Yes, it may be a Google trade secret, but there has to be give-and-take.
Why exactly? This is not an academic paper. People who get this feature to show up might be curious about how it works, and 99.9% of them won't understand anything about independence of features, train/test split, etc. Worse, they would find the article too boring and technical. Just knowing that it is powered by an ML algorithm (and not some human input) is enough. I'm not sure why there has to be a give-and-take.
The fact that they put links to wikipedia for what Logistic regression is should give a good idea of the intended audience of this blog post.
> The fact that they put links to wikipedia for what Logistic regression is should give a good idea of the intended audience of this blog post.
The Wikipedia page on logistic regression is an order of magnitude more technical than this blog post.
It's not that any of those things are necessarily true, it's just that I'm used to people at least trying to make a plausible case that they weren't.
That should read, "from users who did not disable the on-by-default sharing of their location data"
I assume dispersion of parking locations is the distance from parking location to destination? I would have liked to see more about what kinds of inputs they used and how they cleaned them up to account for the confounding factors they mention (public transit users, private parking).
I would guess it's the density of parking locations in a given area, rather than distance to destination?
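The two readings really are different quantities, and a small sketch shows how they can disagree. Neither is confirmed by the post. The coordinates below are invented planar positions in meters, and both function names are hypothetical.

```python
import math


def mean_distance_to_destination(spots, destination):
    # Reading 1: average straight-line distance from each parking spot
    # to the destination.
    return sum(math.dist(s, destination) for s in spots) / len(spots)


def spread_around_centroid(spots):
    # Reading 2: how scattered the parking spots are among themselves,
    # measured as the RMS distance from their own centroid.
    cx = sum(x for x, _ in spots) / len(spots)
    cy = sum(y for _, y in spots) / len(spots)
    return math.sqrt(
        sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in spots) / len(spots)
    )


destination = (0.0, 0.0)

# Hypothetical case A: everyone parks in one lot, far from the destination.
one_far_lot = [(500.0, 0.0), (510.0, 0.0), (500.0, 10.0), (510.0, 10.0)]

# Hypothetical case B: people park closer, but scattered in every direction.
scattered_nearby = [(100.0, 0.0), (-100.0, 0.0), (0.0, 100.0), (0.0, -100.0)]

for name, spots in [("one far lot", one_far_lot),
                    ("scattered nearby", scattered_nearby)]:
    print(name)
    print("  mean distance to destination:",
          round(mean_distance_to_destination(spots, destination), 1))
    print("  spread around centroid:",
          round(spread_around_centroid(spots), 1))
```

Case A has the larger distance to destination but the smaller spread, and case B is the reverse, so which definition the model uses would change what the feature means.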
This shows pretty clearly that we shouldn't try to accommodate cars as much as possible when there already is good public transport at a certain location.
Related techniques and how to implement them are covered in the first 2 weeks. While a lot more is going on in this system, one could call the core of the system that does this estimation "simple" for the field.