I was wrong.
More recently, machine learning has really enhanced what you can do with regression, for example multivariate regressions where there are non-linear (or partially linear) relationships between the feature and target variables.
For example, a recent regression problem involved a chemical reaction. It was suspected that a particular feature began to display non-linear behavior above a threshold, but it was difficult to pinpoint exactly where it began departing from linearity. ML was very helpful in analyzing this.
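To make the threshold idea concrete, here is a minimal sketch of one simple way to locate a departure from linearity: fit a straight line plus a hinge term max(0, x - knot) and grid-search the knot that gives the lowest squared error. This is not the method used in the chemical reaction problem above (the comment doesn't say what was used); the data, the function names `fit_hinge` and `find_breakpoint`, and all the numbers are made up for illustration.

```python
import numpy as np

def fit_hinge(x, y, knot):
    """Least-squares fit of y = a + b*x + c*max(0, x - knot).

    Returns the coefficients (a, b, c) and the sum of squared errors.
    """
    X = np.column_stack([np.ones_like(x), x, np.maximum(0.0, x - knot)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)

def find_breakpoint(x, y, n_candidates=200):
    """Grid-search the knot position that minimises the squared error."""
    # Restrict candidates to the central 80% so both sides have data.
    lo, hi = np.quantile(x, [0.1, 0.9])
    candidates = np.linspace(lo, hi, n_candidates)
    sse = [fit_hinge(x, y, k)[1] for k in candidates]
    return float(candidates[int(np.argmin(sse))])

# Synthetic data (hypothetical): linear up to x = 6, steeper above it.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 400)
y = 2.0 + 1.5 * x + 3.0 * np.maximum(0.0, x - 6.0) + rng.normal(0, 0.5, 400)

knot = find_breakpoint(x, y)
coef, _ = fit_hinge(x, y, knot)
print(f"estimated threshold: {knot:.2f}, extra slope above it: {coef[2]:.2f}")
```

A single hinge assumes the relationship is linear on both sides of the threshold. If the behavior above the threshold is curved rather than just steeper, a spline or tree-based model is probably a better fit.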
Other than regression and time series forecasting, I think it's worth knowing about K-means clustering and PCA (Principal Component Analysis) / PLS (Projection to Latent Structures) as well.
I've found PCA to be pretty unknown but very useful. I've had success using it in the past to explain the relationship not just between the data features and the target variable, but also how the features relate to each other.
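As a rough illustration of how PCA shows which features move together, here is a minimal sketch using only NumPy (SVD on standardized data). The feature names `load`, `flow`, and `ambient` and the synthetic data are assumptions for the example, not anything from a real process.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via SVD on standardised columns.

    Returns (scores, loadings, explained_variance_ratio), where loadings
    has one row per feature and one column per component.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    var = S**2 / (len(Z) - 1)
    ratio = var / var.sum()
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components].T
    return scores, loadings, ratio[:n_components]

# Synthetic data (hypothetical): flow follows load, ambient is unrelated.
rng = np.random.default_rng(1)
n = 500
load = rng.normal(0, 1, n)
flow = 0.9 * load + rng.normal(0, 0.3, n)
ambient = rng.normal(0, 1, n)
X = np.column_stack([load, flow, ambient])
names = ["load", "flow", "ambient"]

scores, loadings, ratio = pca(X)
for name, row in zip(names, loadings):
    print(f"{name:8s} PC1={row[0]:+.2f} PC2={row[1]:+.2f}")
print("explained variance ratio:", np.round(ratio, 2))
```

Features with large loadings of the same sign on a component vary together. In this toy data, load and flow should dominate the first component while ambient mostly sits on the second. The overall sign of each component is arbitrary, so only relative signs matter.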
Taking bearing temperature as an example, I think I will identify periods where the machine has already been generating for an hour, so temperatures have stabilized. Then I will have bearing oil inlet temperature and machine load as independent variables, and bearing oil outlet and bearing metal temperatures as dependent variables. Seems like it should be straightforward to find any anomalies, but I only started googling how to do this yesterday. There are lots of vendors hawking predictive maintenance software, but I can't imagine that I couldn't get similar results with a few weeks' effort, armed with Python and all of the associated libraries.
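One simple way to start on this is a residual check: fit a linear baseline on known-healthy, steady-state data, then flag readings that sit far from what the baseline predicts. The sketch below assumes the data has already been filtered to periods at least an hour after start-up. The coefficients, units, the 3-sigma threshold, and the injected 4 degree fault are all made up for illustration, and the helper names `fit_baseline` and `flag_anomalies` are hypothetical.

```python
import numpy as np

def fit_baseline(X, y):
    """Ordinary least squares with an intercept.

    Returns (coefficients, residual standard deviation).
    """
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid.std(ddof=A.shape[1]))

def flag_anomalies(X, y, coef, sigma, k=3.0):
    """True where a reading is more than k residual sigmas from the prediction."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.abs(y - A @ coef) > k * sigma

# Synthetic steady-state data (hypothetical values and units).
rng = np.random.default_rng(2)
n = 1000
oil_in = rng.normal(45.0, 2.0, n)      # bearing oil inlet temperature, deg C
load = rng.uniform(50.0, 100.0, n)     # machine load, MW
metal = 10.0 + 0.8 * oil_in + 0.25 * load + rng.normal(0, 0.5, n)

# Treat the first 800 samples as known-healthy training data.
X = np.column_stack([oil_in, load])
coef, sigma = fit_baseline(X[:800], metal[:800])

# Inject a 4 deg C rise into the last 50 of the remaining 200 samples.
metal_new = metal[800:].copy()
metal_new[150:] += 4.0
flags = flag_anomalies(X[800:], metal_new, coef, sigma)
print(f"flagged {flags[150:].sum()} of 50 faulty samples, "
      f"{flags[:150].sum()} of 150 healthy samples")
```

The same pattern applies to bearing oil outlet temperature as a second dependent variable. Real data will likely need more care, for example slow drift, sensor lag, or a non-linear dependence on load, so treat the linear baseline as a first pass.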
Edit: I've also seen a lot of pitches about predictive maintenance / automated anomaly detection. I think the appeal lies in having a one-size-fits-all solution you can apply to multiple pieces of equipment (fans, conveyor belt drives, pumps, etc.) without needing to develop, deploy, and maintain bespoke models.
A lot of manufacturing sites won't have a data person on tap (or even people who can write Python). There are also challenges with deployment, especially at remote sites where access is difficult and data connectivity is bad (think oil/gas pipelines). Most of the pitches seem to combine running ML models with some kind of IoT device, using something like LoRaWAN for connectivity.
That said, there can be a pretty big gap between detecting individual sensor anomalies (undergrad homework) and predicting component failure (build an entire business around it). I have never regretted starting a data project with a small, easy task and ramping up from there, whereas I have definitely regretted starting a data project with big goals and/or fancy techniques at the beginning. Set clear incremental goals, and use the early prototyping phases to explore the data and develop a good understanding of what might or might not be possible to accomplish with it.
In your example, you could not only pinpoint departure from linearity, but you could get a 95% confidence interval for it.
The best implementation is mgcv in R; pyGAM in Python is OK but lacks many of the more advanced features in mgcv. There's even a more ML-flavored implementation in mboost.
https://learn.microsoft.com/en-us/azure/machine-learning/com...
Once the model has run, it uses something called a mimic to generate model explainability, which lets you explore things like feature importance in the final model. As far as the user interface goes, I mostly used SAS in the past and it feels quite similar.