
0 points · hackerlight · 2y ago · 0 comments

Okay. And we can also say that there are some time series models that aren't regression models, right? For example, the Kalman filter is a "model" of a time series but isn't a regression.


4 comments · 1 top-level

nerdponx · 2y ago · 3 in thread

Correct.

Although the term "regression" is a misnomer anyway, and often when people say "regression" they mean "linear model". And by "linear model", we mean specifically a model in which outputs/predictions are some fixed linear combination of the input.

It is however possible to interpret the Kalman filter as a kind of dynamic regression model. Check out here if you want a good math workout on that topic: https://stats.stackexchange.com/q/330696
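One common way to read "dynamic regression" is a regression whose coefficient follows a random walk and is tracked by a Kalman filter. The sketch below assumes that reading, which may or may not match the derivation in the linked answer. It uses a single input, and the function name and the parameters `q` (coefficient drift variance), `r` (observation noise variance), `beta0`, and `p0` are made up for illustration.

```python
import random

def kalman_dynamic_regression(xs, ys, q=1e-4, r=1.0, beta0=0.0, p0=10.0):
    """Track beta(t) in y(t) = beta(t) * x(t) + noise,
    assuming beta(t) = beta(t-1) + drift (a random walk)."""
    beta, p = beta0, p0
    path = []
    for x, y in zip(xs, ys):
        p = p + q                          # predict: uncertainty grows by the drift variance
        k = p * x / (x * x * p + r)        # Kalman gain for the scalar observation y = x * beta
        beta = beta + k * (y - x * beta)   # correct beta using the prediction error
        p = (1.0 - k * x) * p              # shrink uncertainty after the update
        path.append(beta)
    return path

# Simulated data (illustrative values): the true coefficient drifts from 1 to 3.
rng = random.Random(2)
n = 2000
true_beta = [1.0 + 2.0 * t / (n - 1) for t in range(n)]
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [b * x + rng.gauss(0, 0.5) for b, x in zip(true_beta, xs)]

path = kalman_dynamic_regression(xs, ys, q=1e-4, r=0.25)
print("final estimate:", round(path[-1], 2), "true value:", true_beta[-1])
```

With `q = 0` the coefficient is treated as fixed, and the recursion becomes recursive least squares, which is one way to see the link to ordinary regression.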

(Another somewhat distinct meaning of the term "regression" is any model with a "continuous" outcome variable. This is usually in contrast to "classification", which is any model that has a "categorical" or discrete outcome variable.)

hackerlight (OP) · 2y ago

I have a time series forecasting methodology question that I'll drop here.

Suppose I have exogenous variables that vary over time, X(t). X has about 100 features. What are some methods I can apply to X(t) to automatically engineer features that may be useful for predicting some noisy y(t)?

I want to simultaneously capture interactions/interdependence between the columns of X, as well as the autocorrelation structure of X.

If I treat X as merely tabular data, throwing it into a traditional regression model (e.g. XGBoost), it can capture the interdependence structure in X, but it will neglect the autocorrelation structure... Unless I manually engineer features that capture the autocorrelation structure in X (e.g. rolling/shifted/differenced features), but I want to explore methods that do that automatically.
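For reference, the manual baseline mentioned above (shifted, differenced, and rolling features) can be sketched for a single column as follows. The function name, the default lags, and the window size are arbitrary choices for illustration; in practice this would be repeated for each column of X.

```python
from statistics import mean

def lag_features(x, lags=(1, 2), window=3):
    """Build shifted, differenced, and rolling-mean features for one column.

    Rows that would need values from before the start of the series
    are dropped, so the output is shorter than the input.
    """
    start = max(max(lags), window - 1, 1)
    rows = []
    for t in range(start, len(x)):
        row = {"x": x[t]}
        for k in lags:
            row[f"lag_{k}"] = x[t - k]                                 # shifted feature
        row["diff_1"] = x[t] - x[t - 1]                                # first difference
        row[f"roll_mean_{window}"] = mean(x[t - window + 1 : t + 1])   # trailing window mean
        rows.append(row)
    return rows

rows = lag_features([1.0, 2.0, 4.0, 7.0, 11.0])
for row in rows:
    print(row)
```

Each row only uses values at or before time t, so the features are safe to feed into a tabular model such as XGBoost without leaking the future.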

nerdponx · 2y ago

It might not be that important to fully capture the autocorrelation structure within X.

Usually our models are doing something like "Y = f(X) + E" where E is some unknown random noise and f() is the relationship that we are trying to infer from the data. We usually take X as "given" or "known", so in that case we are looking at Y conditional on some specific value of X.

If we are just trying to make good predictions, then we don't necessarily care about the structure among the components of X unless that structure tells us something about how Y is affected by X.

Imagine the following "true" relationships in the data, where E and H are unmeasurable random noise:

  Y(t) = b0 + b1 * X(t) + b2 * X(t-1) + E(t)
  X(t) = c * X(t-1) + H(t)

Knowing b0, b1, and b2 is sufficient to predict "Y minus random noise". Knowing c doesn't help us at all.

If you're interested in obtaining good-quality estimates of b1 and b2, then you'll have a problem. That's because the direct effect of X(t-1) on Y is conflated with the indirect effect of X(t-1) on Y via X(t). But if you're just trying to make good predictions for Y, then you don't care as much about confidently distinguishing between b1 and b2.
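Here is a small simulation of the two equations above, as a sketch only. The values chosen for b0, b1, b2, and c are made up, both noise terms are assumed to be standard normal, and the tiny least-squares helpers are written out by hand to keep the example self-contained. Note that c is used to generate X but never appears in the fit.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c_ in range(col, n + 1):
                M[r][c_] -= f * M[col][c_]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)

rng = random.Random(0)
b0, b1, b2, c = 1.0, 2.0, -1.0, 0.8   # illustrative "true" values
n = 5000

x = [0.0]
for _ in range(n):
    x.append(c * x[-1] + rng.gauss(0, 1))            # X(t) = c * X(t-1) + H(t)
y = [b0 + b1 * x[t] + b2 * x[t - 1] + rng.gauss(0, 1) for t in range(1, n + 1)]

# Design matrix: intercept, X(t), X(t-1). The value of c is not an input here.
rows = [[1.0, x[t], x[t - 1]] for t in range(1, n + 1)]
est = ols(rows, y)

resid = [yi - sum(e * v for e, v in zip(est, r)) for r, yi in zip(rows, y)]
rmse = (sum(e * e for e in resid) / len(resid)) ** 0.5
print("estimates of b0, b1, b2:", [round(v, 3) for v in est])
print("residual RMSE:", round(rmse, 3))
```

The residual RMSE should land near the standard deviation of E, which is the best any predictor can do here. A larger c makes X(t) and X(t-1) more correlated, which makes the individual estimates of b1 and b2 noisier without changing that floor.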

disgruntledphd2 · 2y ago

If the variables in X(t) all have the same time steps, I'd probably look at the cross-correlation function of each X against y, then build another model on X to predict X(t+n) and use that prediction as an input for Y(t).
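A rough sketch of the cross-correlation step for one column of X, using only the standard library. The function name and the simulated data (y built as x delayed by two steps plus a little noise) are made up for illustration, and only lags where x leads y are checked, since those are the ones usable for prediction.

```python
import random
from statistics import mean, pstdev

def cross_correlation(x, y, max_lag):
    """Sample correlation between x(t - k) and y(t) for k = 0..max_lag."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    n = len(x)
    ccf = {}
    for k in range(max_lag + 1):
        cov = sum((x[t - k] - mx) * (y[t] - my) for t in range(k, n)) / n
        ccf[k] = cov / (sx * sy)
    return ccf

# Simulated example: y is x delayed by 2 steps, plus small noise.
rng = random.Random(1)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
y = [(x[t - 2] if t >= 2 else 0.0) + 0.1 * rng.gauss(0, 1) for t in range(n)]

ccf = cross_correlation(x, y, max_lag=5)
best_lag = max(ccf, key=lambda k: abs(ccf[k]))
print("lag with the strongest correlation:", best_lag)
```

Running this per column gives a cheap shortlist of which lags of which features are worth turning into inputs.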

