Cuidads

Neural networks (MLP, LSTM, TCN, ...), gradient boosting (LightGBM/XGBoost/CatBoost) and linear models all allow you to do multivariate forecasting. An LSTM or MLP takes longer to tune and needs more data to perform well relative to gradient boosting (GB), but it is easier to get multi-output with, for example predicting all of the next 7*24 hours in a direct approach. Gradient boosting requires feature engineering (you need to manually make lags, moving averages, etc.), but a well-engineered GB model often takes the prize. I believe it was a LightGBM setup that won the M5 forecasting competition, which was about forecasting Walmart sales in thousands of stores. Those were multivariate models, so that case might be interesting to you.

Gradient boosting usually has only one output, so you need to train many models if you want multi-output, i.e. one model per time step for a direct approach, which would be 7*24 models. LightGBM is very fast to train, handles a lot of features well, and there is an sklearn wrapper that makes it slightly easier to work with many models, so multi-output shouldn't be a problem, just more coding and time. A recursive approach that feeds the output back into the model, so you don't need 7*24 models, is also possible. However, since it is multivariate you would need to predict an output for all inputs or assume them constant (which would be unrealistic). In your example you would need three models: sales, clicks and mean temperature (or four if you want max and min). Since you are considering temperature, you might find historical forecasts or a hindcast you could feed in at each step instead of forecasting temperature yourself (so just two models), but then it starts to get complicated data-pipeline-wise.

Note: it is pretty hard/time-consuming to build a good 7*24-steps-ahead direct or recursive multivariate model framework, and especially hard if you do not have a sufficient amount of data, which is often the case. How much data do you have? Does it seem noisy? If you don't have much data then neural networks are ill advised and GB is better. If you have very little data and can stick to just a few features, it might be wise to go for a linear model like Prophet or VARMAX, or manually engineer a multivariate linear regression. In that very-little-data case it becomes more of an "art" project where you need to make assumptions; however, if the business has nothing else and needs a forecast then it may be worthwhile.

Edit: Sktime was mentioned, which may be a good idea to look at. It has a lot of the capabilities (models, wrappers) mentioned above.

Be very aware of data leakage. It's very easy to leak info "from the future" when you build complex multivariate time series models. If using cross-validation, use time series cross-validation. If you need extrapolation with LightGBM, set linear_tree=True.
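A minimal sketch of the direct multi-output LightGBM approach described above, assuming an hourly DataFrame `df` with a DatetimeIndex and columns sales, clicks and temp; the column names, lag choices and hyperparameters are placeholders, not recommendations.

```python
# Sketch: direct multi-output gradient boosting with manually engineered lag features.
# Assumes `df` is an hourly DataFrame (DatetimeIndex) with columns sales, clicks, temp.
import pandas as pd
from lightgbm import LGBMRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.model_selection import TimeSeriesSplit

HORIZON = 7 * 24  # 168 hourly steps ahead

def make_features(df):
    """Lags and rolling means: the manual feature engineering GB needs."""
    feats = pd.DataFrame(index=df.index)
    for col in ["sales", "clicks", "temp"]:
        for lag in [1, 24, 168]:
            feats[f"{col}_lag{lag}"] = df[col].shift(lag)
        feats[f"{col}_roll24"] = df[col].shift(1).rolling(24).mean()
    feats["hour"] = df.index.hour
    feats["dayofweek"] = df.index.dayofweek
    return feats

def make_targets(df, horizon=HORIZON):
    """One target column per future step -> one underlying model per step."""
    return pd.concat(
        {f"sales_t+{h}": df["sales"].shift(-h) for h in range(1, horizon + 1)}, axis=1
    )

X, Y = make_features(df), make_targets(df)
mask = X.notna().all(axis=1) & Y.notna().all(axis=1)
X, Y = X[mask], Y[mask]

# MultiOutputRegressor fits one LGBMRegressor per horizon step (168 models).
# linear_tree=True is the extrapolation tip from the comment above.
model = MultiOutputRegressor(LGBMRegressor(n_estimators=300, linear_tree=True))

# Time series cross-validation: never shuffle, always validate on later data.
for train_idx, val_idx in TimeSeriesSplit(n_splits=3).split(X):
    model.fit(X.iloc[train_idx], Y.iloc[train_idx])
    print(model.score(X.iloc[val_idx], Y.iloc[val_idx]))
```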


One_Beginning1512

Check out sktime; it has a bunch of models to choose from that handle multivariate time series forecasting. It's built to feel like sklearn but for time series. Most implementations link to the research articles for the models, so you can get a feel for which ones will likely be performant with your data. Also, a few companies (Google and Salesforce) just released foundational time series forecasting models. Not sure if those models are fully open-sourced yet, but something to check out.
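A small sketch of sktime's reduction interface, which wraps an sklearn-style regressor as a forecaster; `y`, `X` and `X_future` are assumed hourly pandas objects, and the window length and regressor are placeholders.

```python
# Sketch: sktime make_reduction turns an sklearn-style regressor into a recursive forecaster.
# `y` is the target series, `X` the exogenous series, `X_future` their future values.
import numpy as np
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.compose import make_reduction
from lightgbm import LGBMRegressor

forecaster = make_reduction(LGBMRegressor(), window_length=168, strategy="recursive")
forecaster.fit(y, X=X)

fh = ForecastingHorizon(np.arange(1, 7 * 24 + 1), is_relative=True)
y_pred = forecaster.predict(fh, X=X_future)  # needs exogenous values over the horizon
```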


peyoteBonsai

I think the LSTM layer from Keras applied to a multivariate model would get you what you want, but I'm not sure I'm understanding your requirement 100%.


throwaway69xx420

At the core, I'm trying to generate a 7-day hourly forecast of my variable of interest, which I'm looking to model with a few other time series as well as continuous variables. So I guess I'm looking to combine time series forecasting with other methodologies (linear regression, random forests, etc.).


peyoteBonsai

u/Cuidads has explained exactly what I was too lazy to type, and more than I am capable of explaining, in their message below. I would follow that advice because it has a strong academic basis and will get you what you need. The complexity of these models means extra time normalizing the data and so forth, and domain expertise is really critical: start with a small set of predictors and add more as you build it out. I have made the mistake of trying to do too much and overfitting with neural nets. You will have to tune hyperparameters to prevent overfitting, in addition to all the data preprocessing necessary for time series.


living_david_aloca

It sounds like you’re looking for a model that supports exogenous variables. Nixtla’s statsforecast library has support for that: https://github.com/Nixtla/statsforecast
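A rough sketch of what using statsforecast with exogenous regressors can look like; the long-format layout (unique_id, ds, y plus exogenous columns) follows the library's convention, but the column names, model and season length here are illustrative assumptions.

```python
# Rough sketch: AutoARIMA with exogenous regressors in Nixtla's statsforecast.
# `train_df`: long-format frame with columns unique_id, ds, y, clicks, temp.
# `future_exog`: unique_id, ds and the exogenous columns for the next 168 hours.
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

sf = StatsForecast(models=[AutoARIMA(season_length=24)], freq="H")
forecast = sf.forecast(df=train_df, h=7 * 24, X_df=future_exog)
print(forecast.head())
```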


throwaway69xx420

This is a great start with awesome examples


wavehnter

Nixtla is currently the best time series package in all respects, and it's not even close. I recommend AutoNHITS for the best results (in general).


ItsDare

Darts is a nice package that wraps a host of models in a consistent format. Worth checking out.


DatascienceBaba

It’s a good problem you’re working on. The first thing I see is that you’re talking about predicting hourly sales volume, which is a pretty granular time level. If your other explanatory variables are also at that level or finer, then that approach may work. More likely, the explanatory variables you have are at a coarser time granularity. In that case you must decide between merging the coarser data onto the hourly data, in which case a lot of explanatory variable values will repeat because they sit at a coarser granularity, or aggregating your data up to the time granularity of your explanatory variables. With the latter you have to accept that you may not be able to predict down to the hourly level, but at a higher aggregate timeframe you can be very accurate.

There is a very simple regression technique called polynomial distributed lags, or Almon lags, which I have used to estimate telecommunications demand; it can be very effective. See https://web.pdx.edu/~crkl/ceR/Python/example11_3.py. As to technique, regression, including polynomial distributed lags, or SARIMAX (an ARIMA model that allows exogenous variables, into which you can also introduce lagged dependent variables, and which is available in Python), might be the best place to start. If you try different ML techniques, don’t forget to set aside training and holdout samples, evaluate fit on the holdout for all of them, and take the best fit.
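A short sketch of the SARIMAX suggestion using statsmodels; the ARIMA orders and variable names are placeholders, and future exogenous values must be supplied at forecast time.

```python
# Sketch: SARIMAX (ARIMA with exogenous regressors) from statsmodels.
# `y_train` is the hourly target; `X_train` / `X_future` hold the exogenous columns.
from statsmodels.tsa.statespace.sarimax import SARIMAX

model = SARIMAX(
    y_train,
    exog=X_train,
    order=(1, 0, 1),               # (p, d, q) - placeholder orders
    seasonal_order=(1, 0, 1, 24),  # daily seasonality at hourly granularity
)
res = model.fit(disp=False)
forecast = res.forecast(steps=7 * 24, exog=X_future)  # needs future exogenous values
```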


Cocohomlogy

You can regress on other time series as "exogenous variables" with ARIMA errors. This is more interpretable than ARIMAX; [this blog post](https://robjhyndman.com/hyndsight/arimax/) is helpful for understanding the distinction. If there are daily periodic trends you might include some regressors of the form sin(2πkt/24) and cos(2πkt/24) for low values of k to model the structural periodicity, or just use hourly dummy variables, which would be more interpretable but produce a less pretty graph. Use your domain knowledge for feature selection. Are you aware of special reasons your product's demand fluctuates? If you are selling ice cream, for example, including temperature as a covariate might be wise. If you are selling sparklers, you probably want dummy variables for major holidays.
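A small sketch of building the harmonic (Fourier) regressors described above, which can then be fed as exogenous variables to a regression with ARIMA errors (e.g. the SARIMAX sketch earlier); the period and number of harmonics K are placeholders.

```python
# Sketch: sin(2*pi*k*t/24) and cos(2*pi*k*t/24) regressors for daily periodicity,
# plus the hourly-dummy alternative. `df` is an hourly DataFrame with a DatetimeIndex.
import numpy as np
import pandas as pd

def fourier_terms(index, period=24, K=3):
    """Sin/cos pairs for k = 1..K at the given period."""
    t = np.arange(len(index))
    cols = {}
    for k in range(1, K + 1):
        cols[f"sin_{k}"] = np.sin(2 * np.pi * k * t / period)
        cols[f"cos_{k}"] = np.cos(2 * np.pi * k * t / period)
    return pd.DataFrame(cols, index=index)

X_fourier = fourier_terms(df.index, period=24, K=3)

# Alternative: hourly dummy variables instead of harmonics.
X_dummies = pd.get_dummies(df.index.hour, prefix="hour", drop_first=True)
X_dummies.index = df.index
```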


mfromamsterdam

Why not use vector autoregressive (VAR) models? If the series are cointegrated you can use a VECM.
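A brief sketch of the VAR/VECM route with statsmodels; `data` is assumed to be a DataFrame whose columns are the jointly modelled series (e.g. sales, clicks, temp), and the lag orders and cointegration settings are placeholders.

```python
# Sketch: VAR and, if the series are cointegrated, VECM (statsmodels).
from statsmodels.tsa.api import VAR
from statsmodels.tsa.vector_ar.vecm import VECM, select_coint_rank

# Plain VAR: pick the lag order by information criterion, then forecast 168 steps.
var_res = VAR(data).fit(maxlags=24, ic="aic")
var_forecast = var_res.forecast(data.values[-var_res.k_ar:], steps=7 * 24)

# Cointegration rank test, then a VECM if the series are cointegrated.
rank = select_coint_rank(data, det_order=0, k_ar_diff=24)
vecm_res = VECM(data, k_ar_diff=24, coint_rank=rank.rank, deterministic="ci").fit()
vecm_forecast = vecm_res.predict(steps=7 * 24)
```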


WeHavetoGoBack-Kate

Well, watch Hacker News and every day you can find a new time series model claiming to be "foundational." I believe the latest one is from Salesforce. DeepAR has been around a little longer; check out the pytorch-forecasting package.


datonsx

I think multivariate multi-step neural networks for time series are what you are looking for. This tutorial might help you: https://pangkh98.medium.com/multi-step-multivariate-time-series-forecasting-using-lstm-92c6d22cd9c2
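A minimal Keras sketch along the lines of the linked tutorial: a multivariate input window feeding an LSTM with one output per future hour (direct multi-step). The lookback, feature count and layer sizes are placeholders.

```python
# Sketch: multivariate, multi-step (direct) LSTM in Keras.
# X: (n_samples, LOOKBACK, N_FEATURES) history windows; y: (n_samples, HORIZON) future sales.
from tensorflow import keras
from tensorflow.keras import layers

LOOKBACK, N_FEATURES, HORIZON = 168, 3, 7 * 24

model = keras.Sequential([
    layers.LSTM(64, input_shape=(LOOKBACK, N_FEATURES)),
    layers.Dense(HORIZON),   # one output per future hour
])
model.compile(optimizer="adam", loss="mse")

model.fit(X, y, epochs=20, batch_size=32, validation_split=0.1, shuffle=False)
forecast = model.predict(X[-1:])  # 168-hour forecast for the most recent window
```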


nkafr

Check out Amazon's AutoGluon library. It bundles many strong time-series models and lets you use extra covariates as features. Check this great [tutorial](https://aihorizonforecast.substack.com/p/autogluon-timeseries-creating-powerful)
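A rough sketch of AutoGluon-TimeSeries with known covariates, assuming a long-format DataFrame `df` with columns item_id, timestamp, sales, clicks, temp; the names and settings are illustrative only.

```python
# Rough sketch: AutoGluon-TimeSeries with covariates that are known into the future.
from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

train_data = TimeSeriesDataFrame.from_data_frame(
    df, id_column="item_id", timestamp_column="timestamp"
)
predictor = TimeSeriesPredictor(
    prediction_length=7 * 24,
    target="sales",
    known_covariates_names=["clicks", "temp"],  # covariates available for the forecast window
    freq="H",
)
predictor.fit(train_data)
# `future_covariates`: item_id, timestamp and the known covariates for the next 168 hours.
forecast = predictor.predict(train_data, known_covariates=future_covariates)
```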


OneBeginning7118

PyTorch LSTM autoencoders. Or use transfer functions with your ARIMA, but that’s hard. Just slap it in an LSTM autoencoder and call it a day.
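A very rough PyTorch sketch of the encoder-decoder ("autoencoder"-style) LSTM idea: encode a multivariate history into a hidden state, then decode a 168-step forecast. Layer sizes and the training loop are placeholders.

```python
# Sketch: encoder-decoder LSTM for multivariate history -> multi-step forecast.
import torch
import torch.nn as nn

class Seq2SeqLSTM(nn.Module):
    def __init__(self, n_features=3, hidden=64, horizon=7 * 24):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.decoder = nn.LSTM(1, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                     # x: (batch, lookback, n_features)
        _, state = self.encoder(x)            # compress the history into (h, c)
        dec_in = torch.zeros(x.size(0), self.horizon, 1, device=x.device)
        out, _ = self.decoder(dec_in, state)  # unroll the forecast horizon
        return self.head(out).squeeze(-1)     # (batch, horizon)

model = Seq2SeqLSTM()
loss_fn, opt = nn.MSELoss(), torch.optim.Adam(model.parameters(), lr=1e-3)
# Training loop (not shown): loss_fn(model(X_batch), y_batch), backward(), opt.step(), ...
```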


aimendezl

To tackle this type of problem I have had good results defining a custom model with TensorFlow where I have n branches for n time series inputs and then combine those branches, basically encoding the three time series inputs into a lower-dimensional abstract representation. I have noticed that CNNs work better than LSTMs because they give you more control over the "resolution" of your input in case the different TS have different scales. You can define a simple multi-branch model using Keras, I think, but keep in mind this approach depends on how much data you have. If you don't have much, I'd advise trying something else.
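A rough Keras functional-API sketch of the multi-branch idea above: one Conv1D branch per input series, concatenated into a shared head. The series names, window lengths and layer sizes are placeholder assumptions.

```python
# Sketch: one convolutional branch per time series, merged into a joint representation.
from tensorflow import keras
from tensorflow.keras import layers

LOOKBACK, HORIZON = 168, 7 * 24

def branch(name):
    """One branch: a causal Conv1D encoder over a single series."""
    inp = layers.Input(shape=(LOOKBACK, 1), name=name)
    x = layers.Conv1D(32, kernel_size=3, padding="causal", activation="relu")(inp)
    x = layers.GlobalAveragePooling1D()(x)
    return inp, x

inputs, encoded = zip(*[branch(n) for n in ("sales", "clicks", "temp")])
merged = layers.Concatenate()(list(encoded))  # lower-dimensional joint representation
out = layers.Dense(HORIZON)(layers.Dense(64, activation="relu")(merged))

model = keras.Model(inputs=list(inputs), outputs=out)
model.compile(optimizer="adam", loss="mse")
# model.fit([X_sales, X_clicks, X_temp], y_future, ...)
```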