Analysis and forecasting of time series


For many years, people have predicted weather conditions, economic and political events, and sports results; recently, cryptocurrencies have joined this list. There are many ways to develop forecasts for such events: intuition, expert opinion, or comparing past results with traditional statistics. Time series forecasting is just one of them, but it is among the most modern and accurate, with a wide range of applications.

Time series method


A time series (TS) is a set of data points collected over a period of time. There are special methods for analyzing this type of data:

  • linear and non-linear;
  • parametric and non-parametric;
  • one-dimensional and multidimensional.

Time series forecasting brings a unique set of capabilities to today's challenges. Modeling relies on learning the driving forces behind changes in the data. These forces come from long-term trends, seasonal effects, or irregular fluctuations that are characteristic of TS and are not seen in other types of analysis.

Machine learning is a branch of computer science in which algorithms learn from data; it includes artificial neural networks, deep learning, association rules, decision trees, reinforcement learning, and Bayesian networks. The variety of algorithms provides options for solving problems, and each has its own requirements and trade-offs in terms of data input, speed, and accuracy of results. These trade-offs, along with the accuracy of the final predictions, are weighed when the user decides which algorithm will work best for the situation under study.

Time series forecasting borrows from the field of statistics but adds new approaches to problem modeling. The main problem for machine learning and time series is the same: to predict new outcomes based on previously known data.

Purpose of the predictive model

A TS is a set of data points collected at regular intervals. They are analyzed to determine long-term trends, to predict the future, or to perform some other type of analysis. Two things make a TS different from an ordinary regression problem:

  1. They depend on time, so the basic assumption of a linear regression model that the observations are independent does not hold.
  2. Along with an increasing or decreasing trend, most TSs have some form of seasonality, i.e. changes that are specific to a certain period of time.

The goal of a time series forecasting model is to give an accurate forecast on demand. A time series has time (t) as the independent variable and a target dependent variable. In most cases, the forecast is a specific result, for example, the sale price of a house, the outcome of a sporting competition, or the results of trading on a stock exchange. The forecast may be reported as a median or mean and usually includes a confidence interval, typically at the 80-95% level. When observations are recorded at regular intervals, the processes are called time series and are expressed in two ways:

  • one-dimensional with a time index that creates an implicit order;
  • a set with two dimensions: time with an independent variable and another dependent variable.

Creating features is one of the most important and time-consuming tasks in applied machine learning. However, time series forecasting does not create features, at least not in the traditional sense. This is especially true when you want to predict the result several steps ahead, and not just the next value.

This does not mean that features should be abandoned entirely. They should just be used with caution, for the following reasons:

  1. It is unclear what the real future values of these features will be.
  2. If the features themselves are predictable and show some patterns, a separate predictive model can be built for each of them.

However, be aware that using predicted values as features propagates their error into the target variable and leads to inaccurate or biased forecasts.
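To make this concrete, lagged values of the target can be added as predictor columns. Below is a minimal sketch with pandas on an invented series (all names and dates are illustrative):

```python
import pandas as pd

# Invented daily series for illustration.
ts = pd.Series(range(10), index=pd.date_range("2017-01-01", periods=10, freq="D"))

df = pd.DataFrame({"y": ts})
# Each lagged copy of the target becomes a predictor column.
for lag in (1, 2, 3):
    df[f"lag_{lag}"] = df["y"].shift(lag)
df = df.dropna()  # the first rows have no complete lag history
```

When forecasting several steps ahead, these lag columns would themselves need to be forecast, which is exactly how errors propagate into the target.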

Time series components


A trend exists when the series increases, decreases, or remains at a constant level over time; it is usually modeled as a function of time. Seasonality refers to a property of a time series that displays periodic patterns repeating at a constant frequency (m); for example, m=12 means the pattern repeats every twelve months.
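These two components can be illustrated on synthetic data. The sketch below (NumPy only; the series is invented) fits a straight line for the trend and averages each calendar month of the detrended values to estimate the seasonal pattern with m = 12:

```python
import numpy as np

m = 12                                    # seasonal period: repeats every 12 months
t = np.arange(48)                         # four years of monthly observations
trend = 0.5 * t                           # linear upward trend
season = 10 * np.cos(2 * np.pi * t / m)   # repeating yearly pattern, peak in month 0
y = trend + season

# Crude decomposition: estimate the trend with a least-squares line,
# then average the detrended values for each calendar month.
coef = np.polyfit(t, y, 1)                # coef[0] is the estimated slope
detrended = y - np.polyval(coef, t)
seasonal_means = np.array([detrended[t % m == k].mean() for k in range(m)])
```

The recovered slope is close to the true 0.5 and the twelve seasonal means trace the yearly pattern, which is the behavior a forecasting model needs to capture.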

Dummy variables can be added as binary features, similar to seasonality. They can account for, for example, holidays, special events, or marketing campaigns, encoding whether a given day is special or not. Keep in mind, however, that these variables must follow certain patterns. The number of days in a period, for example, can easily be calculated even for future periods and can influence time series forecasting, especially in the financial area.

Cycles are patterns that do not occur at a fixed frequency. For example, the annual population dynamics of the Canada lynx show both seasonal and cyclical patterns. Cycles do not repeat at regular intervals and may occur even when the seasonal frequency is 1 (m=1).

Lagged values of a variable can be included as predictors. Some models, such as ARIMA, vector autoregression (VAR), or autoregressive neural networks (NNAR), work this way.

The components of the variable of interest are very important for time series analysis and forecasting: understanding their behavior and patterns makes it possible to select an appropriate model.

Data set attributes


The programmer may be used to feeding thousands, millions, or billions of data points into machine learning models, but this is not required for time series. In fact, it is possible to work with small and medium-sized TS, depending on the frequency and type of the variable, and this is not a disadvantage of the method. This approach actually has a number of advantages:

  1. Such sets of information fit the capabilities of a home computer.
  2. In some cases, time series analysis and forecasting can be performed on the entire data set, not just a sample.
  3. A manageable TS length is useful for creating graphs that can be analyzed. This is a very important point, because programmers rely on graphs in the analysis phase. This does not mean they cannot work with huge time series, but initially they should be able to handle smaller TS.
  4. Any dataset that contains a time-related field can benefit from time series analysis and forecasting. However, for larger sets of data, a time series database (TSDB) may be more appropriate.

Some of these datasets come from events recorded with timestamps, system logs, and financial data. Since a TSDB works natively with time series, this is a great opportunity to apply the technique to large-scale datasets.

Machine learning

Machine learning (ML) can outperform traditional time series forecasting methods. There are many studies comparing machine learning methods with more classical statistical methods on TS data. Neural networks are one widely researched technology that applies TS approaches. In forecasting competitions such as M3 and on Kaggle, ML methods have proven effective, in some cases outperforming pure TS methods.

ML has its own specific problems. Feature engineering, that is, generating new predictors from a dataset, is an important step for it; it can have a huge impact on performance and may be necessary to address the trend and seasonality in TS data. Also, some models fit the data poorly and, as a result, may miss the main trend.

Time series and machine learning approaches should not exist in isolation from each other. They can be combined to give the benefits of each. Time series analysis and forecasting methods are good at decomposing data into trend and seasonal elements. That analysis can then be used as input to an ML model, which thus receives trend and seasonality information in its algorithm, giving the best of both worlds.

Understanding the problem statement

As an example, consider a TS problem: predicting the number of passengers on a new high-speed rail service. Suppose you have two years of data at the hourly level (August 2016 - September 2018) with the number of passengers traveling, and you need to forecast the number of passengers for the next 7 months.

Preparing the dataset for time series forecasting:

  1. Create training and test sets for modeling.
  2. Use the first 14 months (August 2016 - October 2017) as training data and the next 2 months (November 2017 - December 2017) as test data.
  3. Aggregate the dataset on a daily basis.

Data set aggregation
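A sketch of the aggregation and split steps with pandas, using randomly generated hourly counts in place of the real passenger data (dates and numbers are illustrative):

```python
import numpy as np
import pandas as pd

# Stand-in hourly passenger counts; the real data spans Aug 2016 - Sep 2018.
idx = pd.date_range("2016-08-25", periods=24 * 500, freq=pd.Timedelta(hours=1))
hourly = pd.Series(np.random.default_rng(0).poisson(20, len(idx)), index=idx)

# Aggregate to daily totals, then split into train and test by date.
daily = hourly.resample("D").sum()
train = daily.loc[:"2017-10-31"]
test = daily.loc["2017-11-01":"2017-12-31"]
```

Splitting by date rather than at random preserves the temporal order, which matters because the observations are not independent.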

Visualize the data to see how it changes over time.

Data visualization

Naive approach

The library used here for TS forecasting is statsmodels. It must be installed before any of these approaches can be applied. statsmodels may already be installed in your Python environment, but if that version does not support the forecasting methods used here, you will need to clone the repository and install it from source.


For this example, this means that the values are stable from the very beginning and throughout the entire period of time. This method assumes that the next expected point is equal to the last observed point and is called the naive approach.

Naive Method

Now calculate the root mean square error (RMSE) to test the accuracy of the model on the test dataset. From the RMSE value and the graph above, we can conclude that the naive approach is not suitable for highly volatile series, but works for stable ones.
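A minimal sketch of the naive approach and its RMSE check, with made-up numbers in place of the real series:

```python
import numpy as np

train = np.array([112.0, 118.0, 132.0, 129.0, 121.0])
test = np.array([135.0, 148.0, 148.0])

# Naive approach: every future point equals the last observed point.
forecast = np.full(len(test), train[-1])

# Root mean square error on the test set.
rmse = np.sqrt(np.mean((test - forecast) ** 2))
```

The flat forecast line makes it clear why this method suits stable series: any sustained rise or fall in the test period shows up directly as error.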

Simple average method

To demonstrate the method, a chart is drawn, assuming that the Y-axis represents the price and the X-axis represents time (days).

Simple average method

From it we can conclude that the price rises and falls randomly by a small margin, so that the average value remains constant. In this case, the price for the next period can be predicted as being similar to the average over all past days.

This method of forecasting with the expected average of previously observed points is called the simple average method.

In this case, previously known values are taken, their average is calculated, and that average is used as the next value. Of course, this will not be exact, but it is fairly close, and there are situations where this method works best.
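The same idea in code, with invented prices:

```python
import numpy as np

past = np.array([10.0, 12.0, 9.0, 11.0, 13.0])  # previously observed values

# Simple average method: every forecast equals the mean of all past observations.
forecast = np.full(3, past.mean())  # forecast for the next 3 periods
```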

Simple average method

Based on the results displayed on the graph, this method works best when the average value over each time period remains constant. The naive method may beat the average method, but not for every dataset. It is recommended to try each model step by step and see whether it improves the result.

Moving Average Model


Based on this chart, we can conclude that prices increased several times in the past by a wide margin but are now stable. To use the previous averaging method, we would take the average of all previous data, but then prices from the initial period would strongly influence the forecast for the next period. So, as an improvement over the simple average, take the average of prices over only the last few time periods.

This forecasting technique is called the moving average technique, sometimes described as using a "moving window" of size "n". Using this simple model, the next value in the TS is predicted to check the accuracy of the method. For this dataset, the naive method clearly outperforms both the average and moving average methods.

There is also a variant called weighted moving average. In the moving average method, the past "n" observations are weighted equally, but you may encounter situations where each of the past "n" observations affects the forecast in its own way. This variation, which weights past observations differently, is called the weighted moving average method.
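Both variants in a short sketch (the window size and weights are arbitrary choices for illustration):

```python
import numpy as np

series = np.array([3.0, 5.0, 4.0, 6.0, 8.0, 7.0])
n = 3  # size of the moving window

# Moving average: the forecast is the plain mean of the last n observations.
ma_forecast = series[-n:].mean()

# Weighted moving average: recent observations influence the forecast more.
weights = np.array([0.2, 0.3, 0.5])  # must sum to 1; heaviest on the latest value
wma_forecast = np.dot(weights, series[-n:])
```

With equal weights the two forecasts coincide; the weighted version lets the model react faster to recent changes.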

Extrapolation of patterns

One of the most important properties a time series forecasting algorithm needs is the ability to extrapolate patterns outside the domain of the training data. Many ML algorithms do not have this capability, as they tend to be limited to the region defined by the training data. Therefore, they are not well suited for TS, whose purpose is to project results into the future.

Another important property of the TS algorithm is the possibility of obtaining confidence intervals. While this is the default property for TS models, most ML models do not have this capability as they are not all based on statistical distributions.

Don't think that only simple statistical methods are used to predict TS. It's not like that at all. There are many complex approaches that can be very useful in special cases. Generalized Autoregressive Conditional Heteroscedasticity (GARCH), Bayesian and VAR are just some of them.

There are also neural network models that can be applied to time series, using lagged predictors and handling features, such as neural network autoregression (NNAR). There are even time series models borrowed from deep learning, particularly from the recurrent neural network family, such as LSTM and GRU networks.

Estimation Metrics and Residual Diagnostics

The most common forecasting metrics are RMSE, which many people use when solving regression problems, as well as:

  • MAPE, because it is scale-independent and represents the ratio of the error to the actual values as a percentage;
  • MASE, which shows how well the forecast performs compared to a naive average forecast.
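These metrics are straightforward to compute by hand. The sketch below uses invented actual and predicted values; note that MASE is scaled here by the in-sample one-step naive error, one common convention:

```python
import numpy as np

actual = np.array([100.0, 110.0, 120.0, 130.0])
predicted = np.array([102.0, 108.0, 123.0, 127.0])

rmse = np.sqrt(np.mean((actual - predicted) ** 2))
mape = np.mean(np.abs((actual - predicted) / actual)) * 100  # in percent

# MASE: mean absolute error scaled by the naive one-step forecast error.
naive_errors = np.abs(np.diff(actual))  # |y_t - y_{t-1}| of the actuals
mase = np.mean(np.abs(actual - predicted)) / naive_errors.mean()
```

A MASE below 1 means the forecast beats the naive baseline on average; here it is 0.25.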

Once a forecasting method has been fitted, it is important to evaluate how well it captures the patterns. While evaluation metrics help determine how close the predicted values are to the actual values, they do not evaluate whether the model fits the TS. The residuals are a good way to evaluate this. Since the programmer is trying to capture the TS patterns, the errors should behave like "white noise", since they represent what the model cannot capture.

"White noise" must have the following properties:

  1. The residuals are uncorrelated (ACF = 0).
  2. The residuals follow a normal distribution with zero mean (unbiased) and constant variance.

If either property is missing, there is room for improvement in the model. The zero-mean property can easily be tested with a t-test, while normality and constant variance can be checked visually with a histogram of the residuals or an appropriate univariate normality test.
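Two of these checks can be sketched directly. The residuals below are synthetic; in practice you would use the residuals of your fitted model:

```python
import numpy as np

rng = np.random.default_rng(42)
residuals = rng.normal(0.0, 1.0, 500)  # stand-in for model residuals

# Zero-mean check: one-sample t statistic against a mean of 0.
t_stat = residuals.mean() / (residuals.std(ddof=1) / np.sqrt(len(residuals)))

# Lag-1 autocorrelation: white noise should give a value near zero.
centered = residuals - residuals.mean()
acf1 = np.dot(centered[:-1], centered[1:]) / np.dot(centered, centered)
```

If |t_stat| is large or acf1 is clearly nonzero, the residuals are not white noise and the model can still be improved.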


ARIMA, the AutoRegressive Integrated Moving Average model, is one of the most popular methods used in TS forecasting; it relies mainly on the autocorrelation in the data to create high-quality models.

When estimating ARIMA coefficients, the main assumption is that the data is stationary, meaning the trend and seasonality have been removed so that the mean and variance are constant over time. The quality of the model can be assessed by comparing the time plot of the actual values with the predicted values. If both curves are close, the model can be assumed to fit the analyzed case; it should capture any trends and seasonality, if present.

Analysis of the residuals should then show whether the model fits: random residuals mean it is accurate. Fitting ARIMA with parameters (0, 1, 1) gives the same results as simple exponential smoothing, and using parameters (0, 2, 2) gives the same results as double exponential smoothing.

ARIMA in Excel

You can access ARIMA settings in Excel:

  1. Start Excel.
  2. Find XLMiner on the toolbar.
  3. On the ribbon, select ARIMA from the drop-down menu.

Summary of ARIMA Model Capabilities:

  1. ARIMA - Autoregressive Integrated Moving Average.
  2. Forecasting model used in time series analysis.
  3. ARIMA parameter syntax: ARIMA(p, d, q), where p = number of autoregressive terms, d = number of differences, and q = number of moving average terms.

Algorithms in SQL Server

Performing cross-prediction is one of the important features of time series forecasting in financial tasks. If two related series are used, the resulting model can predict the outcomes of one series based on the behavior of the other.

SQL Server 2008 has powerful new time series features to learn and use. The tool has easily accessible TS data, an easy-to-use interface for simulating and reproducing algorithm functions, and an explanation window with a link to server-side DMX queries so you can understand what's going on inside.

Market time series are a broad area to which deep learning models and algorithms can be applied. Banks, brokers, and funds are now experimenting with deploying analysis and forecasting for indices, exchange rates, futures, cryptocurrency prices, government securities, and more.

In time series forecasting, the neural network finds predictable patterns by studying the structures and trends of the markets and gives advice to traders. These networks can also help detect anomalies such as unexpected peaks, falls, trend changes and level shifts. Many artificial intelligence models are used for financial forecasts.
