Choosing a Forecast Engine
What is time series forecasting?

Time series forecasting is the use of a mathematical model to predict future data points or events based on past data. Time series analysis is unique because there is a natural temporal ordering of the data points. A time series forecast engine implements a time series forecasting model. For example, a grocery store might use a time series forecast engine to analyze historical milk sales to predict how many gallons of milk will be sold by the store at a future date.

Should I use a forecast engine?

You may not need a forecast engine. If your data is easily predictable (for example, stationary data) or more accurate predictions would not bring a significant benefit, a forecast engine may not be justified. A simple average of the historical data points or an exponential smoothing algorithm may be good enough for the problem you are trying to solve. A cost-benefit analysis will show whether it makes sense for you to purchase a forecast engine to produce more accurate forecasts. We can help you with this analysis.
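
As a rough illustration of the two simple baselines mentioned above, the sketch below computes a plain average and a simple exponential smoothing forecast in Python. The function names, the smoothing weight alpha, and the sales figures are assumptions made for this example only.

```python
from typing import Sequence


def mean_forecast(history: Sequence[float]) -> float:
    """Forecast the next value as the plain average of all past values."""
    if not history:
        raise ValueError("history must not be empty")
    return sum(history) / len(history)


def exponential_smoothing_forecast(history: Sequence[float], alpha: float = 0.3) -> float:
    """Forecast the next value with simple exponential smoothing.

    alpha (0 < alpha <= 1) is an assumed smoothing weight: higher values
    react faster to recent observations.
    """
    if not history:
        raise ValueError("history must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # start the smoothed level at the first observation
    for value in history[1:]:
        level = alpha * value + (1.0 - alpha) * level
    return level


# Hypothetical daily milk sales in gallons (illustrative numbers only).
milk_sales = [120.0, 130.0, 125.0, 135.0, 140.0]
baseline_mean = mean_forecast(milk_sales)
baseline_smoothed = exponential_smoothing_forecast(milk_sales, alpha=0.5)
```

If baselines like these already meet your accuracy target, that is a useful input to the cost-benefit analysis.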

Are forecast engines all the same?

Forecast solutions vary widely and use different predictive techniques. There are linear and non-linear approaches. Most linear approaches are variations of the ARIMA (Autoregressive Integrated Moving Average) model. ARIMA models are more sensitive to missing or erroneous data points than other techniques. Typically, an early data cleansing stage conditions the data to help mitigate some of these issues. ARIMA solutions tend to slow down as the input data size increases or as correlation data sets are added. Many of these solutions require third-party libraries (e.g., Fortran libraries) and will not run on some hardware platforms.
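
To make the idea of a data cleansing stage concrete, here is a minimal Python sketch that fills missing points by linear interpolation. It is only one possible conditioning step, not a description of what any particular engine does; the function name fill_missing and the use of None to mark a missing point are assumptions.

```python
from typing import List, Optional, Sequence


def fill_missing(series: Sequence[Optional[float]]) -> List[float]:
    """Replace None entries by linear interpolation between known neighbors.

    Gaps at the start or end are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled: List[float] = []
    for i, value in enumerate(series):
        if value is not None:
            filled.append(float(value))
            continue
        before = [k for k in known if k < i]
        after = [k for k in known if k > i]
        if not before:
            filled.append(float(series[after[0]]))  # leading gap
        elif not after:
            filled.append(float(series[before[-1]]))  # trailing gap
        else:
            lo, hi = before[-1], after[0]
            weight = (i - lo) / (hi - lo)
            filled.append(series[lo] + weight * (series[hi] - series[lo]))
    return filled


# Hypothetical series with gaps (illustrative numbers only).
cleaned = fill_missing([None, 10.0, None, None, 16.0, None])
```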

Non-linear approaches work well for linear and non-linear data. Our non-linear forecasting solution is based on a highly customized Artificial Neural Network (ANN). Artificial Neural Network technology is a form of artificial intelligence inspired by the functioning of the human brain. An interconnected group of artificial neurons forms an adaptive system that provides an excellent non-linear statistical data modeling tool for regression analysis (data mining of patterns). Similar technologies are used in biometric systems such as fingerprint and face recognition. These types of systems have improved greatly over the last decade.

What criteria should I use to evaluate a forecast engine?

There are several criteria you should consider when evaluating a forecast engine:
  • Forecast Accuracy
  • Speed
  • Ease of Integration
  • Self-Tuning
  • Portability
  • Reliability
  • Scalability
  • Maintenance Costs
  • Automatic Holiday Support
  • Support for Independent Series
The importance of each of these criteria varies with your implementation and the problem you are trying to solve. We can help you do a cost-benefit analysis to determine how to weight these evaluation factors for your custom implementation.

Forecast Accuracy

The accuracy of a time-series forecast is usually measured by its WMAPE (Weighted Mean Absolute Percent Error). WMAPE expresses the forecast error as a percentage, so a lower value means a more accurate forecast, and it can be used to compare different forecast solutions.
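
The sketch below shows one way to compute WMAPE in Python. It assumes the common volume-weighted definition, 100 * sum(|actual - forecast|) / sum(|actual|); other weightings exist, so confirm which definition a vendor uses before comparing numbers. The function name and the sample figures are illustrative.

```python
from typing import Sequence


def wmape(actuals: Sequence[float], forecasts: Sequence[float]) -> float:
    """Return WMAPE as a percentage: 100 * sum|actual - forecast| / sum|actual|."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    total_actual = sum(abs(a) for a in actuals)
    if total_actual == 0:
        raise ValueError("WMAPE is undefined when all actuals are zero")
    total_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return 100.0 * total_error / total_actual


# Hypothetical actual and forecast sales (illustrative numbers only).
actual_sales = [100.0, 200.0, 300.0, 400.0]
forecast_sales = [110.0, 190.0, 330.0, 380.0]
error_pct = wmape(actual_sales, forecast_sales)  # 70 / 1000 -> 7.0 percent
```

Running the same function over the output of each candidate engine, on the same held-out data, gives a like-for-like comparison.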

When gauging the importance of forecast accuracy, try to determine the actual savings for every degree of additional accuracy. Many forecast engines can be tuned to produce better forecasts, and a few are self-tuning. Base your accuracy target on the actual cost savings to avoid a diminishing returns situation.

Speed

The speed of the forecast engine can be a critical factor depending on forecast frequency and the number of forecasts being produced. The hardware platform and system configuration (e.g., other programs running on the same machine) can affect the performance of the forecast engine.

It is best to test the forecast engine on your own data in an environment as similar to your production environment as possible.
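
A simple timing harness along the following lines can be used for such a test. This is a sketch only: placeholder_engine stands in for whatever call the engine under evaluation exposes, and the batch is synthetic data that you would replace with your own series.

```python
import time
from statistics import mean
from typing import Callable, List, Sequence


def time_forecasts(
    forecast_fn: Callable[[Sequence[float]], float],
    series_batch: Sequence[Sequence[float]],
    repeats: int = 3,
) -> float:
    """Return the best wall-clock time in seconds to forecast the whole batch."""
    timings: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        for series in series_batch:
            forecast_fn(series)
        timings.append(time.perf_counter() - start)
    return min(timings)


def placeholder_engine(series: Sequence[float]) -> float:
    """Stand-in for a real engine call; replace with the engine being evaluated."""
    return mean(series)


# Synthetic batch of 200 series with 50 points each (replace with real data).
batch = [[float(i + j) for j in range(50)] for i in range(200)]
best_seconds = time_forecasts(placeholder_engine, batch)
seconds_per_forecast = best_seconds / len(batch)
```

Taking the best of several repeats reduces the influence of other programs that happen to be running during the measurement.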

Ease of Integration

There is a cost involved in integration. You need to evaluate how well the forecast engine fits into your design and how much effort and cost will be involved in the integration process. The forecast engine should fit seamlessly into your application and complement its design, not overcomplicate it.

When you integrate the forecast engine with your software, does it complicate the design and make it more rigid? If it does, the overall design becomes more fragile, which makes it harder to adapt your application in the future. Software design tends toward complexity, so you want your starting point to be a clean and simple design.

Some companies tout how easy their solutions are to integrate into your environment. It’s best to take a “show me” attitude and see if the product integrates as well as advertised. Most forecast engines are designed to be separate entities that do not integrate transparently into your application.

Self-Tuning

A self-tuning or adaptive forecast engine that requires minimal configuration is ideal for most circumstances. This ability is important because it is often impractical to hand-tune and continually adjust forecasting parameters for every entity being forecast. It is better if the forecast engine is able to learn and adapt to each input data set automatically.
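
As a toy illustration of what self-tuning means, the Python sketch below picks an exponential smoothing weight separately for each series by minimizing the one-step-ahead error on that series' own history. It is not how any particular engine adapts; the function names, the candidate weights, and the sample series are assumptions.

```python
from typing import Sequence, Tuple


def one_step_error(history: Sequence[float], alpha: float) -> float:
    """Sum of absolute one-step-ahead errors of simple exponential smoothing."""
    level = history[0]
    total = 0.0
    for value in history[1:]:
        total += abs(value - level)  # level is the forecast made before seeing value
        level = alpha * value + (1.0 - alpha) * level
    return total


def tune_alpha(
    history: Sequence[float],
    candidates: Sequence[float] = (0.1, 0.3, 0.5, 0.7, 0.9),
) -> Tuple[float, float]:
    """Pick the candidate alpha with the lowest in-sample one-step error."""
    if len(history) < 2:
        raise ValueError("need at least two observations to tune")
    best = min(candidates, key=lambda a: one_step_error(history, a))
    return best, one_step_error(history, best)


# Two hypothetical series: one trending steadily, one noisy around a constant.
trending = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
noisy = [10.0, 14.0, 9.0, 13.0, 10.0, 14.0, 9.0, 13.0]
alpha_trending, _ = tune_alpha(trending)
alpha_noisy, _ = tune_alpha(noisy)
```

The trending series ends up with a high weight (follow recent values closely) and the noisy series with a low one (smooth heavily), with no manual configuration per series.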

Portability

The forecast engine should run on the same hardware platform as your application with as few external library dependencies as possible. Ask the vendor about the supported hardware platforms and how the forecaster is configured. For example, if it is installed as a separate service on another machine there may be a lot of communication and security issues you’ll need to consider in your design.

It is best if the forecast engine is self-contained. Complexity increases with the number of external libraries required to run your application. Problems often occur when different versions of common libraries are installed on the same machine. Libraries written for a specific platform can behave differently on other platforms and can introduce compatibility issues and bugs when porting between platforms.

Reliability

The best way to assess the reliability of a forecast engine is to stress test it in your own environment on your own data. If the software fails under load or produces bad forecasts, you need to evaluate what the cost of that failure would be if your application were running in production. Is your application mission critical to your company? What is the cost of an erroneous forecast to your company?

Scalability

Scalability is the ability to handle greater future demands. Many forecast solutions do not scale well due to older architectures or dependence on non-reentrant libraries. The architecture of the forecast engine you select for your application should allow multiple running instances and should run well on newer hardware that often has multiple CPUs.
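
The following Python sketch shows the general shape of running many forecasts concurrently. It assumes the engine call is re-entrant (safe to invoke from several workers at once); forecast_one is a hypothetical stand-in for that call, and the store histories are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Dict, Sequence


def forecast_one(series: Sequence[float]) -> float:
    """Stand-in for a call into a re-entrant forecast engine instance."""
    return mean(series[-3:])  # placeholder logic: average of the last three points


def forecast_all(series_by_id: Dict[str, Sequence[float]], workers: int = 4) -> Dict[str, float]:
    """Forecast every series concurrently and return results keyed by series id."""
    ids = list(series_by_id)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(forecast_one, (series_by_id[i] for i in ids)))
    return dict(zip(ids, results))


# Hypothetical per-store sales histories (illustrative numbers only).
histories = {
    "store_a": [10.0, 12.0, 11.0, 13.0],
    "store_b": [50.0, 55.0, 60.0, 65.0],
}
forecasts = forecast_all(histories)
```

A thread pool is used here only to keep the sketch short. For CPU-bound work in pure Python, separate processes or a native engine that releases the interpreter lock would be needed to benefit from multiple CPUs.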

Maintenance Costs

The ongoing cost of maintenance is usually a function of the complexity of the application. This hidden cost is often overlooked when evaluating a forecast engine, yet it endures for the life of the application and can be substantial. If the forecast engine greatly complicates your design, the long-term cost of maintaining and changing that software will increase. If it is difficult to modify and evolve your application (design rigidity), its useful life span can be shortened.

Another maintenance factor is how well the forecast engine adapts over time. If the forecast engine has to be manually tuned, there is a cost and overhead associated with that tuning. In some situations it might make sense for a company to hire an expert in regression analysis to constantly tune a forecast engine. These cases are rare and only make sense when the manually tuned forecast engine significantly outperforms the self-tuning forecast engine.

Automatic Holiday Support

Most forecast engines do not automatically handle national holidays. Holidays are considered calendar events that the application needs to provide to the forecast engine. This increases the amount of work in the application domain and makes integration more difficult. It can also cause errors and result in inaccurate forecasts.
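
When the engine does not handle holidays itself, the application typically has to build something like the calendar series sketched below in Python and pass it in alongside the history. The function name holiday_flags and the 0/1 encoding are assumptions; the format a given engine expects may differ.

```python
from datetime import date, timedelta
from typing import List, Set


def holiday_flags(start: date, days: int, holidays: Set[date]) -> List[int]:
    """Return a 0/1 series for consecutive days, 1 where the day is a holiday."""
    return [
        1 if start + timedelta(days=offset) in holidays else 0
        for offset in range(days)
    ]


# The application must maintain this calendar itself; two US holidays as examples.
us_holidays_2024 = {date(2024, 7, 4), date(2024, 12, 25)}
flags = holiday_flags(date(2024, 7, 1), 7, us_holidays_2024)
```

Every holiday missing from the hand-maintained calendar becomes an unexplained spike or dip in the history, which is one way these errors turn into inaccurate forecasts.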

Support for Independent Series

An independent series in this case is additional input data that you suspect might influence the forecast. For example, if you are forecasting future sales of a product based on its sales history, you would want to include an input series that marks the days when the product was discounted (i.e., on sale). The forecast engine should be able to recognize correlations between the input data series. When it does, the forecasts improve for both the promotion days and the non-promotion days. If the forecast engine determines that a new independent series has no correlation with the data being forecast, it should ignore that series.
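
One simple way to screen a candidate independent series yourself is a Pearson correlation against the history, sketched below in Python. This is a linear check under assumed sample data, and an engine may detect relationships quite differently; the function name and all figures are illustrative.

```python
from math import sqrt
from typing import Sequence


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Return the Pearson correlation of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no usable signal
    return cov / sqrt(var_x * var_y)


# Hypothetical daily sales and a 0/1 flag for days the product was on sale.
daily_sales = [100.0, 105.0, 160.0, 98.0, 170.0, 102.0]
on_promotion = [0, 0, 1, 0, 1, 0]
promotion_correlation = pearson_correlation(daily_sales, on_promotion)
```

A value near 1 or -1 suggests the series is worth supplying to the engine, while a value near 0 suggests it adds little, at least in linear terms.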

Can I take it for a test drive?

You wouldn’t buy a car without taking it for a test drive, and you shouldn’t buy a forecast engine without trying it out either. Test drive the forecast engine on your own data before you purchase it. We’ll help you integrate our forecast engine into your product so that you can see how it performs in your environment, on your data. Instead of getting a few sample forecasts that have been hand-tuned, you will be able to see how the forecast engine performs right out of the box on your own data.