Research Article
Forecasting of Chinese E-Commerce Sales: An Empirical
Comparison of ARIMA, Nonlinear Autoregressive Neural
Network, and a Combined ARIMA-NARNN Model
Received 2 February 2018; Revised 25 June 2018; Accepted 27 August 2018; Published 19 November 2018
Copyright © 2018 Maobin Li et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
With the rapid development of e-commerce (EC) and online shopping, accurate and efficient forecasting of e-commerce sales (ECS) is very important for formulating the purchasing and inventory strategies of EC enterprises. Affected by many factors, ECS volume varies greatly and has both linear and nonlinear characteristics. Three forecast models of ECS, the autoregressive integrated moving average (ARIMA), the nonlinear autoregressive neural network (NARNN), and a combined ARIMA-NARNN model, are used to verify the forecasting efficiency of the methods. Several time series of ECS from China's Jingdong Corporation are selected as experimental data. The results show that the ARIMA-NARNN model is more effective than the ARIMA and NARNN models in forecasting ECS. The analysis found that the ARIMA-NARNN model combines the linear fitting of ARIMA with the nonlinear mapping of NARNN, and therefore shows better prediction performance than either the ARIMA or the NARNN method alone.
research shows that the ARIMA-NARNN hybrid model is superior to the ARIMA and NARNN models for the ECS we selected. This research has certain reference value to EC enterprises in the forecasting of ECS, demand control of stock, and development of logistics strategy.

In fact, ARIMA, NARNN, and ARIMA-NARNN have been studied in many industries, such as agriculture and forestry [2], healthcare [3, 5], geography [4], manufacturing [6], and offline retail [7]. Some of these studies [2–5] only analyzed a single time series to reach conclusions, and some [6, 7] only conducted empirical analysis of the hybrid model and did not compare the ARIMA, NARNN, and ARIMA-NARNN models to prove the effectiveness of the hybrid model.

The innovativeness of this paper is to make a comparative study of the ARIMA, NARNN, and ARIMA-NARNN methods, combining the characteristics of the e-commerce industry. The empirical analysis of multiple time series of multiple e-commerce products is used to verify the universal superiority of the ARIMA-NARNN model we propose.

The remainder of this paper is organized as follows. Section 2 is the relevant literature review. ARIMA and NARNN are introduced, and the ARIMA-NARNN model for forecasting of ECS is proposed, in Section 3. Related description of the real case study is presented in Section 4. The conclusion and further research directions are in Section 5.

2. Literature Review

There are a large number of studies on the prediction of the sales volume of e-commerce products, which mainly include time series, causal regression, and machine learning methods. On time series forecasting, Dai W et al. [8] studied the prediction of clothing sales volume on Taobao based on a structural time series model and Taobao search data. In this research, they explored a real-time prediction method of clothing sales volume using the structural time series model and website search data. For regression prediction, Peng Geng [9] predicted the e-commerce transaction volume based on online keyword searches, word frequency, and other data as well as the classification of commodities. Jian-hong YU et al. [10] studied Amazon's sales forecast. Based on the historical data of Amazon, the authors used exponential smoothing, time series decomposition, and ARIMA models to predict sales. They found that the exponential smoothing forecast is the worst, while ARIMA is the best. As for machine learning prediction, Yu Miao [11] studied the extreme learning machine (ELM) neural network, taking into account seasons, categories, holidays, and other factors in the forecasting of medium-term sales of clothes. Weng Yingjing [12] used a back propagation (BP) neural network to predict online sales. Philip Doganis et al. [13] constructed a framework combining genetic algorithms with a radial basis function (RBF) neural network model to analyze and predict sales of perishable food, and its predictive performance and efficiency were demonstrated in the practice of several large companies in the Athens dairy industry. Franses PH [14] combined expert prediction and model prediction to accurately predict the stock keeping unit (SKU) level of drug products and then proved that the effect of the hybrid model is better than either model alone. Weng Y and Feng H [15] established a BP neural network using variables such as the number of users, the conversion rate, the unit price, and the number of collections, and verified the accuracy and validity of the model by using sales data from Alibaba.

The above methods do have effects on the prediction of ECS. The causal regression and machine learning methods require a large number of explanatory variables, such as the number of clicks and visits, the favorable rate, and even consumer information. However, there are many kinds of products involved in e-commerce sales, and the sales of different kinds of products are affected by different factors. Therefore, it is a waste of time and energy to establish a forecasting model by collecting a large number of explanatory variables, and it also does not meet the requirement of quick prediction, whereas a time series model only needs time as an explanatory variable, and if it can achieve an acceptable prediction accuracy, it will be very suitable for the prediction of ECS. Therefore, we selected the time series method to predict.

Previous studies show that ARIMA and NARNN approaches have better prediction performance in time series prediction [16–24]. For example, Cheng C and Qin P [16] used the ARIMA model to predict the time series data of settlement of seawalls and got higher accuracy than the gray prediction. Also, Bonetto R and Rossi M [24] used the NARNN model to predict power demand and got a good predictive effect. In time series prediction of ECS, currently, most scholars [16–25, 25–31] mainly use the methods of moving average, exponential smoothing, time regression, or ARIMA alone, while the ARIMA, NARNN, and ARIMA-NARNN combined models have not yet been used for comparative prediction and analysis in the e-commerce industry.

However, in the e-commerce industry, the types of products are very numerous; that is to say, there is more than one time series to be predicted. Moreover, the sales volume of e-commerce products fluctuates greatly and is easily affected by many factors such as price, promotion, and ranking. In addition, the prediction of e-commerce requires timeliness. Therefore, the mechanism and approaches for predicting e-commerce sales should be reformulated.

3. Methodology

3.1. ARIMA for ECS Forecasting

3.1.1. Basic Theory and Assumptions. ARIMA (p, d, q) stands for autoregressive integrated moving average [25]. ARIMA (p, d, q) is used in e-commerce sales forecasting to build the ECS-ARIMA forecasting model, where AR is the autoregressive part and p is the autoregressive order, MA is the moving average part and q is the moving average order, and d is the number of differences taken to make the time series of ECS stationary.

The basic assumptions of the ECS-ARIMA forecasting model are as follows:
Mathematical Problems in Engineering 3
Figure 1: Example of the NARNN with one input, one hidden layer with 17 neurons, and one output layer with one output neuron and one output.
(1) The time series of ECS follow the basic assumptions of the traditional ARIMA model.
(2) Abnormal data, including promotions, will be discarded or smoothed out.
(3) The time series of ECS can become a stationary sequence with a finite number of differences.
(4) Presales of EC are not considered in the ECS-ARIMA forecasting model.
(5) The e-commerce company taken as the research object operates continuously.

The ECS-ARIMA forecasting model includes the moving average process (MA), the autoregressive process (AR), the autoregressive moving average process (ARMA), and the ARIMA process, depending on whether the time series of ECS is stationary or not and what the regression contains.

3.1.2. Moving Average: MA (q). A qth-order moving average process MA (q) is expressed as follows:

$$Y_t = u + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q} \qquad (1)$$

where Y_t is the current value of ECS, u is a constant term, {ε_t} is the white noise sequence of ECS, and θ_1, θ_2, ..., θ_q are the moving average coefficients.

3.1.3. Autoregression: AR (p). A pth-order autoregressive process AR (p) is expressed as follows:

$$Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + v_t \qquad (2)$$

where Y_{t-1}, Y_{t-2}, ..., Y_{t-p} are, respectively, the lag-1, lag-2, ..., and lag-p values of the ECS time series, c is a constant term, and {v_t} is a white noise process.

3.1.4. The Autoregressive Moving Average: ARMA (p, q). If the MA (q) process is merged with the AR (p) process, the ARMA (p, q) process is obtained, which has the following form:

$$Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q} + \varepsilon_t \qquad (3)$$

where the meaning of each parameter is the same as those mentioned above.

3.1.5. Autoregressive Integrated Moving Average: ARIMA (p, d, q). A time series can be transformed into a stationary sequence by one or more differences. If a time series of ECS {Y_t} is transformed into a stationary sequence {W_t} with d differences, {Y_t} is a nonstationary sequence of order d. The ARMA (p, q) process is then established for {W_t}; in this way, {W_t} is an ARMA (p, q) process, and {Y_t} is an ARIMA (p, d, q) process.

3.1.6. Seasonal Autoregressive Integrated Moving Average. When the time series of e-commerce sales is both trendy and seasonal, the series has a correlation that is an integer multiple of the seasonal period. Some appropriate stepwise differencing and seasonal differencing of the series is usually performed to make the series stationary; the SARIMA (p, d, q)(P, D, Q)_s model should be adopted for this kind of time series, where P and Q are the seasonal autoregressive and moving average orders, D is the seasonal differencing order, and s is the seasonal cycle. p, d, q are the same as in the ARIMA (p, d, q) model mentioned in Section 3.1.1.

The (S)ARIMA model is good at linear fitting and forecasting, because it is both a linear combination of the historical residuals of the data set and a linear regression on the lagged items of the time series, whether in the MA, AR, or ARMA process.

3.2. NARNN for ECS Forecasting

3.2.1. Basic Theory. NARNN stands for nonlinear autoregressive neural network [26]. The NARNN is used in forecasting of ECS to build an ECS-NARNN forecasting model. The model can continuously learn and train based on past values of a given time series of ECS to predict future values, so it has a good memory function. The components of the ECS-NARNN model include the input neuron(s), the input layer(s), the hidden layer(s), the output layer(s), and the output neuron(s). The basic framework is shown in Figure 1.

Figure 1 shows an example NARNN: y(t) is the input and output of the neural network, that is, the time series of e-commerce sales; H_j is the output of the hidden layer; 1:n represents the delay order (1:3 is shown in the figure), where n can be calculated by formula or obtained by repeated trials; w is the link weight; b is the threshold; l is the number of hidden layer neurons; and f is the activation function of the hidden and output layers.

The basic assumptions of the ECS-NARNN forecasting model are as follows:
(1) The time series of ECS follow the basic assumptions of the traditional NARNN model.
(2) Other basic assumptions are the same as assumptions (2)–(5) in Section 3.1.1 of the ECS-ARIMA model.

It can be described as follows:

$$y(t) = a_0 + a_1 y(t-1) + a_2 y(t-2) + a_3 y(t-3) + \cdots + a_n y(t-n) + e(t) \qquad (4)$$

where y(t), y(t-1), y(t-2), y(t-3), ..., y(t-n) are, respectively, the lag-0, lag-1, ..., and lag-n values of the ECS sequence, and e(t) is white noise. It can be seen from the equation that the output at the next moment depends on the last n moments.

Based on the principle of autoregression, the following NARNN is used in the model:

$$y(t) = f(y(t-1), y(t-2), y(t-3), \ldots, y(t-n)) \qquad (5)$$

Delay in the ECS-NARNN is the time delay of the output signal. Because it is a regression based on its own data, the ECS-NARNN takes the delayed output as the input of the network and calculates the output of the network through the hidden layer and the output layer. The input signal of the network is represented by y_i. The hidden layer calculates the output H_j of each neuron based on the connection weights w_ij and the thresholds b_j between the input data and the hidden layer neurons:

$$H_j = f\left(\sum_{i=1}^{n} w_{ij} y_i + b_j\right), \quad j = 1, 2, \ldots, l \qquad (6)$$

where H_j is the output of the jth neuron in the hidden layer, i indexes the ith input e-commerce sales datum, n is the number of input e-commerce sales values, j indexes the jth hidden layer neuron, l is the number of hidden layer neurons, f is the activation function of the hidden layer, y_i is the input of the ECS time series to the network, w_ij is the connection weight between the ith output delay signal and the jth neuron of the hidden layer, and b_j is the threshold of the jth hidden neuron.

The output layer gets the output y(t) of the network through a linear calculation based on the output H_j of the hidden layer:

$$y(t) = f\left(\sum_{j=1}^{l} H_j w_j + b\right) \qquad (7)$$

where y(t) is the output of the network, f is the activation function of the output layer, w_j is the connection weight between the jth neuron in the hidden layer and the neuron in the output layer, and b is the threshold of the output layer neuron.

3.2.2. Forecasting of the ECS-NARNN Model. NARNN predictions in the ECS-NARNN are recursive. The main idea of this method is to recycle the one-step-ahead predictive value. The basic idea is that, for the NARNN model y(t_n) = f(y(t_n - 1), y(t_n - 2), ..., y(t_n - d)), the first prediction value y(t_n) can be obtained when k = 1; when it is added to the original samples (y(t_n - d), ..., y(t_n - 2), y(t_n - 1)), a new sample y(t_n + 1) can be obtained as the two-step prediction value; then we use the nonparametric model to get the new samples (y(t_n - d), ..., y(t_n - 2), y(t_n - 1), y(t_n), y(t_n + 1)), repeating this cycle until k is reached. Since the information contained in y(t_n), y(t_n + 1), ..., y(t_n + k - 1) is used for the prediction of the kth step (k > 1), the recursive prediction method is better than the direct prediction method.

It can be seen that the ECS-NARNN can perform nonlinear autoregressive prediction based on time series lag values, and the NARNN and the all-regression neural network can transform into each other, so the ECS-NARNN has significant performance in nonlinear mapping and prediction.

3.3. The ARIMA-NARNN Combined Model. Based on the above basic theories of the ARIMA and NARNN models, a time series of ECS y_t can be considered as comprising a linear autocorrelation structure L_t and a nonlinear component N_t. Therefore, y_t can be expressed as

$$y_t = L_t + N_t \qquad (8)$$

The basic assumptions of the ARIMA-NARNN combined model are as follows:
(1) The ECS-ARIMA model can fully extract the linear component of the time series of ECS.
(2) The fitting residuals of the ECS-ARIMA model contain a great deal of the nonlinear component of the time series of ECS.
(3) Other basic assumptions are the same as assumptions (2)–(5) in Section 3.1.1 of the ECS-ARIMA model.

The steps of the ARIMA-NARNN combined model to predict are as follows.

Step 1. The linear component $\hat{L}_t$ is predicted by modeling and forecasting the true time series of ECS y_t using the ARIMA model.

Step 2. The predicted residuals of the ARIMA model are obtained by

$$e_t = y_t - \hat{L}_t \qquad (9)$$

The sequence e_t implies a nonlinear relationship in the original time series:

$$e_t = f(e_{t-1}, e_{t-2}, \ldots, e_{t-n}) + \varepsilon_t \qquad (10)$$

where ε_t is a random error, e_{t-1}, e_{t-2}, ..., e_{t-n} are, respectively, the lag-1, lag-2, ..., and lag-n values of e_t, and f is the nonlinear autoregressive function.

Step 3. The NARNN model is used to approximate the nonlinear function f, and then the sequence e_t is predicted with the NARNN model. We denote the predicted result by $\hat{N}_t$.

Step 4. Combining the two models, the final prediction result of the ARIMA-NARNN combined model is as follows, where $\hat{y}_t$ is the predicted sequence of ECS time series data:

$$\hat{y}_t = \hat{L}_t + \hat{N}_t \qquad (11)$$

The above process is shown in Figure 2.
Figure 4: Autocorrelation and partial autocorrelation function graphs of the first-order differenced sequence.

Figure 5: Real values and ARIMA model predictive values.

the number of hidden layer neurons from 5 to 40, respectively (debugging shows that when the number of neurons exceeds 40, network training time becomes longer). The analysis shows that the number of goods to be predicted for Jingdong is too large to allow an extended training time; that is, beyond 40 neurons, the improvement in model predictive performance is not enough to make up for the loss in training time. Since the input weights and thresholds directly affect the performance of the neural network, each model is trained 10 times, and the root mean square error (RMSE) and the coefficient of determination R² of the training results are recorded, as shown in Figure 6.

From Figure 6, as the number of hidden neurons increases, RMSE decreases at first and R² increases. A change in the opposite direction begins to occur after reaching 20, indicating that, as the number increases, the tendency to overfit results in reduced ability to generalize, and we finally get the best number of hidden layer neurons, 17. Network training and debugging results are shown in Figures 7 and 8.

From Figure 7, the error autocorrelation coefficient exceeds the 95% confidence interval only when the delay lag order is 0. The correlation coefficients of the other orders are within the 95% confidence intervals, so the relevant information has been fully extracted. This illustrates this
Figure 6: R² and MSE against the number of hidden layer neurons.

Figure 7: The network residual autocorrelation map when the number of hidden neurons is 17.

model is fit for purpose, and we then get the error graph shown in Figure 8. The final result is shown in Figure 9.
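The tuning procedure described above (candidate hidden layer sizes from 5 to 40, repeated training runs per size, recording RMSE and R²) can be sketched as follows. This is an assumed reconstruction: a scikit-learn MLP on lagged values stands in for the MATLAB-style NARNN, the series is synthetic, and only 3 restarts per size are run (the paper uses 10) to keep the sketch fast.

```python
# Sketch of hidden-neuron selection: for each candidate size, train
# several randomly initialised networks and keep the best RMSE / R^2.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error, r2_score

def lag_matrix(y, n_lags):
    """Stack lagged values y(t-1..t-n) as inputs, y(t) as target."""
    X = np.column_stack([y[i:len(y) - n_lags + i] for i in range(n_lags)])
    return X, y[n_lags:]

rng = np.random.default_rng(1)
t = np.arange(120)
series = 200 + 15 * np.sin(t / 4) + rng.normal(0, 5, 120)  # synthetic
X, target = lag_matrix(series, n_lags=3)

scores = {}
for n_hidden in range(5, 41, 5):        # candidate sizes 5, 10, ..., 40
    best = None
    for seed in range(3):               # restarts (10 in the paper)
        net = MLPRegressor(hidden_layer_sizes=(n_hidden,),
                           max_iter=2000, random_state=seed).fit(X, target)
        pred = net.predict(X)
        rmse = mean_squared_error(target, pred) ** 0.5
        if best is None or rmse < best[0]:
            best = (rmse, r2_score(target, pred))
    scores[n_hidden] = best

best_size = min(scores, key=lambda k: scores[k][0])
```

Random initialisation of weights and thresholds is exactly why several restarts per size are needed before the sizes can be compared fairly.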
Figure 9: Actual values and predicted values from ECS-NARNN.

Figure 10: Error correlation diagram of using NARNN to fit the residual of ARIMA.

Figure 11: Error of using NARNN to fit the residual of ARIMA.

and prediction error of the ECS-ARIMA, ECS-NARNN, and ARIMA-NARNN combined models, and then evaluate the predictive performance of each model.

Figure 12: The prediction of using the ARIMA-NARNN combined model and actual values.

The predicted and actual values of the three models are compared as shown in Figure 13. From Figure 13, the predicted values of the three models in the test set fit well with the real values, and the prediction performance is also good. The ARIMA-NARNN model has a higher fitting degree between predicted and real values. However, this result is only for a single time series; in order to verify the universal superiority of the ARIMA-NARNN model, the same analysis process will be performed on the other 59 time series of different e-commerce items' sales. The types of items include food, beverages, household appliances, fresh food, household goods, baby products, toys, home textiles, clothing, footwear, etc.

4.4. Model Comparison and Discussion. In the 60 time series of different e-commerce products, the trend and seasonality are not exactly the same. For different time series features, different analysis methods (i.e., ARIMA or SARIMA) will be used. The final analysis results are shown in Figure 14.

It can be seen in Figure 14 that the RMSE of ARIMA-NARNN is generally lower than that of ARIMA and NARNN. In order to quantitatively compare the effects of the three models, the averages of the MRE and RMSE for the 60 e-commerce products are computed, as shown in Table 2.
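The two comparison metrics can be written out explicitly. The paper does not spell out its MRE formula, so the mean relative error definition below is an assumption; RMSE is standard. The sketch averages each metric over a collection of (actual, predicted) series pairs, one pair per product, as in Table 2.

```python
# Error metrics for the model comparison: RMSE and (assumed) mean
# relative error, averaged over many product series.
import numpy as np

def rmse(actual, pred):
    actual, pred = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((actual - pred) ** 2)))

def mre(actual, pred):
    # Mean relative error |y - y_hat| / |y|; assumes nonzero actuals.
    actual, pred = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.mean(np.abs(actual - pred) / np.abs(actual)))

def average_metrics(series_pairs):
    """Average RMSE and MRE over (actual, predicted) pairs per product."""
    r = [rmse(a, p) for a, p in series_pairs]
    m = [mre(a, p) for a, p in series_pairs]
    return float(np.mean(r)), float(np.mean(m))

# Two toy product series, for illustration only.
pairs = [([100, 120, 90], [110, 115, 95]), ([50, 60, 55], [48, 63, 54])]
avg_rmse, avg_mre = average_metrics(pairs)
```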
Table 2: Average of the MRE and RMSE for 60 e-commerce products of fitting and prediction.
Figure 13: Actual values and the predicted values from the three models.

Figure 14: RMSE of the 60 time series of e-commerce products from the ARIMA, NARNN, and ARIMA-NARNN models.