
2013 Third International Conference on Intelligent System Design and Engineering Applications

The Research on Stock Price Forecast Model Based on Data Mining of BP Neural
Networks

Wu Ming-tao, Yang Yong
Economic and Management College, Northeast Petroleum University, Daqing, Heilongjiang, 163318, China
dqwumingtao@126.com

Abstract-This article puts forward a stock price forecast model based on data mining with BP neural networks. Following an integrated data mining process of data sample selection, data conversion, network modeling, network simulation, and evaluation of results, the model predicts the trend of the SSE (Shanghai Stock Exchange) Composite Index with relatively high accuracy. This indicates that data mining with BP neural networks has advantages in forecasting non-linear systems and offers a new approach to such forecasting.

Keywords-Neural networks; BP algorithm; Data mining; Stock price forecast

I. INTRODUCTION
Stock market forecasting is a branch of economic forecasting. It forecasts the future development of the stock market, based on accurate investigation of statistical data and stock market information, starting from the market's history, status quo and regularity, and applying scientific methods. At present, the most frequently used forecast methods are derived from linear methods such as moving averages and regression analysis. Although researchers have done much work on applying neural networks to stock price forecasting, existing research focuses mainly on constructing and optimizing the neural network model and on resolving technical issues, rather than on how to solve this kind of forecasting problem as a whole.
In this article, based on the process of data mining and regarding the neural network model as the key technology, the trend of the SSE (Shanghai Stock Exchange) Composite Index is quantitatively forecast through empirical analysis.

II. DATA MINING PROCESS BASED ON NEURAL NETWORK
A. BP Neural Networks Model
The artificial neural network is a newly emerging interdisciplinary subject, developed as a non-linear information processing system that imitates the structure and function of the human brain. The topological structure of the BP neural network is shown in Figure 1. Neurons are connected from the input layer to the hidden layer and from the hidden layer to the output layer, and the strength of each connection is determined by its corresponding weight. The activation function of the neurons in the hidden layer and the output layer acts on the input received from the preceding layer and produces the corresponding output.

Figure 1. The topological structure of BP neural network

B. Process of Data Mining
The general process of data mining with BP neural networks is determined by referring to the 5A model of SPSS and the SEMMA model of SAS, and by taking into account the characteristics of neural network data mining itself. See Figure 2.

Figure 2. The data mining process based on BP neural network

III. SAMPLE DATA
A. Selection of Sample Data
For the stock market, the selection of sample data follows two principles: the first is to choose samples that accord with trading rules and have distinct trading characteristics; the second is to take the performance of the neural network model into account. This test therefore selects the SSE (Shanghai Stock Exchange) Composite Index for 100 consecutive days, from Nov 30th, 2009 to Apr 29th, 2010, as the sample data to be mined, and divides it into training samples and test samples as needed.
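The division into training and test samples can be sketched as a chronological split. This is a minimal illustration, not the authors' tool: the function name `chronological_split` and the placeholder day indices are assumptions, and the 90/10 proportion is taken from Section VI. The series is split in time order without shuffling, so the test samples always follow the training samples.

```python
def chronological_split(samples, n_train):
    """Split a time-ordered sequence into (train, test) without shuffling."""
    if not 0 < n_train < len(samples):
        raise ValueError("n_train must be between 1 and len(samples) - 1")
    return samples[:n_train], samples[n_train:]


# Placeholder day indices stand in for the 100 daily records (not real data).
days = list(range(1, 101))
train_days, test_days = chronological_split(days, n_train=90)
print(len(train_days), len(test_days))
```

Keeping the order intact matters for a forecasting task, because a shuffled split would let the network train on days that come after the ones it is asked to predict.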

978-0-7695-4923-1/12 $26.00 © 2012 IEEE
DOI 10.1109/ISDEA.2012.366
B. Confirmation of Sample Vector
If sample vectors are selected indiscriminately, the data become numerous and jumbled, the system load increases, and the performance of the network degrades. Conversely, if too few indicators are selected, they cannot describe the characteristics of the stock market. Quantitative indicators that reflect the trading characteristics of the stock market should therefore be chosen.
By studying the indicators of the SSE (Shanghai Stock Exchange) Composite Index, eight sample vectors are determined; see Table 1. x1~x7 are the input vectors and x8 is the output vector.

Table 1. Sample vectors of BP neural network model

Sample vector    Implication
x1               Today's opening price
x2               Today's reserve price
x3               Today's ceiling price
x4               Today's turnover
x5               Today's transaction volume
x6               Today's price quantity
x7               Today's price
x8               Today's closing price

IV. DATA CONVERSION
A. Normalization of Data
In the process of data mining, the original sample data must be normalized, for two reasons. First, each input component of the network has a different meaning and unit of measurement, and normalization gives every input component equal status. Second, the neurons of a BP network commonly use the Sigmoid activation function, whose output generally lies between 0 and 1 or between -1 and 1; normalization prevents input data with too large an absolute value from saturating the neurons, which would cause the weight adjustment to go wrong. The values of the selected SSE (Shanghai Stock Exchange) Composite Index are large and could easily paralyze the BP model, so the sample data must be converted appropriately and normalized. The conversion relation is:

\[ x_i' = \frac{x_i - \min x}{\max x - \min x} \qquad (1) \]

Here x_i is the original sample value, \min x and \max x are the minimum and maximum of the samples respectively, and x_i' is the sample value after normalization. After normalization, all components are constrained to lie between 0 and 1, which meets the needs of network training and network testing in the next step.

B. Confirmation of Activation Function
The role of the activation function f(x) is to activate neurons and make them respond to input. To apply gradient descent to weight learning, the activation function must be differentiable. In practice, an appropriate activation function should be chosen as required, and the most commonly used one is the Sigmoid function. Owing to the nonlinearity of f(x), multilayer feedforward networks trained by the BP algorithm build a highly nonlinear mapping from input to output and can express complex objective phenomena. Furthermore, because the derivative of the Sigmoid function can be expressed in terms of f(x) itself, the derivative does not have to be computed separately during error back propagation, which reduces the amount of computation and improves the efficiency of the network. However, the Sigmoid function also saturates, which slows the convergence of the BP algorithm.
Through learning and training on the stock market sample data, three common Sigmoid functions are considered:

\[ f_1(x) = \frac{2}{1 + e^{-x}} - 1 \qquad (2) \]

\[ f_2(x) = \tanh x \qquad (3) \]

\[ f_3(x) = \frac{2}{\pi} \arctan(x) \qquad (4) \]

Table 2 lists the performance of each activation function. It shows that if f2(x) is used as the activation function of the hidden layer and the output layer, the network converges fastest.

Table 2. Comparison experiment of activation functions

Activation function    Times of iteration    MSE       VCR
f1                     2500                  27.750    80.6
f2                     2500                  11.550    93.4
f3                     2500                  15.865    90.1
f1                     4800                  19.997    90.3
f2                     950                   19.987    89.1
f3                     2100                  20.000    89.0

V. NETWORK MODELING
A. Modeling
Following the working principle of the BP model and the methods of software engineering, a neural network tool is implemented with VC++ as the development tool. It provides the essential modeling groundwork for the whole data mining process.
B. Optimization of Model
The model structure is optimized through quantitative comparative analysis and tests on large amounts of data, and several critical network parameters are set, such as the learning step, the number of hidden nodes and the activation function. This completes the optimization of the network model and confirms a network model suitable for stock price forecasting. The results of model optimization can be summarized as follows.
1) The topological structure of the network is 7 (x1~x7)-9-1 (x8);
2) The learning step adopts a variable step algorithm;
3) The momentum factor is 0.01, and the initial range of the weights and threshold values is (-0.05, 0.05);
4) The activation function is Sigmoid().
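The configuration above can be illustrated with a small sketch. The authors' VC++ tool is not shown in the paper, so everything here is an assumption made for illustration: the class name `BPNetwork`, the fixed learning rate of 0.05 (the paper uses a variable step, which is not reproduced), the random seed, and the synthetic training data. The sketch uses the 7-9-1 topology, tanh units (f2 in Equation (3)) in the hidden and output layers, a momentum factor of 0.01, initial weights and thresholds drawn from (-0.05, 0.05), and the min-max normalization of Equation (1).

```python
import math
import random


def min_max_normalize(values):
    """Equation (1): map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values]


class BPNetwork:
    """Feedforward net with tanh units, trained by backpropagation with momentum."""

    def __init__(self, n_in=7, n_hidden=9, n_out=1, lr=0.05, momentum=0.01, seed=0):
        rng = random.Random(seed)
        # Each row holds one neuron's weights; the last entry is its threshold.
        self.w1 = [[rng.uniform(-0.05, 0.05) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.05, 0.05) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]
        # Previous weight changes, needed for the momentum term.
        self.dw1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.dw2 = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
        self.lr = lr  # fixed step: a simplification of the paper's variable step
        self.momentum = momentum

    def forward(self, x):
        xb = list(x) + [1.0]
        h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = [math.tanh(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return h, y

    def train_step(self, x, target):
        """One online update; returns the squared error before the update."""
        h, y = self.forward(x)
        xb, hb = list(x) + [1.0], h + [1.0]
        # The derivative of tanh is 1 - tanh^2, so it reuses the unit's own output.
        delta_out = [(t - o) * (1.0 - o * o) for t, o in zip(target, y)]
        delta_hid = [
            (1.0 - h[j] * h[j])
            * sum(delta_out[k] * self.w2[k][j] for k in range(len(y)))
            for j in range(len(h))
        ]
        for k, d in enumerate(delta_out):
            for j, v in enumerate(hb):
                self.dw2[k][j] = self.lr * d * v + self.momentum * self.dw2[k][j]
                self.w2[k][j] += self.dw2[k][j]
        for j, d in enumerate(delta_hid):
            for i, v in enumerate(xb):
                self.dw1[j][i] = self.lr * d * v + self.momentum * self.dw1[j][i]
                self.w1[j][i] += self.dw1[j][i]
        return sum((t - o) ** 2 for t, o in zip(target, y)) / len(y)


# Synthetic inputs and targets in [0, 1]; these are not market data.
rng = random.Random(1)
inputs = [[rng.random() for _ in range(7)] for _ in range(20)]
targets = [[sum(x) / 7.0] for x in inputs]

net = BPNetwork()
first = sum(net.train_step(x, t) for x, t in zip(inputs, targets)) / len(inputs)
last = first
for _ in range(200):
    last = sum(net.train_step(x, t) for x, t in zip(inputs, targets)) / len(inputs)
print(first, last)
```

The line computing `delta_out` shows why the paper values activation functions whose derivative can be written in terms of the function itself: the backward pass only needs the outputs already computed in the forward pass.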
VI. NETWORK SIMULATION
Using the optimized BP network model, 90 consecutive business days, from Nov 30th, 2009 to Apr 15th, 2010, are chosen from the selected data samples as training samples. The model is then used to predict the closing price of the ten business days from Apr 18th, 2010 to Apr 29th, 2010.
A. Network Training
The training samples are imported into the optimized BP network model, the network is trained, and the trained network structure and training results are saved. The analysis of network training gives the following results: the maximum absolute error is 17.01; the minimum absolute error is 0.03; the maximum proportional error is 1.366%; the minimum proportional error is 0.005%. These results show that the training effect of the network is good and that the trained network structure can be used for forecasting in the next step.
B. Network Testing
The closing price index is forecast by importing the trained network and the test samples into the neural network simulation tool. To analyze the prediction accuracy of the BP model, the predicted values and true values are imported into SPSS to produce a time series chart. See Figure 3.

Figure 3. Time series chart of closing price index

Figure 3 shows that the predicted values of the network model closely follow the actual closing prices, which indicates that the technology based on data mining with neural networks is feasible and practical for stock price prediction.
VII. RESULTS EVALUATION
At the last phase of data mining, knowledge of the professional field should be applied to analyze and evaluate the data mining results, so that they meet the needs of practical use.
A. Predictive Effect
Figure 3 shows that the predicted values of the model move in accordance with the changes in the actual values, which indicates that the model established in this test can imitate the short-term tendency of the stock market. Further analysis of prediction accuracy shows that the short-term prediction accuracy of the model exceeds 94%. Therefore, provided there is no significant change in the macroeconomic environment, this prediction model gives investors a good basis for decision making.
B. Limitation
To analyze the limitations of the model, a histogram of prediction error is given in Figure 4. The chart shows clearly that, as time goes on, the absolute error and proportional error between the predicted values and the true values tend to rise overall, which shows that the model does not suit the prediction of medium- and long-term tendencies. Combining the features of the BP model with the major factors that influence the tendency of stock prices, the reasons why data mining with neural networks is ineffective for medium- and long-term prediction can be summarized as follows:
1) In this trial, some critical factors that influence the stock market, such as the economic sentiment index, the external market environment and emergency incidents, are not included in the input vectors. This clearly affects the prediction accuracy of the model to some extent.
2) The samples may contain some random samples that do not fit the laws of economics. These make the network model learn particular features that do not generalize to the general situation.
3) Although good learning accuracy can be obtained when the network model converges to its extreme value, this causes overfitting of the sample space and leads to poor generalization to the prediction samples.

Figure 4. Histogram of prediction error

In addition, the fierce fluctuation of China's stock market and the combination of various factors make long-term changes in the stock index uncertain. To improve the prediction accuracy, the BP network model can be further optimized, and greater practicality can be achieved by combining it with other technologies. This is also a major research direction at present.
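The absolute and proportional errors used in the training analysis and in the limitation analysis can be computed as sketched here. The paper does not give a formula for the proportional error, so the definition used (absolute error divided by the actual value, in percent) is an assumption, as are the function name `prediction_errors` and the made-up index values.

```python
def prediction_errors(actual, predicted):
    """Return one (absolute error, proportional error in percent) pair per day."""
    if len(actual) != len(predicted):
        raise ValueError("series must have equal length")
    result = []
    for a, p in zip(actual, predicted):
        abs_err = abs(p - a)
        # Assumed definition: absolute error relative to the actual value.
        result.append((abs_err, 100.0 * abs_err / abs(a)))
    return result


# Made-up index values for illustration only; not the paper's data.
actual = [3000.0, 3010.0, 2990.0]
predicted = [3006.0, 3001.0, 3020.0]

errors = prediction_errors(actual, predicted)
max_abs = max(e[0] for e in errors)
max_pct = max(e[1] for e in errors)
print(max_abs, round(max_pct, 3))
```

Listing the per-day errors in time order, rather than only their maximum and minimum, is what makes a rising trend such as the one described for Figure 4 visible.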

VIII. CONCLUSIONS
This article applies data mining with neural networks to stock price forecasting and obtains good short-term prediction results, which offers a new line of thought for research on the development of the stock market.

Note: This article is funded by the Higher School Humanities and Social Science Research Base Project of Heilongjiang Province (JD20121212).
