Submitted by S. PRIYA, M.Phil (Computer Science), Avinashilingam University for Women, 30/07/2011
Predictive Models for Total Electron Content

INTRODUCTION

The ionosphere is defined as a region of the earth's upper atmosphere where sufficient ionisation can exist to affect the propagation of radio waves. Total Electron Content (or TEC) is an important descriptive quantity for the ionosphere of the Earth: it is the total number of electrons present along a path between two points, with units of electrons per square meter, where 10^16 electrons/m² = 1 TEC unit (TECU). This ionospheric characteristic constitutes an important parameter in trans-ionospheric links, since it is used to derive the signal delay imposed by the ionosphere, and TEC is strongly affected by solar activity. Prediction of ionospheric total electron content (TEC) is therefore crucial, and remains a challenge, for GPS positioning and navigation systems, space weather forecasting, and many other Earth observation systems.

A predictive model is made up of a number of predictors, which are variable factors that are likely to influence future behavior or results. In marketing, for example, a customer's gender, age, and purchase history might predict the likelihood of a future sale. In predictive modeling, data is collected for the relevant predictors, a statistical model is formulated, predictions are made, and the model is validated (or revised) as additional data becomes available.

MAIN PURPOSES OF A PREDICTIVE MODEL

1. The discovery of valuable information about the probability distribution that generated the data.
2. Making predictions about new incoming data.

Even descriptive models make predictions: a histogram predicts the value of the density of a distribution where a new observation happens to fall, and a regression model predicts the value of the response variable y for any new value of the predictor x.
Figure 1 shows the proposed system design for the prediction of Vertical Total Electron Content (VTEC): the input data series Xn(t), Xn-1(t), Xn-2(t), ..., Xn-k(t) is fed to four predictors (Neural Networks, KNN-AC, KNN-AD and LPC), the predicted values Xn-k+1(t), Xn-k+2(t), ..., Xn-k+m(t) are passed to a performance evaluation step, and the best-performing method is selected as the winner.

[Figure 1. Proposed System Design]

The research methodology consists of three stages:

Stage 1: Preparation of the input data into training and testing sets
Stage 2: Prediction of future values
Stage 3: Comparison of the results with respect to prediction efficiency

In Stage 1, given an input dataset X with n VTEC values obtained over a period of time t, the proposed methodology first divides X into two sets, namely a training dataset and a testing dataset. To make meaningful forecasts, the predictors have to be trained on an appropriate data series; in the present research work a 70%/30% division was adopted, that is, 70% of the records in X are taken as training data and the remaining 30% as testing data. Data in the form of <input, output> pairs are extracted from X, where input and output are vectors equal in size to the number of network inputs and outputs, respectively.
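The Stage 1 division can be sketched as follows; the function name `split_series` and the sample series are hypothetical illustrations, not code from the thesis. The split is chronological (first 70% for training), since the data is a time series.

```python
import numpy as np

def split_series(X, train_frac=0.7):
    """Split a VTEC series into training and testing portions.

    The first 70% of the records (in chronological order) form the
    training set and the remaining 30% the testing set, as in Stage 1.
    """
    n_train = int(len(X) * train_frac)
    return X[:n_train], X[n_train:]

# Hypothetical series standing in for the n VTEC values in X
vtec = np.arange(10, dtype=float)
train, test = split_series(vtec)
```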
DATASET USED

The TEC data are obtained using the dual-frequency GPS receivers installed for GAGAN, the satellite navigation project of ISRO. The TEC dataset used in this research work contains 1065 records, recorded at Agatti in January 2008 with an elevation angle of 60°, with the columns IST time, longitude, latitude, VTEC (Vertical Total Electron Content) and STEC (Slant Total Electron Content). The VTEC values are taken for prediction; 70% of the dataset is used for training and 30% for testing.

PREDICTION METHODS

1. Prediction using K Nearest Neighbor with Correlation method
2. Prediction using K Nearest Neighbor with Absolute Distance method
3. Prediction using Neural Network method
4. Prediction using Linear Predictive Coding

PREDICTION USING K NEAREST NEIGHBOR WITH CORRELATION METHOD

Input
X - Input vector of VTEC values
D - Defines where to start the forecasts (specify the column number of the VTEC values to predict)
M - Embedding dimension
k - Number of nearest neighbors to use in the forecast calculation

Output
Insamp_for_corr - Predicted VTEC values

Procedure
The K Nearest Neighbor method is usually based on the identification of several historic neighbors, which are then used for forecasting either by averaging their contributions or by using an extrapolation method. The accuracy of the method is directly dependent on the ability to identify good neighbors.
Steps for KNN Correlation:

Step 1: Define a starting training period and divide that period into VTEC vectors, represented as X_t^m of size m, one for each observation, where T is the number of observations in the training period. The term m is also defined as the dimension (embedding dimension) of the time series. In formal notation, the training VTEC vector is denoted X_1^(m-k1), where k1 is the number of training samples, and the testing vector is denoted X^(m+k2), where k2 is the number of predictions to be made.

Step 2: Select the k1 observations that are most similar to the training VTEC vector Y_t^m. For the method of correlation, the k vectors with the highest value of |ρ| are selected, where

|ρ|(X_i^m, X_T^m)    (1)

represents the absolute correlation between Y_i^m and Y_t^m.

Step 3: With the k1 data on hand, it is necessary to understand in which way the k vectors can be used to construct the forecast at t+1: the observations ahead of the k chosen neighbors are taken and averaged.

PREDICTION USING K NEAREST NEIGHBOR WITH ABSOLUTE DISTANCE METHOD

Input
X - Input vector of VTEC values
D - Defines where to start the forecasts (specify the column number of the VTEC values to predict)
M - Embedding dimension
k - Number of nearest neighbors to use in the forecast calculation

Output
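The correlation-based steps above can be sketched as a one-step forecast. This is an illustrative reading of the method, not the thesis code: the function name `knn_corr_forecast` is hypothetical, and ties in |ρ| are broken by chronological order.

```python
import numpy as np

def knn_corr_forecast(x, m=3, k=2):
    """One-step KNN forecast using absolute correlation (sketch).

    Embeds the series into overlapping vectors of length m, ranks the
    historical vectors by |corr| with the most recent vector, and
    averages the observations that followed the k best neighbors.
    """
    last = x[-m:]                        # the current (most recent) m-vector
    scores = []
    for i in range(len(x) - m):          # each neighbor must have a successor
        window = x[i:i + m]
        rho = np.corrcoef(window, last)[0, 1]
        scores.append((abs(rho), x[i + m]))
    scores.sort(key=lambda s: s[0], reverse=True)
    return float(np.mean([succ for _, succ in scores[:k]]))
```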
Insamp_For_Abs - Predicted VTEC values

Procedure
The steps in the KNN Absolute Distance algorithm are presented below.

Step 1: Define a starting training period and divide that period into VTEC vectors, represented as X_t^m of size m, one for each observation, where t is the number of records (1, ..., n-1). The term m is also defined as the dimension of the time series. The training VTEC vector is denoted X_1^(m-k1), where k1 is the number of training samples, and the testing vector is denoted X^(m+k2), where k2 is the number of predictions to be made.

Step 2: Select the k1 observations that are most similar to the training VTEC vector X_1^(m-k1).

Step 3: With the k1 data on hand, it is necessary to understand in which way the k vectors can be used to construct the forecast at t+1: the absolute distance method simply verifies the observations ahead of the k chosen neighbors and takes their average.

Steps 1-3 are executed in a loop until all forecasts at t+1 have been created.

PREDICTION USING NEURAL NETWORK METHOD

Input
y - A time series (VTEC values) in vertical vector form
maxlag - Maximum lag that should be entered in the model (e.g. 5)
nhiden - Number of hidden layer units
trset - Percentage of observations for the training set
HPF - Number of periods that should be forecasted
lr - Learning rate, a control parameter of some training algorithms, which controls the step size when the weights are iteratively adjusted
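The absolute-distance variant differs from the correlation variant only in how neighbors are ranked. A minimal sketch, with the hypothetical name `knn_absdist_forecast`, ranking by Euclidean distance as the text describes:

```python
import numpy as np

def knn_absdist_forecast(x, m=3, k=2):
    """One-step KNN forecast using absolute (Euclidean) distance (sketch).

    Finds the k historical m-vectors closest to the latest m-vector
    and averages the observations that followed them.
    """
    x = np.asarray(x, dtype=float)
    last = x[-m:]
    cand = []
    for i in range(len(x) - m):          # each neighbor must have a successor
        d = np.linalg.norm(x[i:i + m] - last)
        cand.append((d, x[i + m]))
    cand.sort(key=lambda c: c[0])        # smallest distance first
    return float(np.mean([succ for _, succ in cand[:k]]))
```

Unlike the correlation variant, this ranking prefers neighbors close in level as well as in shape.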
Output
yf - Forecast values of VTEC
yL - Matrix of y's lags
RMSE - Root mean squared error
minRMSE - Minimum of the root mean squared errors

Procedure
The backpropagation training stage uses the training VTEC dataset and trains the network in three steps:

1. Present an input VTEC vector to the network for training.
2. Compute the activation functions sequentially forward from the first hidden layer to the output layer (Figure 4.1, from layer A to layer C).
3. Propagate the error sequentially backward from the output layer to the first hidden layer (Figure 4.2, from layer C to layer A) and, for every connection, change the weight modifying that connection in proportion to the error.

When these three steps have been performed for every input from the training dataset, one epoch occurs. Training usually lasts thousands of epochs, possibly until a predetermined maximum number of epochs (epochs limit) is reached or the network output error (error limit) falls below an acceptable threshold, and can be time-consuming, depending on the network size, the number of examples, the epochs limit, and the error limit.

In the first step, when a given input VTEC vector is presented, the network propagates values through all units to the output(s): for each layer, starting with the first hidden layer, and for each unit in that layer, the output of the unit's activation function is computed (Equations 4.1 and 4.2).

In the second step, the difference between the desired output and the actual network output (the output of the unit(s) in the output layer) is computed. For each unit in the output layer, the error term in Equation (2) is computed:

δ_c = h′(x)(D_c − O_c)    (2)

where D_c is the desired network output (from the output vector) corresponding to the current output layer unit, O_c is the actual network output corresponding to the current output layer unit, and h′_Output(x) is the derivative of the output unit's linear activation function. For each unit in the hidden layers, the error term in Equation (3) is computed:

δ_c = h′(x) Σ_{n=1..N} δ_n w_{n,c}    (3)

where N is the number of units in the next layer (either hidden or output layer), δ_n is the error term for a unit in the next layer, w_{n,c} is the weight modifying the connection from unit c to unit n, and h′(x) is the derivative of the hidden unit's sigmoid activation function, O_k(1 − O_k).

In the third step, for each layer and for each connection, the value of Equation (4) is computed and added to the weight:

Δw_{c,p} = α δ_c O_p    (4)

which represents the change in the weight modifying the connection from unit p to unit c, where α is the learning rate (discussed later) and O_p is the output of unit p or the network input p. Thus, after step three, each weight will have a different value.

The goal of backpropagation training is to converge to a near-optimal solution based on the total squared error calculated in Equation (5).
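The three backpropagation steps can be sketched for a single-hidden-layer network with a sigmoid hidden layer and one linear output unit, matching the error terms above (for a linear output, h′(x) = 1 in Equation 2). The function names, network shape, and learning rate below are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_epoch(X, D, W1, W2, lr=0.1):
    """One backpropagation epoch (sketch).

    W1: hidden-layer weights, shape (n_hidden, n_inputs)
    W2: output weights for one linear unit, shape (n_hidden,)
    """
    for x, d in zip(X, D):
        h = sigmoid(W1 @ x)                     # forward: hidden activations
        o = W2 @ h                              # forward: linear output
        delta_o = d - o                         # Eq. (2) with h'(x) = 1
        delta_h = h * (1 - h) * (W2 * delta_o)  # Eq. (3): O_k(1-O_k) * sum
        W2 += lr * delta_o * h                  # Eq. (4): dw = alpha*delta*O_p
        W1 += lr * np.outer(delta_h, x)
    return W1, W2
```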
E = (1/2) Σ_{c=1..C} (D_c − O_c)²    (5)

where C is the number of units in the output layer, D_c is the desired output corresponding to the current output layer unit, and O_c is the actual network output corresponding to the current output layer unit.

PREDICTION USING LINEAR PREDICTIVE CODING

LPC uses linear prediction to extrapolate data; note that this is not the same as linear extrapolation. A window of autocorrelation coefficients is moved beyond the data limits to extrapolate the data.

Input
X - The input VTEC data series, typically a time series, as a column vector or a matrix with the series organized in columns
np - The number of predictor coefficients to use (>= 2)
npred - The number of data values to return in the output
pos - A string, 'pre' or 'post' (default: 'post'), which determines whether extrapolation occurs before or after the observed series X

Output
y - The output, appropriately sequenced for concatenation with the input X. The output y is calculated using the formula

y(k) = −a(2)·y(k−1) − a(3)·y(k−2) − ... − a(np)·y(k−np)

where y(n) => x(end−n) for n <= 0, and a are the coefficients returned by LPC (organized in rows). These can be used to check the quality/stability of the fit of the LPC function to the observed data.

Procedure
…. x(n-2). .Given x(n-1). … n-M) = ∑a k =1 M k x( n − k ) (7) Using the equation 7 VTEC values are predicted. n-2. x(n-M). where ak is the constant coefficients. n-2. X(n | n-1. x(n-2). this predicted VTEC value can be expressed as a linear function of the given M past samples (Equation 6) . The above equation can be rewritten as a M-dimensional vector (Equation 7) X(n | n-1. then it is said to be predicted linearly. n-M) (8) Where M is the number of past samples that are used to predict the next VTEC values. … n-M) = ψ (x(n-1). The prediction error rate for VTEC is defined as fM(n) = x(n) – X(n | n-1. n-2. In LPC. … x(n-m)) (6) When a value is predicted using the above equation. the problem here is to predict the value of VTEC denoted as x(n). ….