Applicability of Data Mining Techniques for Climate Prediction – A Survey Approach

Published by: ijcsis on Jun 30, 2010
Copyright: Attribution Non-commercial

Dr. S. Santhosh Baboo
Reader, PG and Research Department of Computer Science, Dwaraka Doss Goverdhan Doss Vaishnav College, Chennai
santhos2001@sify.com

I. Kadar Shereef
Head, Department of Computer Applications, Sree Saraswathi Thyagaraja College, Pollachi
kadarshereef@gmail.com
 Abstract 
British mathematician Lewis Fry Richardson first proposed numerical weather prediction in 1922. Richardson attempted to perform many kinds of low-complexity numerical forecasts before World War II. The first successful numerical prediction was performed in 1950 on the ENIAC digital computer by a team that included the American meteorologists Jule Charney, Philip Thompson, and Larry Gates and the Norwegian meteorologist Ragnar Fjørtoft. Climate prediction is a challenging task for researchers and has drawn considerable research interest in recent years. Many government and private agencies are working to predict the climate. In recent years, more intelligent weather forecasting based on Artificial Neural Networks (ANNs) has been developed. Two major knowledge discovery areas are (a) data analysis and mining, which extracts patterns from massive volumes of climate-related observations and model outputs, and (b) data-guided modeling and simulation (e.g., models of water and energy or other assessments of impacts), which takes downscaled outputs as its inputs. In this paper, we survey some of the most widely used data mining techniques and methodologies for climate prediction, which nevertheless remains a challenging task. At the end of the survey, we provide recommendations for future research directions.
 Keywords
Weather Forecasting, Climate Prediction, Temperature Control, Neural Network, Fuzzy Techniques, Knowledge Discovery, Machine Learning, Data Mining.
 
I. INTRODUCTION
Data mining is the process of extracting important and useful information from large data sets [1]. In this survey, we focus on the application of data mining techniques to weather prediction, which is now an emerging research field. This work provides a brief overview of the data mining techniques applied to it.

Data mining techniques provide a level of confidence in the predicted solutions in terms of the consistency of prediction and the frequency of correct predictions. Data mining techniques include statistics, machine learning, decision trees, hidden Markov models, artificial neural networks, and genetic algorithms. Broadly, data mining tasks can be classified into frequent-pattern mining, classification, clustering, and constraint-based mining [2].

Classification techniques are designed to classify unknown samples, that is, samples whose classification is unknown, using the information provided by a set of already classified samples. This set is usually referred to as a training set, because it is generally used to train the classification technique. Neural networks and Support Vector Machines learn from a training set how to classify unknown samples. The K-nearest neighbor classification technique has no learning phase, because it uses the training set every time a classification must be performed. For this reason, K-nearest neighbor is referred to as a lazy classifier.

A major generic challenge in climate data mining results from the nature of historical observations. In recent years, climate model outputs and remote or in situ sensor observations have grown rapidly. However, for climate and geophysics, historical data may still be noisy and incomplete, with uncertainty and incompleteness typically increasing deeper into the past.
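The distinction drawn earlier in this section between classifiers that learn from a training set and the lazy K-nearest neighbor classifier can be illustrated with a short sketch. The function name knn_classify, the two features (temperature and relative humidity), and the sample values are hypothetical and chosen only for illustration; they are not taken from any of the surveyed papers.

```python
from collections import Counter
import math


def knn_classify(training_set, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `training_set` is a list of (features, label) pairs. No model is built
    in advance: every call scans the whole training set, which is why the
    technique is called a lazy classifier.
    """
    # Rank all training samples by Euclidean distance to the query.
    nearest = sorted(training_set, key=lambda s: math.dist(s[0], query))[:k]
    # Majority vote over the labels of the k nearest samples.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical samples: (temperature in deg C, relative humidity in %) -> label
training_set = [
    ((30.0, 40.0), "dry"),
    ((31.0, 45.0), "dry"),
    ((29.0, 42.0), "dry"),
    ((24.0, 90.0), "rain"),
    ((23.0, 85.0), "rain"),
    ((25.0, 88.0), "rain"),
]

print(knn_classify(training_set, (24.5, 87.0)))
```

Because all the work happens at query time, the cost of each classification grows with the size of the training set, which matters for the massive data volumes discussed in this section.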
Therefore, in climate data mining the need to develop scalable solutions for massive geographical data coexists with the need to develop solutions for noisy and incomplete data [3].

The remainder of the paper is organized as follows. Section II presents related work that applies data mining techniques to climate prediction. Section III briefly outlines directions for future enhancement. Section IV concludes the paper with a short discussion.

II. RELATED WORK
Data mining and its applications have been used in many research areas, and the field is flourishing. Different techniques have been applied for mining data over the years. Qiang Yang and Xindong Wu [4] discussed ten important challenging problems in data mining research. The ten most used data mining techniques are also discussed in [4].

Ganguly et al. [3] explained the necessity of data mining for climate change and its impacts. Knowledge discovery from temporal, spatial and spatiotemporal data is critical for climate change science and climate impacts. Climate statistics is an established area. Nevertheless, recent growth in observations and model outputs, combined with the increased availability of geographical data, presents new opportunities for data miners. Their paper maps climate requirements to solutions available in temporal, spatial and spatiotemporal data mining. The challenges result from long-range, long-memory and possibly nonlinear dependence, nonlinear dynamical behavior, the presence of thresholds, the importance of extreme events or extreme regional stresses caused by global climate change, uncertainty quantification, and the interaction of climate change with the natural and built environments. Their paper makes a case for the development of novel algorithms to
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 1, April 2010, p. 203, http://sites.google.com/site/ijcsis/, ISSN 1947-5500
 
address these issues, discusses the recent literature, and proposes new directions. An illustrative case study presented there suggests that even relatively simple data mining approaches can provide new scientific insights with high societal impact.

Shyi-Ming Chen and Jeng-Ren Hwang [5] proposed a new fuzzy time series model, the two-factors time-variant fuzzy time series model, to deal with forecasting problems, and developed two algorithms for temperature prediction based on it. They first presented a one-factor time-variant fuzzy time series model and an algorithm, called Algorithm-A, for handling forecasting problems. However, in the real world, an event can be affected by many factors. For example, the temperature can be affected by the wind, the sunshine duration, the cloud density, the atmospheric pressure, and so on; if only one of these factors is used to forecast the temperature, the forecasting results may lack accuracy. Better forecasting results can be obtained by considering more factors. In [6-9], the researchers use only one-factor fuzzy time series models to deal with forecasting problems. The two algorithms developed in [5] use two factors (i.e., the daily average temperature and the daily cloud density) for temperature prediction. The authors concluded that the forecasting results of Algorithm B* are better than those of Algorithm-A and Algorithm-B.

Acosta and Gerardo [10] presented an artificial neural network (ANN), implemented in a Field Programmable Gate Array (FPGA), that was developed for predicting climate variables in a bounded environment. The ANN makes a climate forecast for a main (knowledge-based) system devoted to the supervision and control of a greenhouse. The main problem to solve in weather forecasting is achieving the ability to predict time series.
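A common way to pose time-series prediction for a learner such as an ANN is to slide a fixed-length window over the series, so that each run of consecutive past values becomes an input and the value that follows becomes the target. The sketch below shows only this data preparation step. The function name make_windows, the window length, and the temperature values are illustrative assumptions and are not taken from [10].

```python
def make_windows(series, window):
    """Turn a series into supervised (input, target) pairs.

    Each input holds `window` consecutive values; its target is the value
    that immediately follows them in the series.
    """
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets


# Hypothetical daily temperatures in deg C
temps = [21.0, 22.5, 23.0, 22.0, 24.5, 25.0]
X, y = make_windows(temps, window=3)

for past_values, next_value in zip(X, y):
    print(past_values, "->", next_value)
```

Any regression model can then be trained on the resulting pairs; a longer window gives the model more history per prediction but leaves fewer training pairs.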
The ANN approach seems attractive for this task from several points of view [11], [12]. There are various ANN architectures capable of learning how temporal series evolve and of predicting future states of these series from past and present information. The authors obtained the best system with a simple, low-cost and flexible ANN architecture using FPGA technology.

Nikhil R. Pal and Srimanta Pal [13] examined the effectiveness of multilayer perceptron networks (MLPs) for predicting maximum and minimum temperatures from past observations of various atmospheric parameters. To capture the seasonality of atmospheric data and thereby improve prediction accuracy, the authors proposed a novel neural architecture that combines a Self-Organizing Feature Map (SOFM) and MLPs into a hybrid network, named SOFM-MLP, with better performance. They also demonstrated that using appropriate features, such as the temperature gradient, can not only reduce the number of features drastically but also improve prediction accuracy. Based on these observations, they used a Feature Selection MLP (FSMLP) instead of an MLP. Combining FSMLP and SOFM-MLP results in a network system that uses only very few inputs yet produces good predictions.

L. L. Lai [14] described a new methodology for short-term temperature and rainfall forecasting over the east coast of China, based on some necessary data preprocessing techniques and Dynamic Weighted Time-Delay Neural Networks (DWTDNN), in which each neuron in the input layer is scaled by a weighting function that captures the temporal dynamics of the biological task. This network is a simplified version of the focused gamma network and an extension of the TDNN, as it incorporates a priori knowledge available about the task into the network architecture.
The forecast results are evaluated approximately on the basis of this architecture.

Satyendra Nath Mandal [15] noted that a soft computing model is generally composed of fuzzy logic, neural networks, genetic algorithms, etc. Most of the time, these three components are combined in different ways to form models such as the fuzzy-neuro model, the neuro-genetic algorithm model, and the fuzzy-neuro-GA model. All these combinations are widely used in the prediction of time series data. The author's soft computing models, using a neural network based on fuzzy input and a genetic algorithm, were tested on the same data, and a suitable model for climate prediction was selected on the basis of error analysis (calculation of the average error).

Aravind Sharma [16] proposed a new technique called the Adaptive Forecasting Model. It represents a new approach in which data interpretation is performed with soft computing techniques. It is used to predict meteorological conditions on the basis of measurements made by a purpose-designed weather system. The model helps in forecasting different weather conditions such as rain and thunderstorms, sunshine and dry days, and possibly cloudy weather; that is, the purpose of the model is to provide a warning system for likely adverse conditions using sensors. Data recording at 4 samples per second [17] was adequate to see minute changes in atmospheric pressure and temperature trends. Sampling at one-minute intervals might have been sufficient, as atmospheric conditions do not change very fast. In bad weather, atmospheric conditions can change faster at some places; the instrument used for data recording therefore did not miss any such signature, and no abrupt changes were found.

S. Kotsiantis [18] investigated the efficiency of data mining techniques in estimating minimum, maximum and mean temperature values.
To this end, a number of experiments were conducted with well-known regression algorithms using real temperature data of a city. The algorithms' performance was evaluated using standard statistical indicators, such as the correlation coefficient and the root mean squared error. Using this approach, they found that the regression algorithms could enable experts to predict minimum, maximum and average temperature values with satisfactory accuracy, using the temperatures of previous years as input.

Y. Radhika and M. Shashi [19] proposed an application of Support Vector Machines (SVMs) for weather prediction. Time series data of the daily maximum temperature at a location
is analyzed to predict the maximum temperature of the next day at that location, based on the daily maximum temperatures for a span of the previous n days, referred to as the order of the input. The performance of the SVM was compared with that of an MLP for different orders. The results show that the SVM performs better than the MLP trained with the back-propagation algorithm for all orders. It was also observed that parameter selection for the SVM has a significant effect on the performance of the model.

Yufu Zhang [20] presented a statistical methodology for predicting the actual temperature for a given sensor reading. The authors present two techniques: single-sensor prediction and multi-sensor prediction. The experimental results indicate that their methods can increase the estimation accuracy of sensor temperatures by up to 67% compared with ignoring the error in sensor readings. The authors also found that exploiting the correlation between different sensors results in better thermal estimates than ignoring it and estimating each sensor temperature individually. Both the single-sensor and multi-sensor cases are investigated with different strategies for exploiting the correlation information. Optimal and heuristic estimation schemes are proposed to address the problem when the underlying sensor noise is Gaussian and non-Gaussian.

Ivan Simeonov [21] explained the algorithmic realization of a system for short-term weather forecasting, which performs acquisition, processing and visualization of information related to the parameters temperature, atmospheric pressure, humidity, wind speed and wind direction. Weather forecasting methods include 1) the persistence method, 2) the trends method, 3) the climatology method, 4) the analog method and 5) the numerical weather prediction method [22]. Based on these methods, the authors create a new system for short-term weather forecasting.
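Two of the listed methods are simple enough to sketch directly, and they are often used as baselines against which learned models are compared. The persistence method assumes that the next value equals the most recent observation, and the climatology method forecasts the long-term average for the same calendar day. The function names and all temperature values below are hypothetical; this is a minimal sketch of the general ideas, not the system described in [21].

```python
from statistics import mean


def persistence_forecast(observations):
    """Persistence: forecast the next value as the most recent observation."""
    return observations[-1]


def climatology_forecast(history_by_year, day_index):
    """Climatology: forecast a day as the mean of that same day in past years."""
    return mean(year[day_index] for year in history_by_year)


# Hypothetical daily maximum temperatures (deg C) for the same
# four-day period in three past years.
history_by_year = [
    [30.0, 31.0, 29.0, 28.0],
    [32.0, 30.0, 30.0, 29.0],
    [31.0, 32.0, 31.0, 30.0],
]
# Hypothetical observations for the first three days of the current year.
this_year_so_far = [33.0, 32.0, 31.0]

# Forecasts for day 4 (index 3) of the current year.
print(persistence_forecast(this_year_so_far))
print(climatology_forecast(history_by_year, 3))
```

The trends, analog and numerical weather prediction methods need additional information (the movement of weather systems, a library of past patterns, or a physical model) and are not shown here.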
The algorithm for short-term weather forecasting is based on the common and special features of known weather forecasting methods and on some surface features at ground level.

Zahoor et al. [23] developed a system to predict climate change. The impact of seasonal to inter-annual climate prediction on society, business, agriculture and almost all aspects of human life forces scientists to give proper attention to the matter. The last few years have seen tremendous achievements in this field. All systems and techniques developed so far use the Sea Surface Temperature (SST) as the main factor, among other seasonal climatic attributes. Statistical and mathematical models are then used for further climate predictions. In their paper, they developed a system that uses the historical weather data of a region (rain, wind speed, dew point, temperature, etc.) and applies the K-Nearest Neighbor (KNN) data mining algorithm to classify these historical data into specific time spans. The k nearest time spans (k nearest neighbors) are then used to predict the weather. Their experiments show that the system generates accurate results within reasonable time for months in advance.

Wang et al. [24] put forth a technique for predicting climate change using a Support Vector Machine (SVM). The climate model is a critical factor for agriculture. However, climate variables, which are strongly corrupted by noise and fluctuations, follow a complicated process and cannot be reconstructed by a common method. In their paper, they adapted the SVM to predict them. Specifically, they incorporated the initial conditions of the climate variables into the training of the SVM. The numerical results show the effectiveness and efficiency of the approach in predicting variations in the climate from the initial conditions.

Shikoun et al. [25] described an approach for climate change prediction using artificial neural networks.
Great progress has been made in the effort to understand and predict El Niño, the uncharacteristic warming of the sea surface temperature (SST) along the equator off the coast of South America, which has a strong impact on climate change around the world. Advances in climate prediction will result in considerably enhanced economic opportunities, predominantly for the national agriculture, fishing, forestry and energy sectors, as well as social benefits. Their paper presents monthly prediction of the El Niño phenomenon using artificial neural networks (ANNs). The procedure addresses the preprocessing of input data, the definition of the model architecture and the strategy of the learning process. The principal result of their paper is the identification of the best model architecture for long-term prediction of climate change. An error model was also developed to improve the results.

III. FUTURE DIRECTIONS
Weather plays an important role in many areas, such as agriculture. In the near future, more sophisticated techniques can be tailored to address complex problems in climate prediction and hence provide better results. In this study we found that neural network based algorithms perform well compared with other techniques. To improve the performance of the neural network algorithms, statistical feature selection techniques can be incorporated. In another direction, fuzzy techniques can be incorporated.

IV. CONCLUSION
This section summarizes the main conclusions and contributions of the work. In our opinion, there is a lot of work still to be done in this emerging and interesting research field. In recent years, more intelligent weather forecasting based on Artificial Neural Networks (ANNs) has been developed. This paper surveys the methodologies used over the past decade for climate prediction. In particular, it presents some of the most extensively used data mining techniques for climate prediction. Data mining techniques provide a level of confidence in the predicted solutions in terms of the consistency of prediction and the frequency of correct predictions. In this study we found that neural network based algorithms can provide better performance than other techniques. Furthermore, to improve the performance of the neural network algorithms, statistical feature selection techniques can be integrated. In another direction, fuzzy techniques can be incorporated to achieve better predictability.