COMPARING MULTIVARIATE REGRESSION AND ARTIFICIAL NEURAL NETWORKS TO MODEL SOLAR STILL PRODUCTION

Noe Santos, E.I.
University of Nevada, Las Vegas
4505 Maryland Pkwy, Box 454015
Las Vegas, NV 89154-4015
e-mail: Noe.Santos87@gmail.com

Aly Said, PhD
University of Nevada, Las Vegas
4505 Maryland Pkwy, Box 454015
Las Vegas, NV 89154-4015
e-mail: Aly.Said@unlv.edu

Dave James, PhD, P.E.
University of Nevada, Las Vegas
4505 Maryland Pkwy, Box 451099
Las Vegas, NV 89154-1099
e-mail: Dave.James@unlv.edu

Nanda Venkatesh, M.S.
University of Nevada, Las Vegas
4505 Maryland Pkwy, Box 454015
Las Vegas, NV 89154-4015
e-mail: Nandavenkatesh@gmail.com

ABSTRACT

A study has been performed to predict solar still performance using data originally gathered between February 2006 and August 2007. The purpose of this study was to determine the viability of modeling distillate production using local weather data with artificial neural networks (ANNs) and multivariate regression (MVR). This study used weather variables which were hypothesized to affect still performance. Insolation, wind velocity, wind direction, cloud cover, and ambient temperature were the main weather variables used as input data, along with the operating distilland volume. The objectives of this study were to determine the minimum number of inputs required to accurately model solar still performance and to examine which type of model performed best.

1. INTRODUCTION

With the rising cost and limited supply of traditional fossil fuels, both water transportation costs and distillation processes such as multistage flash, multiple effect, vapor compression, reverse osmosis, electrolysis, phase change, and solvent extraction will see their price per unit of water increase drastically. Solar distillation is a simple and clean technology which can be used to distill brackish or polluted water into drinkable water and can be used to reduce the fossil fuel dependence that presently exists at distillation plants. Being able to predict solar still performance from long-term solar irradiance, air temperature, wind speed, wind direction, and cloud cover data, while taking into account meteorological variations, would facilitate the sizing of solar still installations to meet planned water demands.

Solar stills are a low cost alternative to energy intensive water purification methods (1). Due to relatively low energy fluxes from the sun, solar stills require a relatively large amount of space compared to other technologies. With the current state of the technology producing one to seven liters per square meter of still area per day, a community requiring 200 m3/day of drinking water (2) would require roughly 20 to 3 hectares of still area, respectively. In order to meet the daily demands of a rural community, the performance of distillation technology must be known for the local environment throughout the year. To meet this demand for accurate production modeling, Artificial Neural Networks (ANNs) can produce results that are not easily obtained with classical modeling techniques. This study examines the effectiveness of ANNs with regard to incorporating weather data to predict solar still production.
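The still-area estimate quoted in the introduction can be checked with a short calculation. The sketch below is illustrative only; the function name is hypothetical, and it assumes the still yield is constant over the whole installation.

```python
def required_still_area_ha(demand_m3_per_day, yield_l_per_m2_day):
    """Still area in hectares needed to meet a daily water demand."""
    demand_l = demand_m3_per_day * 1000.0    # 1 m3 = 1000 L
    area_m2 = demand_l / yield_l_per_m2_day  # basin area needed, in m2
    return area_m2 / 10_000.0                # 1 ha = 10,000 m2

# Demand of 200 m3/day at the low and high ends of the quoted yield range
for still_yield in (1.0, 7.0):
    area = required_still_area_ha(200.0, still_yield)
    print(f"{still_yield:.0f} L/m2/day -> {area:.1f} ha")
```

A yield of 1 L/m2/day corresponds to the upper end of the area range and 7 L/m2/day to the lower end.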

2. BACKGROUND

Solar still research has been documented as having its roots as far back as the fourth century B.C., when Aristotle described a method to evaporate and condense polluted water for potable use (2). Figure 1 illustrates an example of a basic solar still model.

Fig. 1: Basic solar still design concept (15). Labels: sun, radiation, reflected rays, cover glass, evaporation, condensation, raw water, distillate collection trough.

Most of the present research for solar stills has studied the effects of modifying solar still design or introducing new components to increase the evaporation rate of the distilland. Some of the modifications include using internal/external condensers (3-4), using black walls with cotton cloth (1), the use of black dye and charcoal in the distilland (5), multi-wick solar stills (6), and condensing cover cooling techniques (7).

Past research studies to accurately model solar still performance include iteration methods (8), numerical methods (9), and computer simulation (10). Despite the various analysis techniques, all of these techniques rely on internal heat and mass transfer (HMT) models, which were first published by Dunkle (11) in the 1960s and revisited by other researchers such as Tiwari and Tiwari (12). The HMT models (11-12) rely on a multitude of variables such as water density, viscosity, specific heat, thermal conductivity, latent heat of vaporization, partial saturated vapor pressure, and derived heat and mass transfer coefficients. The coefficients are derived by processing large amounts of real-time data of various solar still thermal characteristics such as the outer/inner glass temperature, distilland temperature, vapor temperature, internal solar still humidity, thermal conductivity of the still’s walls and glass cover, ambient air temperature, ambient air velocity, total and diffused radiation, and the distillate production.

While the HMT models may have been successfully utilized in past studies, the ability to forecast distillate production is limited by the ability to record the data and convert the information into heat and mass transfer coefficients. Because of the large amount of real time data required to evaluate the HMT models, the amount of data processing and required equipment would make running such models out of reach for rural communities. This study aims to replace traditional HMT modeling, which uses multiple thermal measurements recorded at sub hourly intervals, with ANNs that rely on easily obtainable weather data. The models that are created rely entirely on the weather data and distilland volume as the input data and the total daily distillate production as the target variable. Furthermore, the comparison of the ANN model with a simple multivariate regression model (MVR) will be performed to determine each method’s strengths and weaknesses.

ARTIFICIAL NEURAL NETWORKS

Multi-layer Perceptron (MLP) networks in ANNs have been used in the past for engineering applications due to their ability to use non-linear transformations and to learn patterns of behavior between inputs and outputs (13). ANNs derive their predictive power from their parallel structure and their ability to learn and generalize. The power of generalization allows for the prediction of reasonable outputs given inputs that were not originally included in the training data (13). One of the advantages to using ANNs over multivariate regression is the ability of ANNs to take into account the total interaction between the input variables (14).

The MLP network consists of input, output, and multiple hidden layers with each layer having many hidden nodes (neurons). Within each layer, there are units which can be partially or fully connected to units in consecutive layers. The hidden layers’ neurons connect the input and target layers by using a specified training function (13). The weights for each connection are at first random and represent the predicted relationship due to patterns. After the original ANN is designed, the ANN is trained in order to optimize the assigned weight for each connection until no change in the weights is detected. The training process results in the minimization of errors between the actual and the predicted target variables. The output from the original layer is transferred to consecutive layers until reaching the final output layer, which represents the complete response of the ANN to the input data’s patterns and trends. The effectiveness of the networks is highly dependent on the size of the input and target data sets.
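The layered structure described above can be sketched as a forward pass through a small MLP. This is a minimal illustration, not the network used in the study: the sigmoid activation, the three hidden neurons, the weight range, and all function names are assumptions, and no training step is shown.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random starting weights: one row per neuron, last entry is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(weights, inputs):
    """Weighted sum of the inputs plus bias for each neuron, then a sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in weights]

def mlp_forward(hidden_w, output_w, inputs):
    """Pass the inputs through the hidden layer and then the output layer."""
    hidden = layer_forward(hidden_w, inputs)
    return layer_forward(output_w, hidden)

rng = random.Random(0)             # fixed seed so the sketch is repeatable
hidden_w = init_layer(4, 3, rng)   # 4 normalized inputs -> 3 hidden neurons
output_w = init_layer(3, 1, rng)   # 3 hidden neurons -> 1 output
prediction = mlp_forward(hidden_w, output_w, [0.8, 0.6, 0.5, 0.2])
print(prediction)
```

Training would then adjust `hidden_w` and `output_w` to reduce the error between predicted and measured production; that step is omitted here.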

3. MATERIALS AND METHODS

The original solar still experimental data was collected between February 2006 and August 2007 using single basin solar stills from two different manufacturers (15). The experimental site was located on the roof of the Howard R. Hughes College of Engineering building at the University of Nevada, Las Vegas (36.11° N, 115.14° W).

Solar still production data presented in this study was gathered from two solar stills known as still “A” and “B.” Solar still “A” has a rectangular basin with an area of 0.976 m2 and a body composed of sheet aluminum with 2.5 cm thick polyisocyanurate foam board, coated with FDA approved silicone sealant in layers with un-bonded glass fiber cloth for insulation (15). Still “A” has two inlets for filling and drainage and has a glass cover slope of 2°. Solar still “B” has a rectangular basin area of 0.767 m2 and a fiberglass exterior with foamed in place insulation. The sealing is in the form of a u-channel molding which wraps around the perimeter of the still, clamping the glass against the d-section seal bonded to the fiberglass box (15). Still “B” has two inlets for filling and drainage and has a glass cover slope of 9°.

The original distillate production data were collected in a series of several experiments which varied the solar still type, cover glass, and distilland volume. Daily distillate production data was collected by directly measuring the volume of produced distillate at approximately 0800 hours for the duration of the study. There was a brief break in still operation for still “A” between July and November 2006.

The hourly weather data were obtained from the U.S. National Weather Service (NWS) station at McCarran International Airport, which is located 1.5 miles south of the test site. The hourly data were processed to generate daily average values that were used as input for the ANNs. The average daily wind direction was modified to range between 0 and +/- 180 degrees from north. This modification was necessary to prevent a sudden numerical change when the prevailing wind direction changed between northwesterly and northeasterly. The daily production and distillate volume data was compiled with the weather data for each respective day.

In order to reduce the number of scenarios needed to model all cases of distilland volume and glass cover type, the original data set was merged regardless of distilland volume and glass cover type. This was done only once it was found that there was no significant difference in distillate production due to the glass cover manufacturer. Furthermore, the variation in distilland volume would improve the training process and create better results for the ANN model.

Figure 2 shows the long term distillate production for still “B.” Although there is a fairly strong correlation over time between production and insolation, there are quite a few cases where solar still distillate production varies considerably for the same amount of insolation. This variation is hypothesized to occur due to the effects that other weather variables have on still performance.

Fig 2: Solar still “B” long term production (15). Axes: Date of Operation; Daily Distillate Production (L/m2); Daily Total Insolation (J/m2).

4. ANN MODELING

Figure 3 illustrates an example ANN architecture composed of one input, one hidden, and one output layer. The input layer consists of the weather data and distilland volume while the output layer consists of the daily distillate production. Prior to the training process, all of the input and target variables were normalized between 0 and 1 by dividing each record by the maximum value respective to each variable. The normalization accelerates the training process and is able to enhance the generalization capabilities. 80% of each data set was used for training purposes and the remaining 20% was used to test the predictive capabilities after each training trial. The training process was repeated with different combinations of weather variables until the best and most efficient combination was discovered.
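The preprocessing steps described above (wind direction wrapping, normalization by the maximum, and the 80/20 split) can be sketched as follows. The function names are hypothetical, the sign convention for the wrapped direction (bearings west of north negative) is an assumption, and the random shuffle before splitting is also an assumption, since the text does not say how the 80% training portion was selected.

```python
import random

def wrap_wind_direction(deg_from_north):
    """Map a compass bearing (0-360) onto -180..+180 degrees from north."""
    wrapped = deg_from_north % 360.0
    return wrapped - 360.0 if wrapped > 180.0 else wrapped

def max_normalize(values):
    """Divide every record by the largest value of that variable."""
    peak = max(values)
    return [v / peak for v in values]

def split_80_20(records, seed=0):
    """Shuffle a copy of the records and split it 80% / 20%."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# A northwesterly (350) and a northeasterly (10) bearing end up 20 degrees
# apart after wrapping instead of 340 degrees apart.
print(wrap_wind_direction(350.0), wrap_wind_direction(10.0))
print(max_normalize([2.0, 4.0, 8.0]))
train, test = split_80_20(list(range(10)))
print(len(train), len(test))
```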

Fig 3: Example ANN architecture

5. RESULTS AND DISCUSSION

The ANN and MVR methods were run for the same six combinations of input variables. Modeling trials were started with insolation as the only input given the broad correlation observed in Figure 2. The trials then continued on with the remaining weather variables (I = insolation, T = temperature, V = distilland volume, W = wind speed, D = wind direction, and C = cloud cover). The number of inputs was increased one at a time until the full combination of inputs was achieved.

The criterion for evaluating the results from the ANN and MVR models was the percentage of model predictions that were within 20% of the actual daily distillate production. Furthermore, the coefficient of determination, R2, was also computed to determine the percentage of variance explained by each model. The results for both models are shown in Tables 1 and 2.

TABLE 1: R2 VALUES AND ANN MODEL RESULTS FOR STILL “A” AND “B” (rows: model inputs I, IT, ITV, ITVW, ITVWD, ITVWDC; columns: R2 and percent of predictions within 20% for each still)

TABLE 2: R2 VALUES AND MVR MODEL RESULTS FOR STILL “A” AND “B” (rows: model inputs I, IT, ITV, ITVW, ITVWD, ITVWDC; columns: R2 and percent of predictions within 20% for each still)

It can be seen from Table 1 that the ANN model for still “A” has an optimum performance with the ITVW and ITVWDC architectures, with 89.2% of predicted results being within 20% of the actual value and with 93.7% of the variance being explained. For future applications with still “A,” the ITVW scenario can be used as the simplest ANN architecture to achieve optimum results. Table 1 shows still “B” having an optimum performance with the ITV architecture, with 88.7% of predicted results being within 20% of the actual value and with 96.9% of the variance being explained. The ITVW, ITVWD, and ITVWDC architectures had higher coefficients of determination, but had fewer predictions falling within 20% of the actual distillate production.

Table 2 shows the MVR model exhibited the best performance with the ITVWDC inputs for still “A,” with 84.1% of predictions being within 20% of the actual value and with 91.7% of the variance being explained. For still “B,” the MVR model exhibited the best performance with the ITVW model, with 84.6% of predictions being within 20% of the actual value and with 95.2% of the variance being explained.

Comparison of results in Tables 1 and 2 shows that the ANN model performed better by having an extra 5.1% and 4.1% of the optimum predictions being within 20% of the actual value for stills “A” and “B” respectively. Furthermore, the coefficient of determination was 2.0% and 2.1% higher for the best performing ANN model when compared to the MVR model.

Another advantage to using ANNs over MVR is the ability to produce better results with fewer input variables. From Table 1, still “A” and “B” obtain their maximum results with the ITVW and ITV architecture, respectively. Meanwhile, the MVR model reaches its maximum performance for still “A” and “B” with the ITVWDC and ITVW models, respectively. The need for extra input variables to achieve better results translates into more time and energy needed to process the data. The MVR model requires the user to make assumptions about the linear or non-linear relationships between input and output variables. These assumptions are not necessary for the ANN model. In summary, the ANN model is capable of generating better predictions while also requiring fewer inputs.

Figures 4 - 7 compare predicted and measured distillate yields over time for both solar stills for the best ANN and the best MVR models that were identified in Tables 1 and 2. Figures 4 and 5 show that both model prediction sets track solar still “A”’s distillate measurement fairly well but are unable to predict the anomalously high and low distillate values that were observed in July 2007. Figures 6 and 7 also show fairly good agreement between both model predictions and measurements for still “B.” Still “B”’s dataset does not extend through to July 2007, so it is not known if still “B” would have produced the July 2007 anomalies observed for still “A.”

Fig 4: Still “A” MVR ITVWDC model (Daily Production (L/m2); ITVWDC Actual, ITVWDC Predicted)

Fig 5: Still “A” ANN ITVW model (Daily Production (L/m2); ITVW Actual, ITVW Predicted)

Fig 6: Still “B” MVR ITVW model (Daily Production (L/m2); ITVW Actual, ITVW Predicted)

Fig 7: Still “B” ANN ITV model (Daily Production (L/m2); ITV Actual, ITV Predicted)
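The two evaluation measures used in this section can be sketched in a few lines. The function names and the sample values are hypothetical and are not data from the study. The R2 form used here, one minus the ratio of residual to total sum of squares, is an assumption, since the text does not state how R2 was computed.

```python
def percent_within(actual, predicted, tolerance=0.20):
    """Percentage of predictions within +/- tolerance of the actual value."""
    hits = sum(1 for a, p in zip(actual, predicted)
               if abs(p - a) <= tolerance * abs(a))
    return 100.0 * hits / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Illustrative daily production values in L/m2 (not measured data)
actual = [4.0, 5.0, 6.0, 2.0]
predicted = [4.4, 5.0, 4.5, 2.5]
print(percent_within(actual, predicted))
print(r_squared(actual, predicted))
```

The two measures can disagree, which is consistent with the text's observation that some architectures had a higher R2 but fewer predictions within 20% of the actual production.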

6. CONCLUSIONS

Artificial Neural Network models (ANNs) generally produced more accurate results than the multivariate regression models (MVRs) in four out of the six cases (either better R2 or higher proportion of results within 20% of actual production) for solar still “A” and also produced the best results using fewer input variables than MVR. The ANN models generated a maximum of 89.2% of predicted results within 20% of the actual production for still “A” with an R2 of 0.937. MVR generated a maximum of 84.1% of predicted results being within 20% of actual production with an R2 of 0.917.

For solar still “B,” the ANN model generated higher R2 values for all six input cases and a higher percentage of predictions within 20% of actual production for one of the six cases. The ANN model generated a maximum of 88.7% of predicted results within 20% of actual production with an R2 of 0.969. MVR generated a maximum of 84.6% of predicted results within 20% of actual production with an R2 of 0.952.

7. REFERENCES

(1) Fath, H., Solar distillation: A Promising Alternative for Water Provision With Free Energy, Simple Technology, and a Clean Environment. Desalination, 1998.
(2) Tiwari, G., Singh, H., and Tripathi, R., Present Status of Solar Distillation. Solar Energy, 2003.
(3) Fath, H., and Elsherbiny, S., Effect of Adding a Passive Condenser on Solar Still Performance. Energy Conversion and Management, 1993.
(4) Fath, H., Elsherbiny, S., and Ghazy, A., A Naturally Circulated Humidifying/Dehumidifying Solar Still With a Built-In Passive Condenser. Desalination, 2004.
(5) Tiwari, G., Gupta, S., and Lawrence, S., Transient Analysis of Solar Stills in the Presence of Dye. Energy Conversion and Management, 1989.
(6) Sodha, M., Kumar, A., Tiwari, G., and Tyagi, R., Simple Multiple Wick Solar Still: Analysis and Performance. Solar Energy, 1981.
(7) Abu-Hijleh, B., Enhanced Solar Still Performance Using Water Film Cooling of the Glass Cover. Desalination, 1996.
(8) Toure, S., and Meukam, P., A Numerical Model and Experimental Investigation for a Solar Still in Climactic Conditions in Abidjan (Cote d’Ivoire). Journal of Renewable Energy, 1997.
(9) Lof, G., Eibling, J., and Blowemer, J., Energy Balances in Solar Distillation. American Institute of Chemical Engineers Journal, 1961.
(10) Cooper, P., Digital Simulation of Transient Solar Still Processes. Solar Energy, 1969.
(11) Dunkle, R., Solar Water Distillation, the Roof Type Still and Multiple Effect Diffusion Still. International Developments in Heat Transfer, ASME, Proc. International Heat Transfer, 1961.
(12) Tiwari, A., and Tiwari, G., Effect of Water Depths on Heat and Mass Transfer in a Passive Solar Still: In Summer Climactic Condition. Desalination, 2006.
(13) Haykin, S., Neural Networks: A Comprehensive Foundation. Macmillan, New York, 1994.
(14) Rumelhart, D., Hinton, G., and Williams, R., Learning Internal Representations by Error Propagation. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1, MIT Press, Cambridge, 1986.
(15) Venkatesh, N., Performance Evaluation of Single and Double Basin Solar Stills in Las Vegas, NV. M.S. Dissertation, University of Nevada, Las Vegas, 2007.
