A Systematic Fuzzy Decision-Making Process to Choose the Best Model Among a Set of Competing Models

Hamed Shakouri G. and Mohammad B. Menhaj

Abstract: The assessment of a theory is the main objective of scientists. Theories are always introduced by models, and model selection is applied in many fields of scientific study in order to corroborate or verify a theory as the winning one among a set of competing hypotheses. Different criteria, of both statistical and visual types, are taken as bases to select one model among several parallel models. This paper proposes a new method of model selection based on the solution of a fuzzy decision-making problem. The method enables us to apply validation criteria systematically by defining a proper possibility distribution function (PDF) for each criterion. The generality of the method allows us to consider even intuitive, inaccurate, or linguistic criteria. Finally, the maximization of a utility function, rationally composed of the PDFs, determines the best choice among the competing models. The method is illustrated by two sets of linear and nonlinear parallel models.

Index Terms: Fuzzy decision making (FDM), model selection, model validation, multicriteria model selection.

I. INTRODUCTION

There is no doubt that the growth of science is indebted to theories that have been challenging throughout the history of science. Whatever the theories are, there are always some statements describing the reality that is not univocally determined. Therefore, they should be somehow verified, i.e., "we have to make some type of decision in order to prefer one statement (theory/model) against the others" [1].
This is the preferentialist approach.

Mathematical modeling and experimental modeling are the most common tools for a scientist to study phenomena [2], [3]. The modeling process usually contains several stages, including: 1) specification of factors affecting the phenomenon; 2) nominating some mathematical interrelations; 3) parameter estimation based on the experimental data; and finally, 4) validation of the candidate model(s). Various models may be adjusted to simulate or predict properly the same phenomenon, and each may be validated to some extent. In addition, there are many efforts in this field, and different methods have been proposed to determine the winner model among several competing ones, which may be referred to as scientific productivity [4].

Manuscript received June 14, 2006; revised April 14, 2007, September 30, 2007, and December 24, 2007. H. Shakouri G. is with the Industrial Engineering Department, Faculty of Engineering, University of Tehran, Tehran, Iran. M. B. Menhaj is with the Electrical Engineering Faculty, Amirkabir University of Technology, Tehran, Iran. Digital Object Identifier 10.1109/TSMCA.2008.2001075

Almost all solutions to the model selection problem are obtained by consideration of a certain criterion, out of a set of various criteria, usually based on probability theory. Whether the phenomenon is an economic one or raised by an engineering system, there is no difference. Furthermore, in computer science and machine learning, where inductive logic programming has an important role, improving model validation and selection techniques is a necessity [5]. A class of model selection methods can be found in the literature on support vector machines, a set of which is listed in [6].

However, there are always many properties for the assessment of a model.
The well-known explanatory criterion ($R^2$), Kullback's information criterion (KIC), Akaike's final prediction error (FPE), Grasa's average variance extracted (AVE) criterion, etc., are the most cited criteria among so many others introduced by pioneer researchers. One may add both the predictive model information criteria of Sei [7] and the entropic model selection proposed by Tseng [8] as examples among many recent methods, which may be listed in a set of criteria measuring the accuracy level of models. There are other criteria based on the characteristics of the estimated parameters, e.g., t-student statistics, and some based on the residuals' properties, e.g., normality test criteria [9]-[12]. These criteria are called statistical criteria.

Conversely, many other criteria are considered as visual or intuitive criteria, including visual goodness of fit (VGF) [13], relative stability and suitable time response, overall significance of parameters, appropriate dynamic properties [14], and so on. Such intuitive concepts help the designer to possess a particular perception based on his/her prior knowledge about the system under study.

All the well-known model selection methods in the literature [15] (e.g., the Wilks' method [2] or the likelihood ratio method of Engle [1], etc.), mostly based on a probabilistic viewpoint, are not sophisticated enough to accommodate several criteria of different types simultaneously. There is no known unified method that accepts various kinds of criteria in order to choose the best model. Usually, the model designer is the one who performs the task of model selection, and often he/she prefers to select the best one according to a certain (statistical) criterion. Introducing a tradeoff between probability and possibility theories, the proposed method provides a suitable tool to carry out model selection via fuzzy decision making (FDM) according to a user-defined fuzzy rule base. This idea comes from the fact that, usually, a mixture of all criteria concerning the underlying subject convinces a human being to accept a theory. The advantage of this method is exposed more in the presence of relative closeness within the competing models, when intuitive selection among alternatives causes obvious difficulties.

1083-4427/$25.00 © 2008 IEEE

Fig. 1. Selection process to select the best model among $N_m$ parallel models.

The method was first introduced briefly in [16], and then, similar approaches were applied in a few cases for specific applications [17]. This paper attempts to systematize the solution of a multicriteria model selection problem, as well as to classify all statistical and other intuitive or visual criteria.

The rest of this paper is organized as follows. Section II explains the problem of model selection. The application of FDM to this problem is introduced in detail in Section III. Section IV illustrates the proposed method via two examples. Finally, Section V concludes this paper.

II. PROBLEM STATEMENT

In general, the process of model selection may be schematically represented as in Fig. 1. According to this figure, several models, known as parallel models, may describe a common phenomenon. We should note that one criterion is properly determined in advance as a basis to select the best model, called the winner.

When a set of different criteria is to be integrated simultaneously in order to determine the best model, the following question arises: "Does a unified method that includes all required characteristics exist?" To explain this, suppose that several models are suggested for a given system and the corresponding FPE, $R^2$, the covariance matrix $P_\theta$, t-statistics, etc., are calculated. Now, it is requested to attain the best model. It means that the selected model is supposed to be the most suitable model that mimics the real phenomenon. For example, one model may be a good candidate based on residual characteristics but possesses weak properties as far as the parameters' validity is concerned.
On the contrary, another one may show the opposite. How can we choose one of them as a fair, acceptable, or good model, and what is the proper measure to clarify this concept?

This paper discusses an alternative process to replace the selection stage in Fig. 1, which is represented by the FDM block shown in Fig. 2. This solution, which will be explained in detail through the remaining parts of this paper, may be briefly stated as follows. First, a set of the most important criteria is listed. Second, both conjunctive and disjunctive relations among the listed criteria are determined. Afterward, a proper possibility distribution function (PDF) is assigned to each criterion, and a utility function is defined as well. Maximizing the utility function leads to the solution of the model selection problem. One of the most important advantages of the proposed method is its ease of applicability to the nonlinear parallel model selection problem.

Fig. 2. Application of the FDM scheme to the model selection problem.

III. FDM FOR MODEL SELECTION

Suppose that there are $N_m$ candidate parallel models to describe one specific phenomenon (system), namely, $M_i$: $i = 1, \ldots, N_m$. Each model, which synthesizes or predicts the variations of a certain variable in the system, is represented by the following [11]:

$$\hat{y}_i(t \mid t-k;\ \hat{\theta}_i) = f_i(Z^N, t;\ \hat{\theta}_i) \tag{1}$$

where $Z^N$ is the set of data sampled at $N$ time points, by which the model is identified, and the vector $\hat{\theta}_i$ contains $d_i$ estimated parameters associated with the $i$th model.
Note that $Z^N$ includes the measurements on the modeled output variable $y(t)$, which is predicted as $\hat{y}_i$ by each model. An error term $\varepsilon(t)$, the so-called residual, includes all uncertainties in the model and causes deviations of the model from the measured output data. This term is defined for each model as

$$\varepsilon_i(t) = y(t) - \hat{y}_i(t \mid t-k;\ \hat{\theta}_i) \tag{2}$$

where $\hat{y}_i(t \mid t-k)$ is the model output estimation at time $t$, given the data available up to time $t-k$. To calculate the prediction error, $k$ is set to one, implying that all available information, i.e., all previous data samples, is in use. For the case of pure simulation, $k$ is set to $t$; to be more specific, it, in fact, tends to infinity [11].

Furthermore, suppose that the accuracy, performance, and other miscellaneous static and dynamic properties are the utilities quantified by a set of $N_c$ validity criteria $c_j(i)$: $j = 1, \ldots, N_c$; $i = 1, \ldots, N_m$. Then, let the collected criteria values for each model $M_i$ be defined in a set

$$V_i = \{c_1(i), c_2(i), \ldots, c_{N_c}(i)\}. \tag{3}$$

The decision-making problem is now represented by a constrained optimization of a utility function, formulated according to [18]

$$\max_x\ U(x) \quad \text{s.t.}\quad g(x) \ge 0 \tag{4}$$

where $x$ is a vector function of the $V_i$'s, i.e., $x = x(V_i)$, and $g(x)$ refers to a set of restrictions on the criteria, which will be discussed later in this paper.

Fig. 3. Classification of criteria to construct a rule base for the model selection procedure.

In general, in human decisions, none of the utility functions is known to be satisfied exactly. Thus, it is preferred to apply fuzzy concepts properly to construct the utility function. In this case, all functions in (4) will be fuzzy functions of their arguments. Therefore, the reformulation of (4) can be done via three distinguished steps, which are explained throughout the following sections.

A. Steps to Construct the Model Selection Process

At this point, a systematically designed procedure is introduced to establish the aforementioned decision-making problem.
The procedure is explained within four steps.

Step 1: Collection and Classification of Criteria: First, a collection of criteria, which is important from the particular viewpoint of the model designer, should be specified. If the model is concerned with a physical phenomenon, there may be some particular criteria of interest, whereas for a humanistic problem, some other important criteria are collected.

Moreover, the set of criteria should be organized according to the different classes to which they belong. For example, all criteria may be classified into four classes according to the following:
1) size of an error term, like the error in the objective function of parameter estimation, related to the ability of the model to explain historical experiments via pure simulation or full information prediction;
2) time domain and/or frequency domain characteristics of the residuals, based on presumptions that are vital for the solution of the parameter estimation problem;
3) properties of the parameters satisfying logical relations and mathematical significance criteria;
4) ability of the model to forecast, predict, and/or simulate the future truly.

Fig. 3 shows such a classification of desired criteria to enable logical construction of the rule base needed at this step, where a properly designed combination of AND/OR relations between the rules would build up the rule base.

Some examples of criteria categorized into the four classes are listed:
a) criteria due to the model accuracy and admissibility:
   1) error sum of squares: $ESS = \sum_t \varepsilon^2(t)$;
   2) loss function value: $V = (1/2)\sum_t \ell(\varepsilon(t))$, where $\ell(\cdot)$ is a convex function;
   3) mean percent error: $MPE = 100 \sum_t |\varepsilon(t)| / \sum_t |y(t)| / N$;
   4) root mean squared error: RMSE;
   5) Akaike's information criterion: AIC;
   6) Kullback's information criterion: KIC;
   7) Schwarz Bayesian information criterion: BIC;
   8) Akaike's final prediction error: FPE;
   9) adjusted explanatory power: $R^2$;
   10) fitness of simulation: Fit, which is calculated in the same way as $R^2$, except that the prediction error is replaced by the simulation error;
   11) visual goodness of fit: VGF.
b) characteristics of the error signal (residuals):
   1) Jarque-Bera (JB) or Salmon-Kiefer test of normality: NT;
   2) Ljung-Box independence statistics: LB;
   3) Lagrange multiplier test of autocorrelation: LM;
   4) probability-based measure of normality: PMN;
   5) frequency domain whiteness: FDW.
   The two latter criteria are proposed in this paper and will be defined in the next section.
c) significance of the parameters:
   1) the t-student statistics (t-ratio) due to each parameter and the minimum of the t-student statistics: $t_{min}$;
   2) norm of the covariance matrix of the parameters: $\|P_\theta\|$;
   3) the global T-student statistics: GT, which is formally defined in the next section;
   4) static property of parameters: sign significance;
   5) dynamic property indexes: DPI, which may include overshoot, settling time, dc gain, or any combination of these characteristics that can be considered as an index.
d) ability to forecast:
   1) prediction power: PP;
   2) simulation power: SP.
   These two criteria are also suggested here and will be defined in the next section.

Step 2: Designing a Rule Base: In this step, a proper combination of the criteria should be designed.
Logical relations, conjunctions, and disjunctions (AND/OR) will be used to combine the chosen and categorized criteria within a rule base. Note that we, as the system designers, have of course a limited set of desired properties in our minds, and clearly, we will face difficulty in combining criteria for deciding on the best model; however, the proposed FDM allows consideration of any possible set of rules. For example, the most acceptable decisions to select the best model may employ the following rule base, which is mainly based on the linguistic variables (LVs) highlighted hereafter.

A given model is the best if it has the following:
1) small error or high explanatory criterion;
2) good characteristics for the residuals to admit presumptions of the model, e.g., acceptable normality or high independence;
3) significant parameters by means of their sign or small variance of parameters;
4) high ability to forecast the future.

Such a rule base may be supplemented by some other desired rules; however, this kind of classification depends on the criteria classes introduced in Step 1, and usually, it represents a sufficient base.

Step 3: Possibility Measures: According to the proposed method, the utility function $U(x)$ is a possibility measure, and therefore, it is formed by PDFs associated with each criterion, and $g(x)$ contains restrictions on those PDFs. Thus, in this step, these PDFs have to be defined properly.

Originally, a common formulation may roughly be used to compute the possibility value due to each criterion as

$$\pi(c_j(i)) = \frac{c_j(i) - \min_i c_j(i)}{\max_i c_j(i) - \min_i c_j(i)}, \qquad i = 1, \ldots, N_m;\ j = 1, \ldots, N_c. \tag{5}$$

This is the simplest method of fuzzification, which may not always lead to admissible results, particularly when the criteria values are too congested, i.e., $\max_i c_j(i) - \min_i c_j(i)$ is very small with respect to the average of the $c_j(i)$'s. To apply a more sophisticated technique, the distribution of each criterion is employed.
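As a minimal sketch of the min-max fuzzification in (5), the following Python function rescales the values of one criterion over all competing models to the interval [0, 1]. The function name, the `higher_is_better` flag, and the handling of ties are illustrative assumptions and are not taken from the paper; the sample values are made up.

```python
def min_max_possibility(values, higher_is_better=True):
    """Rescale one criterion's values over all models to [0, 1], as in (5).

    `higher_is_better` is an illustrative assumption, not part of (5):
    when False, the scale is flipped so that the smallest value scores 1.
    """
    lo, hi = min(values), max(values)
    spread = hi - lo
    if spread == 0:
        # (5) is undefined when all models tie; returning 1.0 is an assumption.
        return [1.0 for _ in values]
    scaled = [(v - lo) / spread for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


# Hypothetical R^2 values for three competing models.
r2_values = [0.90, 0.95, 0.99]
print(min_max_possibility(r2_values))

# Congested values: a 0.002 gap is stretched over the whole [0, 1] range.
print(min_max_possibility([0.990, 0.991, 0.992]))
```

The second call shows the weakness noted in the text: when the criterion values are congested, negligible differences between models are mapped to the extremes 0 and 1.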
Mostly, statistical criteria, called statistics, have a certain probability distribution function (pdf) that is considered as a basis to derive the PDF. For example, the JB statistic (b-1) has a $\chi^2$ distribution with two degrees of freedom; thus, the PDF would be defined by

$$\pi_{JB}(x) = 1 - F(x) \tag{6}$$

where

$$F(x) = \int_0^x f_{\chi^2}^{(2)}(\xi)\, d\xi. \tag{7}$$

Similarly, the following PDFs may be used for the LB (b-2) and t-student (c-1) statistics and the covariance matrix of the parameters (c-2):

$$\pi_{LB}(x) = 1 - F(x) \tag{8}$$

[Equations (9) and (10), for the t-student statistics and the covariance matrix of the parameters, are illegible in the source.]

In the aforementioned equations, $f_t^{(n)}(\cdot)$ and $f_{\chi^2}^{(n)}(\cdot)$ are the t-student and $\chi^2$ pdfs with $n$ degrees of freedom, respectively. Note that the criteria (c-1) and (c-2) are nonnegative (one-sided criteria), and therefore, the pdf is cumulated in the interval $(0, \infty)$ [9]-[11].

When there is no such pdf, assigning a suitable PDF is necessary. To do so, one may estimate a distribution of the $j$th criterion by means of the obtained values $c_j(i)$, using an optimal distribution estimation method, for example, the optimal kernel method [14], [20]. For a nonnegative criterion, which is supposed to be zero for an ideal model, this will result in the following:

$$\pi_j(x) = 1 - \int_0^x \hat{f}_j(\xi)\, d\xi \tag{11}$$

where $x$ is replaced by each criterion $c_j(i)$, and $\hat{f}_j(\xi)$ is the estimator for the $j$th pdf.

As a third way of obtaining a proper possibility measure, particularly when the values obtained for $c_j(i)$ ($i = 1, \ldots, N_m$) are close to each other, we propose the utilization of particular predetermined functions. The criteria corresponding to the model error, i.e., the class (a) criteria, e.g., $V$, are often concentrated in a small range close to zero (Fig. 4), and therefore, the pdf estimator is quite likely to fail. One of the functions proposed here for such cases to define PDFs is a smoothed-edge trapezoidal-like function, i.e.,

[Equation (12), an exponential function $\pi_V(v)$ of $v$ with regulating parameter $k$ and average value $v_{av}$, is illegible in the source.]

where $v$ is the square error measure (a-2) for each model, with $v_{av}$ equal to the average value of the $V$'s, which is set to the saddle point of the function.

Fig. 4. PDF of criteria due to the error size, e.g., V, ESS, and MPE.
In addition, to distribute the $V$'s more around their average, the regulating parameter $k$ is computed by

$$k = 1 / \mathrm{var}\big(V_i / \max_i(V_i)\big). \tag{13}$$

Another PDF due to the explanatory criterion $R^2$ is defined as

[Equation (14) and the text that follows it, up to the definition of the utility function (16) and its restrictions, are illegible in the source.]

where $\pi_{min}$ is a desired minimum value for the PDF of the criteria $c_j$'s, determined by the model designer. After the establishment of the utility function and the determination of the restrictions, it is not so hard to obtain the solution. Simply, (16) should be calculated for all competing models, and the winner model has to obtain the maximum value $U(x^*)$. Because $U(x)$ is composed of PDFs, all values fall in $[0, 1]$; therefore, the adjacency of the maximum $U(x^*)$ to one indicates the degree of success of the selected model, to which it may satisfy the model builder. Of course, no model can reach this ideal point; moreover, the resulting values of $U(x)$ depend on all models. Hence, if the set of competing models is changed, these values will change too. Note that the utilization of (11) suffices for such dependence.

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, PART A: SYSTEMS AND HUMANS, VOL. 38, NO. 5, SEPTEMBER 2008

Fig. 6. Curves of different defined PDFs.

In order to save more of the information included in the data gained by the criteria, we prefer to use the geometric mean and the arithmetic mean as conjunction and disjunction operators, instead of the conventional Min and Max operators, i.e.,

$$\pi_1 \wedge \pi_2 \wedge \cdots \wedge \pi_n = \Big(\prod_{j=1}^{n} \pi_j\Big)^{1/n} \tag{19}$$

$$\pi_1 \vee \pi_2 \vee \cdots \vee \pi_n = \frac{1}{n}\sum_{j=1}^{n} \pi_j \tag{20}$$

leading to better and more acceptable results from the practical standpoint of this analysis, although they do not obey De Morgan's laws.

B. Measurement of Linguistic (Intuitive) Criteria

Not only can decisions on the statistical criteria be made by fuzzy logic, but some criteria can also be defined using fuzzy concepts. It should be mentioned that the method lends itself to defining new criteria that are originally based on possibility theory.
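Returning to the conjunction and disjunction operators described above, the following Python sketch evaluates a small rule base with the geometric mean as AND and the arithmetic mean as OR, and then picks the model with the largest utility. The two-clause rule, the model names, and the possibility values are made up for illustration; only the choice of the two means follows the text.

```python
import math


def fuzzy_and(*possibilities):
    """Conjunction as the geometric mean of the operands."""
    return math.prod(possibilities) ** (1.0 / len(possibilities))


def fuzzy_or(*possibilities):
    """Disjunction as the arithmetic mean of the operands."""
    return sum(possibilities) / len(possibilities)


def utility(pdf):
    # Hypothetical two-clause rule base: (low V OR high R2) AND near-one PP.
    return fuzzy_and(fuzzy_or(pdf["V"], pdf["R2"]), pdf["PP"])


# Made-up possibility values in [0, 1] for two candidate models.
models = {
    "M1": {"V": 0.8, "R2": 0.6, "PP": 0.9},
    "M2": {"V": 0.5, "R2": 0.9, "PP": 0.4},
}
scores = {name: utility(pdf) for name, pdf in models.items()}
winner = max(scores, key=scores.get)
print(scores, winner)
```

Unlike Min and Max, both means respond to every operand, so a change in any single possibility value moves the utility. The geometric mean still returns zero whenever one conjunct has zero possibility.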
The rule base may consist of some properties, e.g., "small overshoot," "not so fast time response," "fair whiteness of the residuals," "medium bandwidth for the model," and so on. PDFs associated with such criteria are both user dependent and concept dependent; consequently, they should be defined very carefully.

In addition to (11), which may generally be used, when linguistic criteria (LC) are to be dealt with, a Weibull-shaped function whose maximum depends on the desired value for the LV (see Fig. 6) is suggested. If the desired value is an interval, a trapezoidal function, or rather (14), will be used.

In any case, the definition of such criteria and the corresponding PDFs is up to the designer. For example, if a dc gain of $1/a$ is intuitively supposed for the best model, we may define it as

[Equation (21) is illegible in the source.]

where $x = DC$ stands for the dc gain of the model. Clearly, the same definition may be used to measure the admissibility of other dynamic properties, like overshoot, time constant, etc. For a linear time-invariant model, the location of system poles can be considered as a criterion. In this case, a certain region of the s-plane (z-plane) is chosen as the desired zone, and a 2-D PDF may be used to evaluate the possibility of the pole location. Assuming that DPI (c-5) is an overall dynamic property index, Fig. 7 shows such a possibility measure, based on constant $\omega_n$ and constant $\zeta$ lines.

Fig. 7. PDF of DPI, based on constant $\omega_n$ and constant $\zeta$ lines.

On the other hand, one may also emphasize an overall parameter significance to replace the single t-statistics, which check all parameters one by one. This point is taken into consideration in this paper by defining criterion (c-3). To do so, a new criterion, namely, GT, is defined in a way similar to (c-1) [14], [19] as
[Equation (22), which defines GT from $\|\theta\|$ and $\|P_\theta\|$, is illegible in the source.]

where $\theta$ is the vector of all parameters, $P_\theta$ represents the covariance matrix, and $\|\cdot\|$ denotes the two-norm of a vector or a matrix. This criterion can be viewed as a global T, for which the PDF is also similar to that of the t-statistics given in (9):

[Equation (23) is illegible in the source.]

Related to the error term characteristics, we also proposed two criteria, (b-4) and (b-5), to measure normality and whiteness. For the former, besides the JB test, we compare an optimal estimate of the pdf [20] of the error term, $\hat{f}$, with the theoretical normal pdf, $f^o$:

$$PMN = \int \big|\hat{f}(\xi) - f^o(\xi)\big|\, d\xi. \tag{24}$$

Here, it is confirmed that the model designer is not forced to calculate any precise PDF for such LCs, as is expected for statistical criteria. In fact, any carefully defined criterion, accompanied by a proper PDF, like those shown in Fig. 6, can be applied to the proposed model selection method.

For whiteness, a similar measurement is used. The energy in the derivative of the error power spectrum, smoothed by a smoothing filter, is estimated and considered a criterion. For a white noise, it should be acceptably equal to zero everywhere. Thus, define

[Equation (25), which defines FDW from the smoothed estimate of the power spectrum of the error term $\varepsilon(t)$, is illegible in the source.]

PDFs for both (24) and (25) are calculated by (11), and (14) can be used to adjust them too. It should be mentioned that the residual signal may be assumed to be a colored noise, and it may be modeled by a particular model like the moving average filter. However, the input of that model should then be substituted for $\varepsilon(t)$, and its whiteness should be tested.

A very important criterion to accept a model is its ability to forecast the future. A set of unused data (test data) should be reserved to test the model and calculate the verity of the model in imitating reality. Similar to Theil's inequality coefficient, a criterion is suggested to measure this ability [9], [14]:

[Equation (26), which defines PP, is illegible in the source.]

where $N_n$ is the number of new data, which do not participate in the identification process.
PP is supposed to be not so far from one; therefore, its PDF is defined similarly to that of $R^2$. SP is also defined in the same way as in (26), except that $\varepsilon(t)$ is calculated as the simulation error of the model, i.e., for $k = \infty$ in (2), by the new data samples.

IV. ILLUSTRATION OF THE METHOD

This part is devoted to the illustration of the capability of the proposed method through the investigation of an important phenomenon. The annual maximum demand for electricity is the phenomenon considered as the case study, and two groups of linear and nonlinear models are set into competition. At last, one model, chosen as the best, describes this phenomenon.

The output of the system which is being modeled, namely, $x_D(t)$, is the maximum realized electric power, which is supplied by power stations, plus the unmet demand. Because there are no certain data about the unmet demand, this part is replaced by the summation of both blackout and an equivalent power due to frequency drops, which are not actually realized by the system [21]. The signal $x_D(t)$ is then preprocessed by a fuzzy smoothing filter.

A. Linear Models

Herein, a set of various parallel linear models is constructed and compared. Let all parallel models, as in (1), be defined by a common general model as follows:

$$M_i:\ x_D(t) = \sum_{k} a_k\, x_D(t-k) + \sum_{l=1}^{m} \beta_l\, u_l(t) + C_1(q)\, e(t) \tag{27}$$

where $m$ indicates the number of inputs considered for each model, the argument $t$ represents a discrete-time indicator for each year, and $C_1(q) = 1 + cq$ is a first-order discrete-time domain polynomial of the shift (lag) operator $q$, which, together with a white noise term $e(t)$, models the uncertainty.

TABLE I. Input sets applied to the linear model (27) and nonlinear models (34). [Table body illegible in the source.]

The $u_l(t)$ signals in (27) are listed in Table I. This table indicates that there are seven inputs, each of which will be chosen from the corresponding row. Each zero in a row means that the corresponding input is set to zero, and a dash in each row means that there is no more case for the associated input. These input signals are introduced next:

$y_P$: production capacity of the industrial sector;
$Po$: population;
(symbol illegible): general price index;
$p_{ef}$: electricity to fuel price index ratio ($p_{fe} = p_f/p_e$ is defined vice versa);
$y$: total income;
$p_{ED}$: electrical device price index;
$p_{HD}$: household (residential) device price index;
$p_{ID}$: industrial (commercial) device price index;
$p_{AD}$: average of the $p_{HD}$ and $p_{ID}$ indexes;
$u_W$, $u_R$: two dummy fuzzy variables describing the war and revolution years.

Moreover, $\Delta$ is the difference operator, $sm(\cdot)$ is a sigmoid function, and $S_F\{\cdot\}$ represents a fuzzy smoothing filter. All signals are shown in Figs. 8-10.

Fig. 8. Data used to model electricity demand in Iran: changes in population ($Po$), households' purchasing power ($y/p_{HD}$), and industries' purchasing power ($y/p_{ID}$).

Fig. 9. Data used to model electricity demand in Iran: production capacity ($y_P$) and total purchasing power in various cases.

Fig. 10. Data used to model electricity demand in Iran: electricity price ($p_e$) and other energy carrier (fuel) price ($p_f$).

In fact, different versions of the model given in (27) may be identified by applying various sorts of input signals introduced in Table I. The different combinations of these signals generate $N_m = 13\,500$ ($= 5 \times 5 \times 3 \times 4 \times 3 \times 5 \times 3$) parallel models with the same structure in (27).
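The size of the candidate set can be checked with a short Python sketch. It assumes per-input case counts of 5, 5, 3, 4, 3, 5, and 3 for the seven inputs, a reading of the factorization above in which the first three counts match the five, five, and three cases stated for $u_1$, $u_2$, and $u_3$; the variable and function names are illustrative.

```python
import math
from itertools import islice, product

# Assumed number of cases per input row (seven inputs).
cases_per_input = [5, 5, 3, 4, 3, 5, 3]

# Every model picks exactly one case per input, so the counts multiply.
n_models = math.prod(cases_per_input)


def input_combinations(case_counts):
    """Lazily yield one tuple of case indices per parallel model."""
    return product(*(range(n) for n in case_counts))


# Peek at the first few combinations without materializing all of them.
first_three = list(islice(input_combinations(cases_per_input), 3))
print(n_models, first_three)
```

Enumerating the combinations lazily keeps memory use flat, which matters because each tuple stands for one model that has to be identified and screened separately.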
By excluding those models that cannot satisfy the sign significance criteria due to the coefficients, we may conclude that the aforementioned table is adequate to investigate the phenomenon. To elaborate more, we allowed the first input signal $u_1$ to cover all five different cases, similarly five cases for $u_2$, three for $u_3$, and so on.

The model is identified using $N = 32$ data samples for each endogenous variable, and $N_n = 3$ samples are saved to test the model by the PP (SP) criterion given in (26). Insignificant models due to the parameter sign, as indicated in (c-4), are discarded, and the 378 remaining models are put into the proposed fuzzy model selection process. In this regard, we applied 14 selected criteria combined in the following rule base.

The best model should have the following:
(low FPE or very low V or high Fit or high $R^2$)
and (small GN or small PMN or about zero FDW)
and (not so small $t_{min}$ or not so large $\|P_\theta\|$ or very large GT)
and near one DPI
and significant sign of parameters
and (near one PP or near one SP). (28)

The dynamic properties of this model are of importance for the model builder. Their index is calculated by $DPI = (DY + DM + ST)/3$, where DY, DM, and ST are the possibilities (utilities) due to the dc gain, overshoot, and stability of the model, respectively, each obtained using an ordinary trapezoidal PDF.

The top ten models are presented in Table II, where each $M_i$th model is associated with the $i$th row of the tabulation. Note that the results slightly differ from those of [21] because of minor revisions and some modifications in the data. Although the maximum number of parameters is ten, it is observed that the winner model has seven parameters. It is clear that none of the 14 criteria alone could individually help the model selection, and they have to be joined together within the aforementioned process, e.g., the winner and the three succeeding models have $t_{min}$ lower than two; however, their corresponding values of $R^2$ or DPI are higher than those of the others.

TABLE II. Best linear models ranked by the fuzzy model selection method. [Table body illegible in the source. Table note: columns under the title "Inputs Number" are due to the input numbers in Table I, where zeros imply that the associated inputs are not included in the model.]

TABLE III. Parameter estimation results for different cases of model (27). [Table body illegible in the source.]

Table III contains the estimated parameters belonging to each model. The closeness of the same parameters due to similar models indicates that the models are robustly estimated, and this point convinces the model designer that each one of these similar models can be considered suitable enough to explain the phenomenon acceptably to some extent. Therefore, he/she will have enough freedom to choose the most desired one among them.

B. Nonlinear Models

As mentioned earlier, the proposed model selection method is applicable to nonlinear models as well. The next example considers a set of nonlinear models of the same electricity demand [22].

The main difference between these two linear and nonlinear model sets is how we combine the input signals $u_l(t)$. In the
In the Ana tenemos feted: EEE Pabtcatoen Cperrtne Sal, Oenivaded on Getberb 200 ot: rom EEE Wore. Reto ae, hiae IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PARKT A SYSTEMS AND HUMANS, VOL. 3k NO 5, SEPTEMBER 208 TABLE 1V PakaMerbk EstiMAtION RESULTS POR THE NONLINEAR MODELS (27) [ats LOcLe ‘ran [BEST NONLINEAR MoDELS FOUND BY FU pas Pras | rans | rsuus | saose | wos Fras | reotz | sues | raoie | ours | arsts Fes [mos | reoos | raois | moons | uats2 gril ey epee #2 and 8S LEV ‘22¥ MODEL SELECTION METHOD, 1N COMPANY WITH THEIR CHARACTERISTICS GIVEN IN (28) PP soup oont | amar na To bors) | wast | usse | as hoses of 0x6 | arn | avis | ose ow sf vase vows | 0918 ‘hoon 4} 9 nowy | od noms nonlinear model, we replace the second term in (27) by several alternatives. Case 1) u(t) = ho (A(u/p)) [Ex(Aupe”) + & ((Apoie™)]. 29 ‘This signal contains a common price index of p for both residential and commercial sectors. Case 2) u(t) = hr (A(y/pw)) G(Aupe”") “+h (A(y/pan)) £2((Apoe™). 0) Herei we apply distinct price indexes. ‘The nonlinear functions /y(,)'s and £,(.)'s, in (29) and (30), are chosen as hi(-) = sm (Sr{-}, 15) GB) 40) = them (Sete) (32) () =2()"e" 33) where some parameters like 1,2, 0, and y may be fixed, while the others should be estimated. This creates totally 2° different ceases, The values of the fixed parameters are chosen {rom the corresponding nonfixed parameters ofthe estimated models. Furthermore, the input signals upg ean save their additive relationships, resulting in the following structure, leading to three different cases: Levalt—) + ut) + punws() + Clete) zp(t) Ga) a rae ‘We may also apply a fuzzy smoothing filter [23] to the input signals, the output signal, or both, leading to four different ‘cases. The effect of smoothing on a sample signal is shown in Fig. 9, where A(yp) is compared with S{A(yr)} ‘Therefore, the combination of all these possible cases gen- crates 2. 
2° x 3x 4 = 768 nonlinear parallel models, each should be identified through the estimation of unknown pa- rameters: a, i, fi > 4 0 Ny Ye Pr and c. Hence, a full ‘model can be defined, consisting of 13 unknown parameters which, in tur, should be properly estimated, All models are fist identified (24], and 740 models are then discarded due to their weakness in some metrics, depending mostly on the insignificance of the estimated parameters, Finally, 28 models are put into competition in the proposed FDM process for which the same rule base in (28) is used to select the best model ight models are conclusively selected, ranked in a descend- ing manner according to their total utility values, (16), and the estimation results are listed in Table IV. The characteristics of ‘each model are given in Table V, by which the rule base (28) is evaluated, and finally, this terminates the model selection process using FDM. However, the model maker may choose any of the listed models asthe best one. C. Selection Among Linear and Nonlinear Models Now, we want to select a model as the best one among all linear and nonlinear models. To do so, we have chosen the top six best linear models and embedded them into the list of the ‘ight nonlinear models mentioned earlier. Then, by applying the FDM to these 14 models, the best model is obtained. New ranking results show that almost all nonlinear models stand on. top of thelist. Table VI compares the best nonlinear model with the best linear one, Note that, in this ranking, the best linear model is the second linear model in Table TI. We could expect this because, as mentioned in Section III-A, the ranking results| Ana tenemos feted: EEE Pabtcatoen Cperrtne Sal, Oenivaded on Getberb 200 ot: rom EEE Wore. Reto ae, SHAKOUR!G. 
are naturally dependent on the characteristics of all models participating in the ranking process.

TABLE VI
CHARACTERISTICS OF THE TWO MODELS (LINEAR AND NONLINEAR)

There is another important point worth mentioning: the best model may not differ remarkably from the second or the third one. Therefore, it is possible for the model designer to choose the final acceptable model manually, considering some particular properties that are of more importance; i.e., he/she may assign certain weightings to some of the criteria PDFs. In any case, the proposed method introduces a decision-making process that helps the designer, who finally decides which model to pick.

V. CONCLUSION

The variety of model selection methods is too wide to be summarized in this paper, and there are too many criteria introduced in the literature. Mostly, they employ some form of parsimony, which is a concept of optimal complexity; Bayesians use probability to choose among hypotheses, whereas some others choose among hypotheses that are equally consistent with the observations by preferring those that are more falsifiable, etc.

However, this paper proposed and demonstrated a new approach for mathematical or experimental model selection. The problem is quite familiar to scientists who study a cluster of different theories, all trying to describe one phenomenon. Because there are many various points that are important in making a decision on the admissibility of a model, this approach helps the scientist to perform the task of decision making systematically, based on the possibility theory.

The method provides a suitable environment to define and use any desired rule base by LVs.
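As a rough sketch of how such a rule base and the designer's weightings might interact, the Python code below evaluates a fragment of rule base (28) and ranks two candidate models. Several things here are assumptions rather than the paper's definitions: "and" is taken as `min`, "or" as `max`, a weighting is applied as an exponent on a criterion's possibility, only the clauses involving GN, PMN, PDW, DPI, PP, and SP are used, and all possibility values and model names are invented for illustration.

```python
def fuzzy_and(*values):
    # Assumed conjunction operator: minimum of the possibilities.
    return min(values)


def fuzzy_or(*values):
    # Assumed disjunction operator: maximum of the possibilities.
    return max(values)


def utility(p, weights=None):
    """Evaluate a fragment of rule base (28) for one model.

    p maps each criterion name to its possibility in [0, 1].
    weights maps a criterion name to an exponent (default 1.0);
    a larger exponent makes the criterion more demanding.
    """
    w = weights or {}
    q = {name: value ** w.get(name, 1.0) for name, value in p.items()}
    return fuzzy_and(
        fuzzy_or(q["GN"], q["PMN"], q["PDW"]),
        q["DPI"],
        fuzzy_or(q["PP"], q["SP"]),
    )


def rank(models, weights=None):
    """Return model names sorted by descending utility."""
    return sorted(models, key=lambda m: utility(models[m], weights), reverse=True)


# Hypothetical possibility values for two close competitors.
models = {
    "M1": {"GN": 0.9, "PMN": 0.4, "PDW": 0.2, "DPI": 0.80, "PP": 0.7, "SP": 0.6},
    "M2": {"GN": 0.5, "PMN": 0.6, "PDW": 0.3, "DPI": 0.85, "PP": 0.9, "SP": 0.2},
}

print(rank(models))                            # unweighted ranking
print(rank(models, {"PP": 3.0, "SP": 3.0}))    # prediction criteria emphasized
```

With these made-up numbers, emphasizing the PP and SP criteria reverses the order of the two models, which illustrates why the final choice among close competitors is left to the designer.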
It is also possible to apply new criteria implying various types of performance, including both statistical and intuitive criteria. Sometimes, the investigator has some kind of a priori knowledge about the underlying phenomenon (system); she/he can then include some associated criteria in the rule base to check the acceptability of the selected model. The participation of statistical criteria jointly with other possibility-based criteria represents a useful tradeoff between probability and possibility theories. It should be emphasized here that both a reasonable design of the rule base and a correct production of the PDFs play the most determinant roles in obtaining feasible results.

In cases where several competing models show relatively close characteristics, and it is a challenging task to choose the best model, the proposed method shows a high degree of adaptability in solving the problem.

Two sets of linear and nonlinear models are presented to illustrate the performance of the proposed method. It has been shown that the method properly facilitates the model selection task. However, the model builder has enough authority to choose the best model, considering some descriptive and/or nonparametric criteria that are not included in the model selection process.

REFERENCES

[1] A. A. Grasa, Econometric Model Selection: A New Approach. Norwell, MA: Kluwer, 1989.
[2] J. A. Spriet, Computer Aided Modeling and Simulation. New York: Academic, 1982.
[3] D. W. Boyd, Systems Analysis and Modeling. New York: Academic, 2000.
[4] J. C. Huber, "A new method for analyzing scientific productivity," J. Amer. Soc. Inf. Sci. Technol., vol. 52, no. 13, pp. 1089-1099, Nov. 2001.
[5] A. Alves, R. Camacho, and E. Oliveira, "Model validation: A statistical-based criterion of hypotheses acceptance in numerical reasoning," in Proc. 14th Int. Conf. Inductive Logic Program., Porto, Portugal, Sep. 2004.
[6] N. E. Ayat, M. Cheriet, and C. Y. Suen, "Automatic model selection for the optimization of SVM kernels," Pattern Recognit., vol. 38, no. 10, pp. 1733-1748, Oct. 2005.
[7] "Bayesian prediction and model selection for locally asymptotically mixed normal models," J. Stat. Planning Inference, vol. 137, no. 7, pp. 2523-2534, Jul. 2007.
[8] C.-Y. Tseng, "Entropic criterion for model selection," Phys. A: Stat. Theor. Phys., vol. 370, no. 2, pp. 530-538, Oct. 2006.
[9] M. Verbeek, A Guide to Modern Econometrics, 2nd ed. Hoboken, NJ: Wiley, 2004.
[10] D. F. Hendry, Dynamic Econometrics. 1995.
[11] L. Ljung, System Identification: Theory for the User. Englewood Cliffs, NJ: Prentice-Hall, 1987.
[12] N. Gershenfeld, The Nature of Mathematical Modeling. Cambridge, U.K.: Cambridge Univ. Press, 1999.
[13] R. Gençay and F. Selçuk, "A visual goodness-of-fit test for econometric models," in Studies in Nonlinear Dynamics and Econometrics, vol. 3. Cambridge, MA: MIT Press, 1998.
[14] H. Shakouri G., "Modeling and identification of Iranian macroeconomic
