Confusion matrix for two possible outcomes p (positive) and n (negative):

                      Actual
                   p                 n                 Total
  Predicted  p'    true positive     false positive    P
             n'    false negative    true negative     N
  Total            P'                N'

Basic criteria
  Classification accuracy: (TP + TN) / (TP + TN + FP + FN)
  Error rate: (FP + FN) / (TP + TN + FP + FN)
  Precision (or positive predictive value): proportion of predicted positives which are actual positives: TP / (TP + FP)
  Recall: proportion of actual positives which are predicted positive: TP / (TP + FN)
  Sensitivity: proportion of actual positives which are predicted positive: TP / (TP + FN)
  Specificity: proportion of actual negatives which are predicted negative: TN / (TN + FP)

Paired criteria
  True positive rate: proportion of actual positives which are predicted positive: TP / (TP + FN)
  True negative rate: proportion of actual negatives which are predicted negative: TN / (TN + FP)
  Positive likelihood: likelihood that a predicted positive is an actual positive: sensitivity / (1 - specificity)
  Negative likelihood: likelihood that a predicted negative is an actual negative: specificity / (1 - sensitivity)

Combined criteria
  Youden's index: sensitivity - (1 - specificity)
  BCR (Balanced Classification Rate): arithmetic mean of sensitivity and specificity: 1/2 (sensitivity + specificity) = 1/2 (TP / (TP + FN) + TN / (TN + FP)); Youden's index = 2 BCR - 1
  BER (Balanced Error Rate), or HTER (Half Total Error Rate): 1 - BCR
  F-measure (= F1-measure): harmonic mean between precision and recall: 2 (precision . recall) / (precision + recall)
  F_beta-measure: weighted harmonic mean between precision and recall: (1 + beta^2) TP / ((1 + beta^2) TP + beta^2 FN + FP)
  (The harmonic mean between specificity and sensitivity is also often used and sometimes referred to as F-measure.)
  Matthews correlation: correlation between the actual and predicted labels: (TP . TN - FP . FN) / ((TP + FP)(TP + FN)(TN + FP)(TN + FN))^(1/2); comprised between -1 and 1
  Discriminant power: normalised likelihood index: sqrt(3)/pi . (log(sensitivity / (1 - specificity)) + log(specificity / (1 - sensitivity))); < 1 = poor, > 3 = good, fair otherwise

Graphical tools
  ROC curve (receiver operating characteristic curve): 2-D curve in the false positive rate / true positive rate space, parametrized by one parameter of the classification algorithm, e.g. some threshold
  AUC: the area under the ROC curve; between 0 and 1
  (Cumulative) lift chart: plot of the true positive rate as a function of the proportion of the population being predicted positive, controlled by some classifier parameter (e.g. a threshold)

Relationships
  sensitivity = recall = true positive rate
  specificity = true negative rate
  accuracy = 1 - error rate

References
  Sokolova, M. and Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing & Management 45(4), 427-437.
  Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 7, 1-30.
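As a minimal Python sketch (the function name and the example counts are illustrative, not from the sheet), the scalar criteria above can be computed directly from the four confusion-matrix counts:

```python
import math

def confusion_metrics(tp, fp, fn, tn):
    """Scalar classification criteria from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    error_rate = (fp + fn) / total          # = 1 - accuracy
    precision = tp / (tp + fp)              # positive predictive value
    sensitivity = tp / (tp + fn)            # = recall = true positive rate
    specificity = tn / (tn + fp)            # = true negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    youden = sensitivity + specificity - 1  # = sensitivity - (1 - specificity)
    bcr = (sensitivity + specificity) / 2   # balanced classification rate
    ber = 1 - bcr                           # balanced error rate (HTER)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return dict(accuracy=accuracy, error_rate=error_rate, precision=precision,
                sensitivity=sensitivity, specificity=specificity, f1=f1,
                youden=youden, bcr=bcr, ber=ber, mcc=mcc)

m = confusion_metrics(tp=40, fp=10, fn=5, tn=45)
print(m)
```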
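The count form of the weighted F-measure, (1 + beta^2) TP / ((1 + beta^2) TP + beta^2 FN + FP), is algebraically identical to the weighted harmonic mean of precision and recall, (1 + beta^2) precision . recall / (beta^2 precision + recall). A small check (function name and counts are illustrative):

```python
def f_beta(tp, fp, fn, beta=1.0):
    """Weighted harmonic mean of precision and recall, in count form."""
    b2 = beta * beta
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)

tp, fp, fn = 40, 10, 5
precision = tp / (tp + fp)
recall = tp / (tp + fn)
beta = 2.0
direct = f_beta(tp, fp, fn, beta)
via_pr = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
print(direct, via_pr)  # the two forms agree
```

With beta = 1 the expression reduces to the ordinary F1-measure.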
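The ROC curve and its AUC can be traced by sweeping a threshold over the classifier scores; a sketch assuming distinct scores (tied scores would need grouped handling), with illustrative data:

```python
def roc_points(scores, labels):
    """(FPR, TPR) points as the decision threshold sweeps over the scores
    (label 1 = positive, 0 = negative)."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    P = sum(labels)
    N = len(labels) - P
    tp = fp = 0
    pts = [(0.0, 0.0)]
    for _, y in pairs:
        if y == 1:
            tp += 1
        else:
            fp += 1
        pts.append((fp / N, tp / P))
    return pts

def auc(pts):
    """Area under the ROC curve by the trapezoid rule; between 0 and 1."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3]
labels = [1,   1,   0,   1,   1,    0,   0,   0]
print(auc(roc_points(scores, labels)))
```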
Regression performance measures cheat sheet
Damien François, v0.9, 2009 (damien.francois@uclouvain.be)

Let {(x_i, y_i)}, i = 1 ... N, be a set of input/output pairs and f a function such that yhat_i = f(x_i); the residuals are e_i = y_i - yhat_i.

Absolute error
  MAD (Mean Absolute Deviation): 1/N sum_i |e_i|
  MAPE (Mean Absolute Percentage Error): 100/N sum_i |e_i / y_i|

Squared error
  SSE (Sum of Squared Errors), or RSS (Residual Sum of Squares): sum_i e_i^2
  MSE (Mean Squared Error): SSE / N
  RMSE (Root Mean Squared Error): sqrt(MSE)
  NMSE (Normalised Mean Squared Error): MSE / var(y), where var is the empirical variance in the sample
  R-squared: 1 - MSE / var(y) = 1 - NMSE, where var is the empirical variance in the sample

Robust error measures
  Median squared error: median_i (e_i^2)
  alpha-trimmed MSE: mean of the set of squared residuals where the alpha percent largest values are discarded
  M-estimators: 1/N sum_i rho(e_i), where rho is a non-negative function with a minimum in 0, like the parabola, the Huber function, or the bisquare function

Predicted error
  PRESS (Predicted REsidual Sum of Squares): sum_i (e_i / (1 - h_ii))^2, where h_ii are the diagonal elements of the hat matrix H = X (X^T X)^-1 X^T, X is the matrix built by stacking the x_i in rows, and y is the vector of outputs
  GCV (Generalised Cross Validation): (SSE / N) / (1 - tr(H) / N)^2, with H the hat matrix as above

Information criteria
  AIC (Akaike Information Criterion): N log(MSE) + 2k, where k is the number of parameters in the model
  BIC (Bayesian Information Criterion): N log(MSE) + k log(N), where k is the number of parameters in the model

Resampling methods
  LOO (Leave-one-out): build the model on N - 1 data elements and test on the remaining one. Iterate N times to collect all errors and compute the mean error.
  X-Val (Cross-validation): randomly split the data in two parts, use the first one to build the model and the second one to test it. Iterate to get a distribution of the test error of the model.
  K-Fold: cut the data into K parts. Build the model on K - 1 parts and test on the remaining one. Iterate from 1 to K, holding out each part in turn, to get a distribution of the test error of the model.
  Bootstrap: draw a random subsample of the data with replacement and build the model on it. Compute the error on the whole dataset minus the training error of the model, and iterate to get a distribution of such values. The mean of the distribution is the optimism. The bootstrap error estimate is the training error on the whole dataset plus the optimism.

Graphical tool
  Plot of predicted value against actual value. A perfect model places all dots on the diagonal.
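The absolute and squared error measures can be sketched in plain Python (function name and example data are illustrative):

```python
import math

def regression_measures(y, yhat):
    """MAD, MAPE, SSE, MSE, RMSE, NMSE and R-squared for e_i = y_i - yhat_i."""
    n = len(y)
    e = [yi - fi for yi, fi in zip(y, yhat)]
    mad = sum(abs(ei) for ei in e) / n
    mape = 100 * sum(abs(ei / yi) for ei, yi in zip(e, y)) / n
    sse = sum(ei * ei for ei in e)                  # = RSS
    mse = sse / n
    rmse = math.sqrt(mse)
    ybar = sum(y) / n
    var_y = sum((yi - ybar) ** 2 for yi in y) / n   # empirical variance
    nmse = mse / var_y
    r2 = 1 - nmse                                   # R-squared = 1 - NMSE
    return dict(MAD=mad, MAPE=mape, SSE=sse, MSE=mse,
                RMSE=rmse, NMSE=nmse, R2=r2)

print(regression_measures([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```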
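The robust measures can be sketched as follows; the residual values and the delta parameter of the Huber function are illustrative choices:

```python
import statistics

def median_squared_error(e):
    """Median of the squared residuals."""
    return statistics.median(ei * ei for ei in e)

def trimmed_mse(e, alpha=0.1):
    """alpha-trimmed MSE: discard the largest alpha fraction of squared residuals."""
    sq = sorted(ei * ei for ei in e)
    keep = len(sq) - int(alpha * len(sq))
    return sum(sq[:keep]) / keep

def huber_loss(e, delta=1.0):
    """M-estimator with the Huber rho: quadratic near 0, linear in the tails."""
    total = 0.0
    for ei in e:
        a = abs(ei)
        total += 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)
    return total / len(e)

residuals = [0.1, -0.2, 0.3, -0.1, 5.0]   # one gross outlier
print(median_squared_error(residuals),
      trimmed_mse(residuals, 0.2),
      huber_loss(residuals))
```

All three are far less sensitive to the single outlier than the plain MSE.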
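For ordinary least squares, PRESS needs no refitting: the leave-one-out residual equals e_i / (1 - h_ii) with h_ii the leverages from the hat matrix. A NumPy sketch on synthetic data (the toy model and all names are illustrative), with a brute-force leave-one-out check:

```python
import numpy as np

# Toy linear model y = 1 + 2x + noise; X stacks the inputs in rows (with intercept).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), rng.normal(size=20)])
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=20)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta                              # residuals on the full data
H = X @ np.linalg.inv(X.T @ X) @ X.T          # hat matrix
h = np.diag(H)                                # leverages h_ii

press = np.sum((e / (1 - h)) ** 2)            # PRESS without N refits
gcv = (np.sum(e**2) / len(y)) / (1 - np.trace(H) / len(y)) ** 2

# Brute-force check of PRESS: leave each point out and refit.
press_loo = 0.0
for i in range(len(y)):
    mask = np.arange(len(y)) != i
    b_i, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
    press_loo += float(y[i] - X[i] @ b_i) ** 2
print(press, press_loo, gcv)
```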
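The K-fold procedure can be sketched generically over any fit/predict pair; with k = len(x) it reduces to leave-one-out (the closed-form line-fitting model and all names are illustrative):

```python
import random

def kfold_mse(x, y, k, fit, predict):
    """K-fold cross-validation: hold out each of the K parts in turn,
    fit on the rest, and collect the test MSE of each fold."""
    idx = list(range(len(x)))
    random.Random(0).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for f in folds:
        train = [i for i in idx if i not in f]
        model = fit([x[i] for i in train], [y[i] for i in train])
        mse = sum((y[i] - predict(model, x[i])) ** 2 for i in f) / len(f)
        errors.append(mse)
    return errors   # a distribution of test errors

# toy model: least-squares line y = a + b x, fitted in closed form
def fit(xs, ys):
    n = len(xs); mx = sum(xs) / n; my = sum(ys) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
         / sum((xi - mx) ** 2 for xi in xs))
    return (my - b * mx, b)

def predict(model, xi):
    a, b = model
    return a + b * xi

x = [float(i) for i in range(12)]
y = [2 * xi + 1 for xi in x]
print(kfold_mse(x, y, k=4, fit=fit, predict=predict))  # near-zero on noiseless data
```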
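The bootstrap optimism estimate described above can be sketched with a deliberately trivial constant model (the toy model, the pseudo-noise data, and all names are assumptions for illustration):

```python
import random

def bootstrap_error(x, y, fit, mse, B=200, seed=0):
    """Bootstrap error estimate: training error on the whole dataset
    plus the mean optimism over B resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(x)
    train_err = mse(fit(x, y), x, y)
    optimisms = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        bx, by = [x[i] for i in idx], [y[i] for i in idx]
        m = fit(bx, by)
        # error on the whole dataset minus training error on the resample
        optimisms.append(mse(m, x, y) - mse(m, bx, by))
    return train_err + sum(optimisms) / B

# toy constant model: predict the mean of the training outputs
fit_mean = lambda xs, ys: sum(ys) / len(ys)
mse_mean = lambda model, xs, ys: sum((yi - model) ** 2 for yi in ys) / len(ys)

data_x = list(range(30))
data_y = [((i * 37) % 11) / 10.0 for i in data_x]    # deterministic pseudo-noise
print(bootstrap_error(data_x, data_y, fit_mean, mse_mean))
```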