Assessment of statistical classification rules: implications for computational intelligence

  • Authors:
  • Murray H. Loew;Robert F. Wagner;Waleed Ahmed Yousef

  • Affiliations:
  • The George Washington University;The George Washington University;The George Washington University

  • Venue:
  • Assessment of statistical classification rules: implications for computational intelligence
  • Year:
  • 2005

Abstract

The problem of binary classification is of great interest across many fields, including data mining, satellite imaging, and medical diagnostics. The performance of a classifier is most often measured in terms of the error rate, i.e., the total probability of misclassification. A more general approach is to use the Receiver Operating Characteristic (ROC). The ROC is a plot of all possible values of one type of error versus the other. Practical and easy-to-interpret summary measures can be derived from such a curve, e.g., the Area Under the Curve (AUC) and the Partial Area Under the Curve (PAUC). This dissertation studies the assessment of classification rules using the entire ROC space with no parametric assumptions; that is, the approach requires no knowledge of the distribution of the data. In addition, when data are scarce, the classification rule should be designed and assessed from the single available data set. In the present regulatory setting for public-policy making, e.g., in the area of medical diagnostics, the available data set is required to be split into two disjoint sets, one for design and the other for assessment. In this dissertation, both strategies are studied. Moreover, the techniques developed in this dissertation assume no particular form for the classification rule to be assessed: the methodology is general across classical as well as novel modern architectures. The linear and quadratic discriminants used in the dissertation were selected simply for demonstration purposes. The contemporary use of the expression Computational Intelligence refers to a number of rapidly maturing branches of the general field of artificial intelligence, including neural networks, fuzzy logic, and evolutionary algorithms, among others. Algorithms developed in these subfields to solve classification problems are, from a statistical point of view, nonparametric classification rules.
This dissertation may therefore provide critical assessment tools for such algorithms when they must be developed in a setting in which data are scarce.
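
As an illustration of the ROC and AUC summaries described in the abstract, the following is a minimal Python sketch of a nonparametric (empirical) ROC curve and AUC computed from classifier scores. It is not taken from the dissertation: the function names `empirical_roc` and `empirical_auc`, the convention that larger scores indicate the positive class, and the use of NumPy are all assumptions made for this example. The AUC is computed as the Mann-Whitney statistic, which makes no assumption about the distribution of the scores.

```python
import numpy as np

def empirical_roc(scores_neg, scores_pos):
    """Empirical ROC points (hypothetical helper, not from the dissertation).

    Sweeps a decision threshold over every observed score and returns the
    false-positive fraction (FPF) and true-positive fraction (TPF) at each
    threshold. Assumes larger scores indicate the positive class.
    """
    neg = np.asarray(scores_neg, dtype=float)
    pos = np.asarray(scores_pos, dtype=float)
    # Thresholds in decreasing order, padded so the curve runs from (0, 0) to (1, 1).
    observed = np.unique(np.concatenate((neg, pos)))[::-1]
    thresholds = np.concatenate(([np.inf], observed, [-np.inf]))
    fpf = np.array([(neg >= t).mean() for t in thresholds])
    tpf = np.array([(pos >= t).mean() for t in thresholds])
    return fpf, tpf

def empirical_auc(scores_neg, scores_pos):
    """Nonparametric AUC as the Mann-Whitney statistic.

    Fraction of (negative, positive) pairs in which the positive case
    scores higher, with ties counted as one half.
    """
    neg = np.asarray(scores_neg, dtype=float)
    pos = np.asarray(scores_pos, dtype=float)
    diff = pos[:, None] - neg[None, :]  # all pairwise score differences
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def trapezoid_area(fpf, tpf):
    """Area under the piecewise-linear empirical ROC curve."""
    return float(np.sum(np.diff(fpf) * (tpf[1:] + tpf[:-1]) / 2.0))
```

For any pair of score samples, `trapezoid_area(*empirical_roc(neg, pos))` and `empirical_auc(neg, pos)` agree, since the trapezoidal area under the empirical ROC equals the Mann-Whitney statistic with ties counted as one half.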