Classification in Chameleon Statistics
Consider a set of training data samples in which each sample is labelled as
belonging to one of several pre-specified 'classes'. Data classification
then involves the assignment of newly presented data samples to the classes on
the basis of mathematical 'models' built for each of the classes.
For example, here we have five pre-defined classes, and one test point which
must be
assigned to one of the classes. Although this particular case is simple, when
data is noisy or classes are not well-separated, perfect classification becomes
impossible. The goal then is to choose the model that minimizes the
classification error. For any given classification problem there exists a
fundamental limit to the classification accuracy achievable, and this minimum
error rate is called the "Bayes error" for that problem.
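The idea of a minimum achievable error can be made concrete with a small worked
case. The sketch below is illustrative only and rests on assumptions that are
not part of the text above: two equally likely classes, each a one-dimensional
Gaussian with the same standard deviation. Under those assumptions the best
decision boundary is the midpoint between the two means, and the Bayes error is
the Gaussian tail probability beyond that midpoint. The function names are
hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bayes_error_two_gaussians(mu0, mu1, sigma):
    """Bayes error for two equally likely 1-D Gaussian classes.

    Assumes both classes share the standard deviation `sigma`, so the
    optimal threshold is the midpoint between the means and each class
    is misclassified with probability Phi(-d / 2), where d is the
    distance between the means measured in standard deviations.
    """
    d = abs(mu1 - mu0) / sigma
    return normal_cdf(-d / 2.0)


# Overlapping classes leave an error that no classifier can remove;
# moving the means apart shrinks it.
close = bayes_error_two_gaussians(0.0, 2.0, 1.0)
far = bayes_error_two_gaussians(0.0, 6.0, 1.0)
print(f"means 2 sigma apart: {close:.4f}")
print(f"means 6 sigma apart: {far:.4f}")
```

For real problems the class densities are unknown, so the Bayes error can
usually only be estimated or bounded rather than computed in closed form as
here.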
There are two basic types of classifier, namely (a) those which attempt to
minimize the error rate without regard to density estimation, such as (i) neural
networks, (ii) decision trees, and (iii) support vector machines, and (b) those
which use density estimates to derive a classification, such as (i)
nearest-neighbor methods, (ii) (Gaussian) mixture models, and (iii) kernel-based
methods.
The former methods (a) give only the class assignment, while the latter methods
(b) also give an estimate of the probability that a sample belongs to each
class. This means that the former methods, despite often giving good
classification accuracy, are not recommended for use when accountability is
essential (e.g. medical image analysis) or when ranked probabilities are
required (e.g. speech recognition).
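To illustrate how a density-based classifier yields per-class probabilities
rather than a bare label, here is a minimal sketch. It is not the method of any
particular package: it assumes one-dimensional data and models each class with
a single Gaussian, and all function names are hypothetical. Class posteriors
follow from Bayes' rule, with class priors taken from the training label
frequencies.

```python
import math
from collections import defaultdict


def fit_gaussian_classes(samples, labels):
    """Fit one 1-D Gaussian (mean, variance) and a prior per class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    models = {}
    for label, xs in grouped.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        # Floor the variance so a single-point class stays usable.
        models[label] = (mean, max(var, 1e-9), len(xs) / len(samples))
    return models


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_probabilities(x, models):
    """Posterior P(class | x) via Bayes' rule over the fitted densities.

    Note: for points extremely far from every class the densities can
    underflow to zero; a log-domain version would be more robust.
    """
    joint = {
        label: prior * gaussian_pdf(x, mean, var)
        for label, (mean, var, prior) in models.items()
    }
    total = sum(joint.values())
    return {label: p / total for label, p in joint.items()}


# Toy data: two classes, three training samples each (made up for illustration).
samples = [0.0, 0.2, -0.1, 3.0, 3.1, 2.9]
labels = ["a", "a", "a", "b", "b", "b"]
models = fit_gaussian_classes(samples, labels)

posterior = posterior_probabilities(0.1, models)
# Ranked class list, most probable first.
ranked = sorted(posterior, key=posterior.get, reverse=True)
print(ranked, posterior)
```

The hard class assignment is just the first entry of the ranked list; the
probabilities themselves are the extra information that type (a) classifiers,
as described above, do not provide.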