Abstract
The Gaussian distribution has been ubiquitous in the theory and practice of statistical classification for several decades. Despite early proposals motivating the use of predictive inference to design a classifier, this approach has gained relatively little attention apart from certain specific applications, such as speech recognition, where its optimality has been widely acknowledged. Here we examine the statistical properties of different inductive classification rules under a generic Gaussian model and demonstrate the optimality of considering simultaneous classification of multiple samples under an attractive loss function. It is shown that the simpler independent classification of samples leads asymptotically to the same optimal rule as the simultaneous classifier as the amount of training data increases, provided the dimensionality of the feature space is bounded in an appropriate manner. Numerical investigations suggest that the simultaneous predictive classifier can achieve higher classification accuracy than the independent rule in the low-dimensional case, whereas the simultaneous approach suffers more from noise as the dimensionality increases.
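The contrast between the two rules can be illustrated with a minimal sketch, assuming a one-dimensional Gaussian model with known within-class variance and a flat prior on each class mean; this toy setup, and all names in it (`SIGMA2`, `joint_log_pred`, and so on), are illustrative assumptions rather than the article's formulation. The independent rule scores each test point by its own posterior predictive density, while the simultaneous rule searches over joint label assignments and exploits the dependence that the unknown class means induce among test points sharing a label.

```python
# Minimal sketch of independent vs. simultaneous predictive classification
# under a 1-D Gaussian model with known variance and a flat prior on each
# class mean. Illustrative only; not the article's exact model.
import itertools
import numpy as np
from scipy.stats import norm

SIGMA2 = 1.0  # assumed known within-class variance (an illustrative choice)

def log_pred(x, data):
    """Log posterior-predictive density of one point x given observed class
    data: under a flat prior, x | data ~ N(mean(data), SIGMA2 * (1 + 1/n))."""
    n = len(data)
    return norm.logpdf(x, loc=np.mean(data), scale=np.sqrt(SIGMA2 * (1 + 1 / n)))

def joint_log_pred(xs, data):
    """Joint log predictive of several points assigned to the same class.
    The unknown class mean makes the points dependent, so we apply the chain
    rule: p(x1..xk | D) = prod_i p(x_i | D, x1..x_{i-1})."""
    data = list(data)
    total = 0.0
    for x in xs:
        total += log_pred(x, data)
        data.append(x)  # condition later points on this one
    return total

def classify_independent(test, train):
    """Label each test point separately by its own predictive density."""
    classes = list(train)
    return [max(classes, key=lambda c: log_pred(x, train[c])) for x in test]

def classify_simultaneous(test, train):
    """Enumerate all joint label assignments and maximize the joint
    predictive probability of the whole test sample (only feasible for
    small test sets; the search space grows exponentially)."""
    classes = list(train)
    best, best_lp = None, -np.inf
    for labels in itertools.product(classes, repeat=len(test)):
        lp = sum(
            joint_log_pred([x for x, l in zip(test, labels) if l == c], train[c])
            for c in classes
        )
        if lp > best_lp:
            best, best_lp = labels, lp
    return list(best)

train = {"A": [0.1, -0.2, 0.3], "B": [2.1, 1.8, 2.4]}
test = [0.9, 1.1, 1.0]  # a cluster lying between the two training classes
print(classify_independent(test, train))
print(classify_simultaneous(test, train))
```

Running the sketch on test points that cluster between the training classes shows how the rules can disagree: the independent rule may split the labels point by point, whereas the joint search can prefer assigning the whole cluster to one class, mirroring the behaviour the abstract attributes to simultaneous classification in the low-dimensional regime.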
| Field | Value |
| --- | --- |
| Original language | Undefined/Unknown |
| Pages (from-to) | 73–102 |
| Number of pages | 30 |
| Journal | Journal of Classification |
| Volume | 33 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - 2016 |
| MoE publication type | A1 Journal article-refereed |
Keywords
- Bayesian modeling
- Discriminant analysis
- Inductive learning
- Predictive inference
- Probabilistic classification