A kernel-based two-class classifier for imbalanced data sets

Full text not archived in this repository.

Hong, X. (ORCID: https://orcid.org/0000-0002-6832-2298), Chen, S. and Harris, C.J. (2007) A kernel-based two-class classifier for imbalanced data sets. IEEE Transactions on Neural Networks, 18 (1). pp. 28-41. ISSN 1045-9227. doi: 10.1109/TNN.2006.882812

Abstract/Summary

Many kernel classifier construction algorithms adopt classification accuracy as the performance metric in model evaluation. Moreover, equal weighting is often applied to each data sample in parameter estimation. These modeling practices often become problematic if the data sets are imbalanced. We present a kernel classifier construction algorithm using orthogonal forward selection (OFS) in order to optimize the model generalization for imbalanced two-class data sets. This kernel classifier identification algorithm is based on a new regularized orthogonal weighted least squares (ROWLS) estimator and the model selection criterion of maximal leave-one-out area under curve (LOO-AUC) of the receiver operating characteristics (ROCs). It is shown that, owing to the orthogonalization procedure, the LOO-AUC can be calculated via an analytic formula based on the ROWLS parameter estimator, without actually splitting the estimation data set. The proposed algorithm can achieve minimal computational expense via a set of forward recursive updating formulas when searching for model terms with maximal incremental LOO-AUC value. Numerical examples are used to demonstrate the efficacy of the algorithm.
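
The ingredients named in the abstract (per-sample weighting, regularized least squares, and forward selection of kernel terms by leave-one-out AUC) can be illustrated with a small NumPy sketch. This is not the authors' ROWLS/OFS algorithm: it omits the orthogonalization step and recomputes the leave-one-out scores by brute-force refitting, which is exactly the cost the paper's analytic formula is said to avoid. The Gaussian kernel, the class-balanced weights, the parameters `width` and `lam`, and all function names are assumptions made for illustration.

```python
import numpy as np


def make_toy_data(n_neg=32, n_pos=8, seed=0):
    """Hypothetical imbalanced two-class data (labels -1 and +1)."""
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal(0.0, 1.0, size=(n_neg, 2)),
                   rng.normal(2.0, 1.0, size=(n_pos, 2))])
    y = np.concatenate([-np.ones(n_neg), np.ones(n_pos)])
    return X, y


def gaussian_kernel(X, centers, width=1.0):
    """Kernel matrix with one column per candidate center (assumed kernel)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))


def class_balanced_weights(y):
    """Give each class a total weight of 0.5 (an assumed weighting scheme)."""
    w = np.empty(len(y), dtype=float)
    for label in (-1, 1):
        mask = y == label
        w[mask] = 0.5 / mask.sum()
    return w


def weighted_ridge_fit(Phi, y, w, lam=1e-3):
    """Solve the regularized weighted least squares normal equations."""
    A = Phi.T @ (w[:, None] * Phi) + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ (w * y))


def auc(scores, y):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    diff = scores[y == 1][:, None] - scores[y == -1][None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


def loo_scores(Phi, y, w, lam=1e-3):
    """Brute-force leave-one-out outputs: refit without sample i, score i."""
    n = len(y)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        theta = weighted_ridge_fit(Phi[keep], y[keep], w[keep], lam)
        out[i] = Phi[i] @ theta
    return out


def forward_select(X, y, n_terms=3, width=1.0, lam=1e-3):
    """Greedily add the kernel term that gives the highest LOO-AUC."""
    w = class_balanced_weights(y)
    K = gaussian_kernel(X, X, width)  # every sample is a candidate center
    selected, current_auc = [], 0.0
    for _ in range(n_terms):
        best_auc, best_j = -1.0, None
        for j in range(K.shape[1]):
            if j in selected:
                continue
            a = auc(loo_scores(K[:, selected + [j]], y, w, lam), y)
            if a > best_auc:
                best_auc, best_j = a, j
        selected.append(best_j)
        current_auc = best_auc
    return selected, current_auc
```

Calling `forward_select(*make_toy_data())` returns the indices of the chosen kernel centers and the leave-one-out AUC of the final model. Each candidate evaluation here refits the model once per sample, so the sketch is only practical for very small data sets.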

Item Type: Article
URI: https://reading-clone.eprints-hosting.org/id/eprint/15274
Refereed: Yes
Divisions: Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Uncontrolled Keywords: Forward selection, imbalanced data sets, kernel classifier, leave-one-out (LOO) cross validation, receiver operating characteristics (ROCs)