Hybrid dual-resampling and cost-sensitive classification for credit risk prediction

Full text not archived in this repository.

Please see our End User Agreement.

It is advisable to refer to the publisher's version if you intend to cite from this work. See Guidance on citing.

Add to AnyAdd to TwitterAdd to FacebookAdd to LinkedinAdd to PinterestAdd to Email

Osei-brefo, E., Mitchell, R. and Hong, X. orcid id iconORCID: https://orcid.org/0000-0002-6832-2298 (2023) Hybrid dual-resampling and cost-sensitive classification for credit risk prediction. In: AI-2023 Forty-third SGAI International Conference on Artificial Intelligence, 12-14 DECEMBER 2023, Cambridge, England. (In Press)

Abstract/Summary

The class imbalance in financial data sets is prevalent and problematic when evaluating credit risks. This paper proposes a Hybrid dual Resampling and Cost Sensitive classification approach by creating heuristically balanced data sets. Given an imbalanced credit data set, a synthetic minority class is generated using a resampling learning technique based on Gaussian mixture modelling from the minority class data. Simultaneously, k-means clustering is applied to the majority class. Then, feature selection is performed using an Extra Tree Ensemble technique. Finally, a cost-sensitive logistic model is estimated and applied to predict probability of default using the heuristically balanced datasets. The results show that the proposed technique achieves superior performance in comparison with other imbalanced preprocessing approaches.

Item Type Conference or Workshop Item (Paper)
URI https://reading-clone.eprints-hosting.org/id/eprint/113068
Refereed Yes
Divisions Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Download/View statistics View download statistics for this item

University Staff: Request a correction | Centaur Editors: Update this record

Search Google Scholar