Real-time feature selection technique with concept drift detection using adaptive micro-clusters for data stream mining

Text - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

It is advisable to refer to the publisher's version if you intend to cite from this work.

Hammoodi, M. S., Stahl, F. (ORCID: https://orcid.org/0000-0002-4860-0203) and Badii, A. (2018) Real-time feature selection technique with concept drift detection using adaptive micro-clusters for data stream mining. Knowledge-Based Systems, 161. pp. 205-239. ISSN 0950-7051. doi: 10.1016/j.knosys.2018.08.007

Abstract/Summary

Data streams are unbounded, sequential data instances that are generated with high velocity. Classifying sequential data instances is a very challenging problem in machine learning, with applications in network intrusion detection, financial markets and real-time situation assessment based on sensor networks. Data stream classification is concerned with the automatic labelling of unseen instances from the stream in real time. For this, the classifier needs to adapt to concept drifts and can only make a single pass through the data if the stream is fast moving. This research paper presents work on a real-time pre-processing technique, in particular feature tracking. The feature tracking technique is designed to improve Data Stream Mining (DSM) classification algorithms by enabling and optimising real-time feature selection. The technique is based on tracking adaptive statistical summaries of the data and class label distributions, known as Micro-Clusters. Currently, the technique is able to detect concept drifts and identify which features have been influential in the drift.
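
As a rough illustration of the general idea of tracking statistical summaries in a single pass, the sketch below keeps a per-feature count, linear sum and squared sum, and flags the features whose mean has shifted between a reference summary and a recent one. It is not the algorithm from the paper: the class `MicroClusterSummary`, the function `drifting_features`, the shift measure and the `threshold` value are all hypothetical choices made for this example.

```python
import math


class MicroClusterSummary:
    """Hypothetical single-pass summary of a set of instances.

    Stores only the count, per-feature linear sum and per-feature
    squared sum, so each instance is seen once and then discarded.
    """

    def __init__(self, n_features):
        self.n_features = n_features
        self.n = 0
        self.linear_sum = [0.0] * n_features
        self.squared_sum = [0.0] * n_features

    def add(self, x):
        """Absorb one instance (a sequence of n_features numbers)."""
        self.n += 1
        for i, value in enumerate(x):
            self.linear_sum[i] += value
            self.squared_sum[i] += value * value

    def mean(self):
        return [s / self.n for s in self.linear_sum]

    def variance(self):
        # Population variance per feature, clamped at zero to absorb
        # small negative values caused by floating-point rounding.
        return [
            max(sq / self.n - (s / self.n) ** 2, 0.0)
            for s, sq in zip(self.linear_sum, self.squared_sum)
        ]


def drifting_features(reference, recent, threshold=3.0, eps=1e-9):
    """Return indices of features whose mean moved by more than
    `threshold` reference standard deviations (an assumed criterion)."""
    ref_mean = reference.mean()
    ref_std = [math.sqrt(v) for v in reference.variance()]
    rec_mean = recent.mean()
    flagged = []
    for i in range(reference.n_features):
        shift = abs(rec_mean[i] - ref_mean[i]) / (ref_std[i] + eps)
        if shift > threshold:
            flagged.append(i)
    return flagged
```

In use, one summary would be filled from an older portion of the stream and another from the most recent instances. A non-empty result from `drifting_features` would then suggest both that a drift may have occurred and which features contributed to it.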

Item Type: Article
URI: https://reading-clone.eprints-hosting.org/id/eprint/78678
Identification Number/DOI: 10.1016/j.knosys.2018.08.007
Refereed: Yes
Divisions: Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Uncontrolled Keywords: Data Stream Mining, real-time Feature Selection, Concept Drift Detection
Publisher: Elsevier