Towards a parallel computationally efficient approach to scaling up data stream classification

[thumbnail of Paper263.pdf]
Preview
Text - Accepted Version
· Please see our End User Agreement before downloading.
| Preview

Please see our End User Agreement.

It is advisable to refer to the publisher's version if you intend to cite from this work. See Guidance on citing.

Add to AnyAdd to TwitterAdd to FacebookAdd to LinkedinAdd to PinterestAdd to Email

Tennant, M., Stahl, F. orcid id iconORCID: https://orcid.org/0000-0002-4860-0203, Di Fatta, G. and Gomes, J. B. (2014) Towards a parallel computationally efficient approach to scaling up data stream classification. In: Thirty-fourth SGAI International Conference on Artificial Intelligence, 9-11 Dec 2014, Cambridge, England, pp. 51-65.

Abstract/Summary

Advances in hardware technologies allow to capture and process data in real-time and the resulting high throughput data streams require novel data mining approaches. The research area of Data Stream Mining (DSM) is developing data mining algorithms that allow us to analyse these continuous streams of data in real-time. The creation and real-time adaption of classification models from data streams is one of the most challenging DSM tasks. Current classifiers for streaming data address this problem by using incremental learning algorithms. However, even so these algorithms are fast, they are challenged by high velocity data streams, where data instances are incoming at a fast rate. This is problematic if the applications desire that there is no or only a very little delay between changes in the patterns of the stream and absorption of these patterns by the classifier. Problems of scalability to Big Data of traditional data mining algorithms for static (non streaming) datasets have been addressed through the development of parallel classifiers. However, there is very little work on the parallelisation of data stream classification techniques. In this paper we investigate K-Nearest Neighbours (KNN) as the basis for a real-time adaptive and parallel methodology for scalable data stream classification tasks.

Item Type Conference or Workshop Item (Paper)
URI https://reading-clone.eprints-hosting.org/id/eprint/38837
Refereed Yes
Divisions Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Publisher Springer International Publishing
Download/View statistics View download statistics for this item

Downloads

Downloads per month over past year

University Staff: Request a correction | Centaur Editors: Update this record

Search Google Scholar