Stahl, F. ORCID: https://orcid.org/0000-0002-4860-0203, Bramer, M. and Adda, M.
(2009)
PMCRI: a parallel modular classification rule induction framework.
In:
Machine Learning and Data Mining in Pattern Recognition.
Lecture Notes in Computer Science (5632).
Springer, pp. 148-162.
ISBN 9783642030697
doi: 10.1007/978-3-642-03070-3_12
Abstract/Summary
In a world where massive amounts of data are recorded on a large scale, we need data mining technologies to gain knowledge from the data in a reasonable time. The Top Down Induction of Decision Trees (TDIDT) algorithm is a very widely used technology to predict the classification of newly recorded data. However, alternative technologies have been derived that often produce better rules but do not scale well on large datasets. One such alternative to TDIDT is the PrismTCS algorithm. PrismTCS performs particularly well on noisy data but does not scale well on large datasets. In this paper we introduce Prism and investigate its scaling behaviour. We describe how we improved the scalability of the serial version of Prism and investigate its limitations. We then describe our work to overcome these limitations by developing a framework to parallelise algorithms of the Prism family and similar algorithms. We also present the scale-up results of a first prototype implementation.
Additional Information | Proceedings of the 6th International Conference, MLDM 2009, Leipzig, Germany, July 23-25, 2009. |
Item Type | Book or Report Section |
URI | https://reading-clone.eprints-hosting.org/id/eprint/30146 |
Refereed | Yes |
Divisions | Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science |
Publisher | Springer |