Di Fatta, G. and Pettinger, D. (2010) Dynamic load balancing in parallel KD-tree k-means. In: CIT '10: Proceedings of the 10th IEEE International Conference on Computer and Information Technology. IEEE, Washington DC, pp. 2478-2485. ISBN 978-0-7695-4108-2 doi: 10.1109/CIT.2010.424
Abstract/Summary
One among the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques allow to reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set and a third solution incorporates a dynamic load balancing policy.
Altmetric Badge
| Item Type | Book or Report Section |
| URI | https://reading-clone.eprints-hosting.org/id/eprint/6127 |
| Identification Number/DOI | 10.1109/CIT.2010.424 |
| Refereed | Yes |
| Divisions | Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science |
| Uncontrolled Keywords | clustering, dynamic load balancing, KD Trees, parallel k-means Clustering; |
| Publisher | IEEE |
| Publisher Statement | (c) 2010 IEEE |
| Download/View statistics | View download statistics for this item |
Downloads
Downloads per month over past year
University Staff: Request a correction | Centaur Editors: Update this record
Download
Download