Large Scale Data Mining: Challenges and Responses

Jaturon Chattratichat, John Darlington, Moustafa Ghanem, Yike Guo, Harald Huning, Martin Kohler, Janjao Sutiwaraphun, Hing Wing To, Dan Yang

Conference or Workshop Paper
KDD 1997: The Third ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August, 1997
ACM Press
ISBN 978-1-57735-027-9

Data mining over large data sets is an important research subject because of its commercial potential, but it is also a major challenge because of its complexity and computational intensity. Exploiting the inherent parallelism of data mining algorithms offers a direct solution by drawing on the data retrieval and processing power of parallel architectures. In this paper, we classify data mining algorithms according to their most effective parallel structure. We study induction-based classification algorithms, neural networks, clustering algorithms and genetic algorithms. The classification is based on our research on the parallelisation of data mining algorithms. We also present a methodology for determining a suitable parallelisation strategy, based on algorithmic skeletons and performance modelling. This research aims to provide a systematic way to develop parallel data mining algorithms and applications.