In this paper we describe the parallelization of two nearest neighbour classification algorithms. Nearest neighbour methods are well-known machine learning techniques that have been successfully applied to the text categorization task. Using standard parallel techniques, we propose two versions of each algorithm for message passing architectures. We also report experimental results obtained on a cluster of personal computers with a large text collection. Our algorithms attempt to balance the load among the processors, are portable, and obtain very good speedups and scalability.
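The abstract does not give the algorithms themselves, so the following is only a minimal sketch of one common way to parallelize nearest neighbour classification, not the authors' method. It assumes the training set is split evenly across workers, each worker returns its k most similar documents, and a coordinator merges those candidates and takes a majority vote. The names `cosine`, `partition`, `local_top_k` and `classify` are hypothetical, documents are assumed to be sparse term-weight dictionaries, and the workers are simulated with a sequential loop rather than real message passing.

```python
import heapq
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = sum(w * w for w in a.values()) ** 0.5
    norm_b = sum(w * w for w in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def partition(items, n_workers):
    """Round-robin split so that shard sizes differ by at most one."""
    return [items[i::n_workers] for i in range(n_workers)]


def local_top_k(query, shard, k):
    """Work done by one worker: its k most similar (score, label) pairs."""
    scored = [(cosine(query, doc), label) for doc, label in shard]
    return heapq.nlargest(k, scored, key=lambda s: s[0])


def classify(query, training, k=3, n_workers=4):
    """Merge per-shard candidates and return the majority label."""
    candidates = []
    # Assumption: on a real cluster each shard would live on its own
    # process and only the k local candidates would be sent back.
    for shard in partition(training, n_workers):
        candidates.extend(local_top_k(query, shard, k))
    top = heapq.nlargest(k, candidates, key=lambda s: s[0])
    votes = Counter(label for _, label in top)
    return votes.most_common(1)[0][0]


# Toy data, invented purely for illustration.
training = [
    ({"goal": 1.0, "match": 1.0}, "sports"),
    ({"team": 1.0, "goal": 1.0}, "sports"),
    ({"match": 1.0, "team": 1.0}, "sports"),
    ({"stock": 1.0, "market": 1.0}, "finance"),
    ({"bank": 1.0, "stock": 1.0}, "finance"),
    ({"market": 1.0, "bank": 1.0}, "finance"),
]
print(classify({"goal": 1.0, "team": 1.0}, training, k=3, n_workers=4))
```

In this sketch each worker sends back at most k candidates, so the data returned to the coordinator grows with the number of workers and k rather than with the size of the training collection. The round-robin split is one simple way to keep the per-worker load roughly equal.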