In this paper, we propose a multi-dimensional category model (MDCM) for classifying multi-dimensional text collections. Under this model, text classification can be parallelized and distributed, with each dimension processed separately. The model improves classifier performance in both accuracy and time complexity. Accuracy benefits in two ways: on each dimension, classifiers learn from a larger set of training documents with a smaller number of classes, and the best classifier can be selected for each dimension before the per-dimension results are combined. Time complexity benefits because the learning and classifying phases can run in a parallel and distributed manner. The efficiency of MDCM is investigated on a drug information data set in which the first dimension assigns monograph topics and the second dimension assigns primary therapeutic classes. The experimental results show that parallel text classification on MDCM outperforms the flat model in both accuracy and time complexity.
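The per-dimension idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny naive Bayes classifier, the thread pool, the function names (`train_nb`, `predict_nb`, `classify_mdcm`), and the four toy documents are all assumptions chosen for illustration. Each dimension trains its own classifier on all four documents with two classes, whereas a flat model over the same data would have four combined classes with one training document each.

```python
import math
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor


def train_nb(docs, labels):
    """Train a tiny multinomial naive Bayes model (illustrative stand-in classifier)."""
    word_counts = defaultdict(Counter)  # label -> token counts
    class_counts = Counter(labels)      # label -> number of documents
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior, using add-one smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # ignore tokens never seen in training
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def train_and_predict(docs, dim_labels, text):
    """Learn and classify on a single dimension, independent of the others."""
    return predict_nb(train_nb(docs, dim_labels), text)


def classify_mdcm(docs, labels_by_dim, text):
    """Run one classifier per dimension concurrently and combine the results."""
    with ThreadPoolExecutor() as pool:
        futures = {
            dim: pool.submit(train_and_predict, docs, dim_labels, text)
            for dim, dim_labels in labels_by_dim.items()
        }
        return {dim: fut.result() for dim, fut in futures.items()}


# Toy data, invented for illustration: every document has one label per dimension.
docs = [
    "take one tablet daily for infection",
    "take two tablets daily for pain",
    "may cause nausea when treating infection",
    "may cause dizziness when treating pain",
]
labels_by_dim = {
    "topic": ["dosage", "dosage", "side_effects", "side_effects"],
    "therapeutic_class": ["antibiotic", "analgesic", "antibiotic", "analgesic"],
}

result = classify_mdcm(docs, labels_by_dim, "take daily for infection")
print(result)
```

Because the dimensions share no state, the thread pool here could be swapped for separate processes or machines, and a different classifier could be plugged in per dimension, which is the flexibility the abstract attributes to the model.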