Data mining is typically applied to large databases of highly structured information in order to discover new knowledge. In businesses and institutions, the amount of information held in repositories of text documents usually rivals or surpasses the amount found in relational databases. Although document collections can contain a great deal of potentially valuable knowledge, they are often difficult to analyze. It is therefore important to develop methods that efficiently discover the knowledge embedded in these document repositories. In this paper, we describe an approach for mining knowledge from text collections by applying data mining techniques to metadata records generated via automated text categorization. By controlling the set of metadata fields as well as the set of assigned categories, we can customize the knowledge discovery task to address specific questions. As an example, we apply the approach to a large collection of product reviews and evaluate the performance of the knowledge discovery.
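To make the idea concrete, the following is a minimal sketch of mining pairwise association rules from categorized metadata records. It assumes each document has already been reduced to a set of category labels by some text categorizer. The abstract does not name a specific mining algorithm, so the support and confidence thresholds, the function name `mine_rules`, and the sample labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_rules(records, min_support=0.5, min_confidence=0.8):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    Only rules with a single category on each side are considered.
    `records` is a list of sets of category labels, one set per document.
    """
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for record in records:
        items = sorted(set(record))
        item_counts.update(items)
        # Count every unordered pair of categories that co-occur in a record.
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical metadata records, as a categorizer might assign to reviews.
records = [
    {"product:camera", "topic:battery", "sentiment:negative"},
    {"product:camera", "topic:battery", "sentiment:negative"},
    {"product:camera", "topic:lens", "sentiment:positive"},
    {"product:phone", "topic:battery", "sentiment:negative"},
]

rules = mine_rules(records, min_support=0.5, min_confidence=0.9)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

Changing which metadata fields appear in each record (for example, dropping the `product:` labels or adding a finer-grained topic field) changes which rules can be found, which is one way to read the abstract's point about customizing the discovery task.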