Text mining has recently attracted considerable attention from both industry and academia. It concerns the discovery of previously unknown patterns or knowledge in large text repositories. The problem is hard because the texts under consideration are semi-structured or entirely unstructured. Many approaches have been devised for mining various kinds of knowledge from texts. One important aspect of text mining is automatic text categorization, which assigns a text document to a predefined category when the document falls within the theme of that category. Traditionally, categories are arranged hierarchically to support effective searching and indexing and to make them easy for people to comprehend. The category themes and their hierarchical structure have mostly been determined by human experts. In this work, we developed an approach that automatically generates category themes and reveals the hierarchical structure among them. We also used the generated structure to categorize text documents. A self-organizing map was trained on the document collection to form two feature maps, which were then analyzed to obtain the category themes and their structure. Although the test corpus contains documents written in Chinese, the proposed approach can be applied to documents written in any language, provided that such documents can be transformed into a list of separated terms.
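As a rough illustration of the idea of training a self-organizing map on term lists and reading two labelings off it, the following minimal sketch may help. It is not the authors' implementation: the binary term vectors, the decay schedules, the grid size, the toy documents, and the function names train_som, label_documents, and label_words are all assumptions made for this example. In particular, the abstract does not say how the two feature maps are formed; here one map groups documents by their best-matching unit, and the other attaches each term to the unit whose weight for that term is largest.

```python
import math
import random


def best_unit(weights, v):
    """Index of the unit whose weight vector is closest to v (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], v)),
    )


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rows x cols SOM; returns one weight vector per unit."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # assumed linear decay
        radius = max(radius0 * (1.0 - frac), 0.5)    # assumed shrinking radius
        for v in rng.sample(vectors, len(vectors)):  # shuffled pass over the data
            br, bc = divmod(best_unit(weights, v), cols)
            for u, w in enumerate(weights):
                ur, uc = divmod(u, cols)
                d2 = (ur - br) ** 2 + (uc - bc) ** 2
                h = math.exp(-d2 / (2.0 * radius * radius))  # Gaussian neighborhood
                for k in range(dim):
                    w[k] += lr * h * (v[k] - w[k])
    return weights


def label_documents(weights, vectors):
    """Document labeling: unit index -> indices of documents that map to it."""
    doc_map = {}
    for i, v in enumerate(vectors):
        doc_map.setdefault(best_unit(weights, v), []).append(i)
    return doc_map


def label_words(weights, vocab):
    """Word labeling: unit index -> terms whose weight is largest at that unit."""
    word_map = {}
    for k, term in enumerate(vocab):
        u = max(range(len(weights)), key=lambda j: weights[j][k])
        word_map.setdefault(u, []).append(term)
    return word_map


# Hypothetical, already-segmented documents (lists of separated terms).
docs = [
    ["som", "map", "neuron"],
    ["som", "map", "neuron"],
    ["stock", "market", "price"],
    ["stock", "price", "trade"],
]
vocab = sorted({t for d in docs for t in d})
vectors = [[1.0 if t in d else 0.0 for t in vocab] for d in docs]

weights = train_som(vectors, rows=2, cols=2)
doc_map = label_documents(weights, vectors)
word_map = label_words(weights, vocab)
print(doc_map)
print(word_map)
```

Because the sketch only needs each document as a list of terms, the same code applies to Chinese text once a segmenter has produced such lists. Deriving category themes and a hierarchy from the two labelings is a further analysis step that this sketch does not attempt.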