Web text mining is a new topic in knowledge discovery research. It aims to help people discover knowledge from large quantities of semi-structured or unstructured text on the web. Several approaches, including pure and hybrid information retrieval (IR) methods, have been proposed to address this problem. Among them, combining the Self-Organizing Map (SOM) method with the principles of the vector space model appears to be a promising alternative to traditional, purely IR-based methods in this problem domain. In this paper, a novel SOM-based method for web text mining, using a Chinese corpus, is presented. The SOM is used to generate two maps, the word cluster map and the document cluster map, which reveal the relationships among words and among documents, respectively. The search process incorporates both maps and effectively finds the relevant documents according to the keywords specified in the query. Conceptually associated web documents are found not only through the specified keywords but also through the related words identified by the word cluster map.
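
The two-map search described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`train_som`, `build_maps`, `search`), the 2x2 map size, the learning-rate and radius schedules, and the choice to represent each word by the set of documents it occurs in are all assumptions made for illustration. Documents are encoded as binary keyword vectors in the spirit of the vector space model.

```python
import math
import random


def bmu(weights, x):
    """Return the index of the best-matching unit (the neuron nearest to x)."""
    return min(range(len(weights)),
               key=lambda n: sum((w - v) ** 2 for w, v in zip(weights[n], x)))


def train_som(vectors, rows, cols, epochs=50, seed=0):
    """Train a small rectangular SOM; returns one weight vector per neuron."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    total = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(range(len(vectors)))
        rng.shuffle(order)
        for i in order:
            x = vectors[i]
            frac = step / total
            lr = 0.5 * (1.0 - frac)                          # assumed linear decay
            radius = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5
            wr, wc = divmod(bmu(weights, x), cols)
            for n, w in enumerate(weights):
                r, c = divmod(n, cols)
                d2 = (r - wr) ** 2 + (c - wc) ** 2
                h = math.exp(-d2 / (2.0 * radius * radius))  # Gaussian neighborhood
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def cluster(weights, vectors):
    """Group vector indices by the neuron each vector maps to."""
    groups = {}
    for i, v in enumerate(vectors):
        groups.setdefault(bmu(weights, v), []).append(i)
    return groups


def build_maps(docs, rows=2, cols=2):
    """Train a document map and a word map from documents given as keyword sets."""
    vocab = sorted(set().union(*docs))
    doc_vecs = [[1.0 if w in d else 0.0 for w in vocab] for d in docs]
    # Assumption: a word is described by the documents it occurs in.
    word_vecs = [list(col) for col in zip(*doc_vecs)]
    return {
        "vocab": vocab,
        "doc_vecs": doc_vecs,
        "word_vecs": word_vecs,
        "doc_som": train_som(doc_vecs, rows, cols, seed=0),
        "word_som": train_som(word_vecs, rows, cols, seed=1),
    }


def search(index, keywords):
    """Expand the query via the word map, then look up the document map."""
    vocab = index["vocab"]
    query = set(keywords)
    expanded = set(query)
    for members in cluster(index["word_som"], index["word_vecs"]).values():
        names = {vocab[i] for i in members}
        if names & query:            # add words sharing a neuron with a keyword
            expanded |= names
    q_vec = [1.0 if w in expanded else 0.0 for w in vocab]
    doc_clusters = cluster(index["doc_som"], index["doc_vecs"])
    hits = doc_clusters.get(bmu(index["doc_som"], q_vec), [])
    return sorted(expanded), hits


if __name__ == "__main__":
    # Toy English keyword sets stand in for a segmented Chinese corpus.
    docs = [{"som", "neuron", "map"}, {"som", "neuron", "map"},
            {"stock", "market"}, {"stock", "market", "price"}]
    print(search(build_maps(docs), ["som"]))
```

In this sketch, words that occur in exactly the same documents always share a neuron on the word map, so a query for one of them is expanded with the others before the document map is consulted. Which documents are returned depends on the trained weights, so results on a real corpus would need tuning of the map size and training schedule.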