In this work we develop a new automatic hypertext construction method based on a proposed text mining approach. The method applies the self-organizing map algorithm to cluster the flat text documents in a training corpus and generates two maps. We then use these maps to identify the sources and destinations of important hyperlinks within the training documents. The constructed hyperlinks are inserted into the training documents to convert them into hypertext form, and the converted documents form the new corpus. Incoming documents can be converted into hypertext form and added to the corpus through the same approach. We tested the method on a set of flat text documents collected from several newswire sites. Although we used only Chinese text documents, the approach can be applied to any document that can be transformed into a set of indexed terms.
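
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say what the two maps contain or how link sources and destinations are chosen, so the sketch assumes one map that labels documents and one that labels words, and a hypothetical linking rule in which a word becomes a link source and the documents sharing its map unit become destinations. All function names, the binary term vectors, the 2x2 grid, and the toy English tokens are likewise assumptions made for illustration.

```python
import math
import random


def build_vectors(docs):
    """Turn tokenized documents into binary term vectors over a shared vocabulary."""
    vocab = sorted({t for d in docs for t in d})
    index = {t: i for i, t in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for t in d:
            v[index[t]] = 1.0
        vectors.append(v)
    return vocab, vectors


def best_unit(weights, v):
    """Index of the map unit whose weight vector is closest to v (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], v)))


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols self-organizing map with a Gaussian neighborhood."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)   # neighborhood shrinks, floored at 0.5
        for v in vectors:
            br, bc = divmod(best_unit(weights, v), cols)
            for u, w in enumerate(weights):
                ur, uc = divmod(u, cols)
                d2 = (ur - br) ** 2 + (uc - bc) ** 2
                h = math.exp(-d2 / (2.0 * radius * radius))
                for k in range(dim):
                    w[k] += lr * h * (v[k] - w[k])
    return weights


def document_map(weights, vectors):
    """Assumed first map: label each document with its best-matching unit."""
    return [best_unit(weights, v) for v in vectors]


def word_map(weights, vocab):
    """Assumed second map: label each word with the unit that weights it most heavily."""
    return {t: max(range(len(weights)), key=lambda u: weights[u][k])
            for k, t in enumerate(vocab)}


def propose_links(docs, doc_units, word_units):
    """Hypothetical rule: word t in document i links to every other document j
    whose map unit equals the unit assigned to t."""
    links = []
    for i, d in enumerate(docs):
        for t in sorted(set(d)):
            for j, u in enumerate(doc_units):
                if j != i and u == word_units[t]:
                    links.append((i, t, j))
    return links


# Toy corpus of already-tokenized documents (illustrative only).
docs = [
    ["stock", "market", "bank"],
    ["bank", "market", "rate"],
    ["team", "match", "goal"],
    ["goal", "match", "league"],
]
vocab, vectors = build_vectors(docs)
weights = train_som(vectors, rows=2, cols=2)
doc_units = document_map(weights, vectors)
word_units = word_map(weights, vocab)
links = propose_links(docs, doc_units, word_units)
```

Each entry of `links` is a `(source document, anchor word, destination document)` triple. Under the same assumptions, an incoming document could be handled by building its term vector over the existing vocabulary and calling `best_unit` on the trained weights, without retraining the map.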