In this paper, we describe a method for semi-automatically extracting Topic Maps from a set of Web pages. We extend an existing clustering method in two ways. First, we merge only Web pages that are linked to each other, so that the underlying relationships among topics are extracted. Second, to generate dense clusters, we define a similarity measure that combines the contents of the Web pages, the types of the links between them, and the distance between the directories in which the pages are located. From the clustering result, we generate the topic map by treating the clusters as topics, the edges as associations, and the Web pages related to each topic as occurrences. We extracted a topic map experimentally and evaluated it.
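The pipeline described in the abstract might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the Jaccard content similarity, the link-type weights, the directory-proximity function, the equal weighting of the three terms, and the threshold are all hypothetical. The clustering is simplified to a single-link merge over a union-find structure, and every name (`extract_topic_map`, `LINK_TYPE_WEIGHT`, the example URLs) is invented for illustration.

```python
from posixpath import dirname
from urllib.parse import urlparse

# Hypothetical link-type weights; the abstract does not specify any values.
LINK_TYPE_WEIGHT = {"child": 1.0, "sibling": 0.6, "external": 0.2}


def jaccard(a, b):
    """Content similarity as token-set overlap (an assumed stand-in)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dir_proximity(url_a, url_b):
    """1 / (1 + directory steps between the pages); the host is ignored."""
    pa = [s for s in dirname(urlparse(url_a).path).split("/") if s]
    pb = [s for s in dirname(urlparse(url_b).path).split("/") if s]
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return 1.0 / (1 + (len(pa) - common) + (len(pb) - common))


def similarity(pages, a, b, link_type):
    """Unweighted mean of the three signals (assumed combination rule)."""
    return (jaccard(pages[a], pages[b])
            + LINK_TYPE_WEIGHT.get(link_type, 0.0)
            + dir_proximity(a, b)) / 3.0


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def extract_topic_map(pages, links, threshold=0.5):
    """Merge linked pages only, then read off topics and associations."""
    parent = {url: url for url in pages}
    scored = [(similarity(pages, a, b, t), a, b) for a, b, t in links]

    # Only pairs joined by a link are ever candidates for merging.
    for score, a, b in sorted(scored, reverse=True):
        if score >= threshold:
            parent[find(parent, b)] = find(parent, a)

    # Topics are clusters; the member pages are the topic's occurrences.
    members = {}
    for url in pages:
        members.setdefault(find(parent, url), []).append(url)
    topic_id = {root: min(urls) for root, urls in members.items()}
    topics = {topic_id[root]: sorted(urls) for root, urls in members.items()}

    # Links left between different clusters become associations.
    associations = set()
    for _, a, b in scored:
        ta, tb = topic_id[find(parent, a)], topic_id[find(parent, b)]
        if ta != tb:
            associations.add(tuple(sorted((ta, tb))))
    return {"topics": topics, "associations": associations}


# Toy input: page URL -> token set, and (source, target, link type) triples.
PAGES = {
    "http://example.org/jazz/index.html": {"jazz", "music", "history"},
    "http://example.org/jazz/miles.html": {"jazz", "music", "trumpet"},
    "http://example.org/rock/index.html": {"rock", "music", "guitar"},
}
LINKS = [
    ("http://example.org/jazz/index.html",
     "http://example.org/jazz/miles.html", "child"),
    ("http://example.org/jazz/index.html",
     "http://example.org/rock/index.html", "sibling"),
]

topic_map = extract_topic_map(PAGES, LINKS)
```

With this toy input, the two jazz pages share a directory, overlap in content, and are joined by a strongly weighted link, so they merge into one topic. The rock page stays a separate topic, and the weaker link to it is kept as an association between the two topics.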