Browsing web pages to find useful information on a website is tedious and time-consuming. Search engines are not always helpful because of the vocabulary mismatch between queries and web pages. Users may also have difficulty expressing their information needs accurately as queries in the early stage of exploration. A site map outlines the overall structure of a website. With a site map, users can identify the web page that satisfies their information needs without navigating through the website from the root page. However, site maps are not always available. In our previous work, we developed techniques to generate a website topic hierarchy. In this paper, we extend that work by extracting keyphrases to label the website topic hierarchy. The keyphrases summarize the content so that users can browse the site map efficiently and pinpoint the web page that provides the information they need. The proposed keyphrase extraction has three major components. The first component identifies candidate phrases. The second component computes feature scores for summarization; the features include thematic features and presentation features. The third component extracts the keyphrases by combining the feature scores. We have conducted an experiment and obtained promising results.
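The three components can be illustrated with a minimal Python sketch. It is not the method evaluated in the paper: the abstract does not name the individual features or the combination rule, so the stopword list, the single thematic feature (relative phrase frequency), the single presentation feature (whether the phrase occurs in the page title), and the equal weights are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; the abstract does not specify one.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "and", "for", "is", "are", "on", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def candidate_phrases(text, max_len=3):
    """Component 1 (assumed form): count n-grams of up to max_len words
    that neither start nor end with a stopword."""
    counts = Counter()
    # Split on punctuation so that phrases do not cross clause boundaries.
    for segment in re.split(r"[.,;:!?()\n]+", text.lower()):
        words = tokenize(segment)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return counts


def feature_scores(counts, title):
    """Component 2 (assumed features): a thematic score (frequency relative
    to the most frequent candidate) and a presentation score (1.0 if the
    phrase occurs in the title, else 0.0)."""
    max_count = max(counts.values(), default=1)
    title_norm = " ".join(tokenize(title))
    scores = {}
    for phrase, count in counts.items():
        thematic = count / max_count
        presentation = 1.0 if f" {phrase} " in f" {title_norm} " else 0.0
        scores[phrase] = (thematic, presentation)
    return scores


def extract_keyphrases(text, title, k=3, w_thematic=0.5, w_presentation=0.5):
    """Component 3 (assumed rule): rank candidates by a weighted sum of the
    two feature scores, breaking ties alphabetically, and return the top k."""
    feats = feature_scores(candidate_phrases(text), title)

    def combined(phrase):
        thematic, presentation = feats[phrase]
        return w_thematic * thematic + w_presentation * presentation

    return sorted(feats, key=lambda p: (-combined(p), p))[:k]
```

Calling `extract_keyphrases(page_text, page_title)` on the text and title of one page returns up to `k` candidate labels for the corresponding node of the topic hierarchy. The weighted sum is only one possible way to combine feature scores; a learned model could take its place if labeled keyphrases are available.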