A Semantic Text Portion (STP) is a portion of text in the original page that is semantically related to the anchor pointing to the target page. STPs may include facts and people's opinions about the target page. They can be used in higher-level applications such as automatic summarization and document categorization. In this paper, we concentrate on extracting STPs. We first survey STPs to determine where they occur in original pages and to identify the HTML tags that separate STPs from the other text in those pages. We then develop an extraction method based on the results of the survey. The experimental results show that our method achieves high performance.
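To make the idea of tag-delimited extraction concrete, the following is a minimal sketch and not the method of the paper. It assumes that a small set of block-level tags (`DELIMITER_TAGS`, chosen here purely for illustration, since the abstract does not list the tags found in the survey) splits a page into segments, and it returns the text of each segment that contains an anchor as a candidate STP for that anchor's target. The names `AnchorContextParser` and `extract_candidate_stps` are hypothetical.

```python
from html.parser import HTMLParser

# Assumed delimiter tags; the abstract does not state the survey's actual tag set.
DELIMITER_TAGS = {"p", "li", "td", "div", "br", "hr", "h1", "h2", "h3"}


class AnchorContextParser(HTMLParser):
    """Collects, for each anchor, the text of the delimiter-bounded segment around it."""

    def __init__(self):
        super().__init__()
        self.results = []   # (href, segment text) pairs
        self._text = []     # text pieces of the current segment
        self._hrefs = []    # hrefs of anchors seen in the current segment

    def _flush(self):
        # Close the current segment and normalize its whitespace.
        text = " ".join("".join(self._text).split())
        if text:
            for href in self._hrefs:
                self.results.append((href, text))
        self._text, self._hrefs = [], []

    def handle_starttag(self, tag, attrs):
        if tag in DELIMITER_TAGS:
            self._flush()
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._hrefs.append(href)

    def handle_endtag(self, tag):
        if tag in DELIMITER_TAGS:
            self._flush()

    def handle_data(self, data):
        self._text.append(data)

    def close(self):
        super().close()
        self._flush()  # emit any trailing segment


def extract_candidate_stps(html):
    """Return (href, surrounding text) pairs for every anchor in the page."""
    parser = AnchorContextParser()
    parser.feed(html)
    parser.close()
    return parser.results
```

For example, calling `extract_candidate_stps` on a page whose second paragraph contains a link would return that paragraph's text paired with the link's `href`, while paragraphs without anchors are ignored. A real extractor would also need the positional findings of the survey, which this sketch does not model.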