Automatic generation and use of negative terms to evaluate topic-related web pages

Authors:
Young-Tae Byun; Yong-Ho Choi; Kee-Cheol Lee
Affiliations:
Department of Computer Engineering, Hong-Ik University, Seoul, Korea; Cyber Terror Response Center, Korean National Police Agency, Seoul, Korea; Department of Computer Engineering, Hong-Ik University, Seoul, Korea
Venue:
HSI'05 Proceedings of the 3rd international conference on Human Society@Internet: Web and Communication Technologies and Internet-Related Social Issues
Year:
2005

Abstract

Deciding the relevance of Web pages to a query or a topic is very important in serving Web users, and similar decisions must be made when clustering and classifying Web pages. Most existing work uses positively related terms in one form or another. Once a topic is given or chosen as the focus, we suggest also using terms that are negative with respect to the topic when deciding relevance. This paper discusses a method for generating negative terms automatically using DMOZ, Google and WordNet, and gives formulas for deciding relevance with the negative terms. Experiments indicate that negative terms are useful for judging relevance to a topic. The approach also helps to address the polysemy problem. Because negative terms can be generated automatically for any topic, this work may help many studies aimed at improving services on the Web.
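
The abstract does not reproduce the paper's formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: a page's score rises with occurrences of positive topic terms and falls with occurrences of negative terms. The function name `relevance`, the `penalty` weight, the length normalization, and the example term sets are all hypothetical and are not taken from the paper; in the paper the negative terms are generated automatically from DMOZ, Google and WordNet, whereas here they are written by hand.

```python
import re
from collections import Counter


def relevance(text, positive_terms, negative_terms, penalty=1.0):
    """Hypothetical score: positive term hits minus penalized negative
    term hits, normalized by page length. Not the paper's formula."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    pos = sum(counts[t] for t in positive_terms)
    neg = sum(counts[t] for t in negative_terms)
    return (pos - penalty * neg) / len(tokens)


# Hand-written term sets for the topic "Jaguar cars" (assumed for illustration).
positive = {"jaguar", "engine", "sedan"}
negative = {"habitat", "prey", "rainforest"}

car_page = "The Jaguar sedan has a new engine"
cat_page = "The jaguar stalks prey in its rainforest habitat"

print(relevance(car_page, positive, negative))  # positive score
print(relevance(cat_page, positive, negative))  # negative score
```

Both pages contain the polysemous word "jaguar", so a score based on positive terms alone would rate each of them as at least somewhat relevant. In this sketch the negative terms push the animal page below zero, which is the kind of polysemy disambiguation the abstract describes.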