Applying information retrieval techniques to the World Wide Web (WWW) environment is a challenge, mostly because of its hypertext/hypermedia nature and the richness of the meta-information it provides. We present four keyword-based search and ranking algorithms for locating relevant WWW pages with respect to user queries. The first algorithm, Boolean Spreading Activation, extends the notion of word occurrence in the Boolean retrieval model by propagating the occurrence of a query word in a page to other pages linked to it. The second algorithm, Most-cited, uses the number of citing hyperlinks between potentially relevant WWW pages to increase the relevance scores of the referenced pages over the referencing pages. The third algorithm, TFxIDF vector space model, is based on word distribution statistics. The last algorithm, Vector Spreading Activation, combines TFxIDF with the spreading activation model. We conducted an experiment to evaluate the retrieval effectiveness of these algorithms. From the results of the experiment, we draw conclusions regarding the nature of the WWW environment with respect to document ranking strategies.
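As a rough illustration of the two link-based ideas described above, the following minimal sketch scores a toy collection of pages. It is not the authors' formulation: the abstract gives no formulas, so the scoring rules here are assumptions. In particular, the sketch assumes that a query word "occurs" in a page if it appears in the page itself or in any page directly linked to or from it, and that the Most-cited score is simply the count of incoming links from other pages that contain at least one query word. The function names, the `pages` and `links` data structures, and the toy data are all hypothetical.

```python
from collections import defaultdict


def build_neighbors(links):
    """Build an undirected adjacency map from (source, target) hyperlink pairs.

    Assumption: propagation ignores link direction.
    """
    adj = defaultdict(set)
    for src, dst in links:
        adj[src].add(dst)
        adj[dst].add(src)
    return adj


def boolean_spreading_activation(pages, links, query):
    """Score each page by the number of query words found in the page
    or in any directly linked page (assumed one-hop propagation)."""
    adj = build_neighbors(links)
    scores = {}
    for page, words in pages.items():
        reachable = set(words)
        for other in adj.get(page, ()):
            reachable |= pages.get(other, set())
        scores[page] = sum(1 for term in query if term in reachable)
    return scores


def most_cited(pages, links, query):
    """Score each potentially relevant page by the number of incoming links
    from other potentially relevant pages (assumed definition: a page is
    potentially relevant if it contains at least one query word)."""
    terms = set(query)
    candidates = {p for p, words in pages.items() if words & terms}
    scores = {p: 0 for p in candidates}
    for src, dst in set(links):
        if src != dst and src in candidates and dst in candidates:
            scores[dst] += 1  # the referenced page gains, the referencing page does not
    return scores


# Hypothetical toy collection: page id -> set of words, plus directed links.
pages = {
    "a": {"web", "search"},
    "b": {"ranking"},
    "c": {"web"},
}
links = [("a", "b"), ("c", "a")]
query = ["web", "ranking"]

bsa_scores = boolean_spreading_activation(pages, links, query)
cited_scores = most_cited(pages, links, query)
```

In this toy collection, page "b" contains only one of the two query words but reaches the same propagated score as page "a" because of the link between them, which is the effect the spreading activation idea aims for. Combining propagation with TFxIDF weights, as in Vector Spreading Activation, would replace the Boolean counts with weighted sums; that variant is not sketched here because the abstract does not specify the weighting.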