Find, new, copy, web, page - tagging for the (re-)discovery of web pages

Authors:
Martin Klein;Michael L. Nelson
Affiliations:
Old Dominion University, Department of Computer Science, Norfolk, VA;Old Dominion University, Department of Computer Science, Norfolk, VA
Venue:
TPDL'11 Proceedings of the 15th international conference on Theory and practice of digital libraries: research and advanced technology for digital libraries
Year:
2011

Abstract

The World Wide Web has a very dynamic character, with resources constantly disappearing and (re-)surfacing. A ubiquitous result is the "404 Page not Found" error returned in response to requests for missing web pages. We investigate tags obtained from Delicious for the purpose of rediscovering such missing web pages with the help of search engines. We determine the best-performing tag-based query length, quantify the relevance of the results, and compare tags to retrieval methods based on a page's content. We find that tags are only useful in addition to content-based methods. We further introduce the notion of "ghost tags": terms used as tags that do not occur in the current version of the web page but did occur in a previous version. One third of these ghost tags are ranked high in Delicious and also occurred frequently in the document, which indicates their importance to both the user and the content of the document.
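The definition of ghost tags can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `tokenize` and `ghost_tags`, the simple alphanumeric tokenizer, and the sample texts are all assumptions made for illustration. The abstract does not say how terms were extracted or matched.

```python
import re


def tokenize(text):
    """Lowercase a text and return its set of alphanumeric terms (assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ghost_tags(tags, current_text, previous_text):
    """Return the tags that are absent from the current version of a page
    but present in a previous version (hypothetical helper)."""
    current_terms = tokenize(current_text)
    previous_terms = tokenize(previous_text)
    return [
        tag for tag in tags
        if tag.lower() not in current_terms and tag.lower() in previous_terms
    ]


# Made-up example data, not taken from the paper.
tags = ["library", "archive", "preservation"]
previous_version = "A digital library for web archive research."
current_version = "A digital library for research."

print(ghost_tags(tags, current_version, previous_version))
```

In this example only "archive" qualifies. "library" still occurs in the current version, and "preservation" never occurred in either version.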