The World Wide Web is highly dynamic, with resources constantly disappearing and (re-)surfacing. A ubiquitous result is the "404 Page not Found" error returned in response to requests for missing web pages. We investigate tags obtained from Delicious for the purpose of rediscovering such missing web pages with the help of search engines. We determine the best-performing tag-based query length, quantify the relevance of the results, and compare tags to retrieval methods based on a page's content. We find that tags are only useful in addition to content-based methods. We further introduce the notion of "ghost tags": terms used as tags that do not occur in the current version of a web page but did occur in a previous version. One third of these ghost tags rank high in Delicious and also occurred frequently in the document, which indicates their importance to both the user and the content of the document.
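
The definition of ghost tags can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tokenize` and `ghost_tags`, the simple alphanumeric tokenization, and the restriction to single-word tags are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms.

    Assumption: a simple regex tokenizer stands in for whatever
    term extraction a real system would use.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ghost_tags(tags, current_text, previous_text):
    """Return the tags that are missing from the current version of a
    page but did occur in a previous version (hypothetical helper)."""
    current_terms = tokenize(current_text)
    previous_terms = tokenize(previous_text)
    return [
        tag for tag in tags
        if tag.lower() not in current_terms and tag.lower() in previous_terms
    ]


# Made-up example data, for illustration only.
tags = ["hypertext", "archive", "music"]
previous_version = "An archive of hypertext research"
current_version = "Hypertext research portal"
print(ghost_tags(tags, current_version, previous_version))
```

In this made-up example only "archive" qualifies: "hypertext" still occurs in the current version, and "music" never occurred in either version.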