Effective site finding using link anchor information
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
WWW '03 Proceedings of the 12th international conference on World Wide Web
Robust Hyperlinks Cost Just Five Words Each
Robust Hyperlinks Cost Just Five Words Each
Refinement of TF-IDF schemes for web pages using their hyperlinked neighboring pages
Proceedings of the fourteenth ACM conference on Hypertext and hypermedia
Sic transit gloria telae: towards an understanding of the web's decay
Proceedings of the 13th international conference on World Wide Web
Search engine coverage bias: evidence and possible causes
Information Processing and Management: an International Journal
Analysis of lexical signatures for improving information persistence on the World Wide Web
ACM Transactions on Information Systems (TOIS)
Accurately interpreting clickthrough data as implicit feedback
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Identifying "best bet" web search results by mining past user behavior
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Agreeing to disagree: search engines and their public interfaces
Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
Revisiting Lexical Signatures to (Re-)Discover Web Pages
ECDL '08 Proceedings of the 12th European conference on Research and Advanced Technology for Digital Libraries
Find, new, copy, web, page - tagging for the (re-)discovery of web pages
TPDL'11 Proceedings of the 15th international conference on Theory and practice of digital libraries: research and advanced technology for digital libraries
TPDL'12 Proceedings of the Second international conference on Theory and Practice of Digital Libraries
Reading the correct history?: modeling temporal intention in resource sharing
Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries
Hi-index | 0.00 |
For discovering the new URI of a missing web page, lexical signatures, which consist of a small number of words chosen to represent the "aboutness" of a page, have been previously proposed. However, prior methods relied on computing the lexical signature before the page was lost, or using cached or archived versions of the page to calculate a lexical signature. We demonstrate a system of constructing a lexical signature for a page from its link neighborhood, that is the "backlinks", or pages that link to the missing page. After testing various methods, we show that one can construct a lexical signature for a missing web page using only ten backlink pages. Further, we show that only the first level of backlinks are useful in this effort. The text that the backlinks use to point to the missing page is used as input for the creation of a four-word lexical signature. That lexical signature is shown to successfully find the target URI in more than half of the test cases.