In this work we compare techniques for automatically finding candidate web pages to replace broken links. We extract information from the anchor text, the content of the page containing the link, and the cached page in a digital library. The selected information is processed and submitted to a search engine. We compare information retrieval methods for two tasks: selecting the terms used to build the queries submitted to the search engine, and ranking the candidate pages it returns, so that the user can find the best replacement. For term selection we use term frequencies and a language model approach; for ranking the final results we use co-occurrence measures and a language model approach. To test the methods, we also define an evaluation methodology that does not require user judgments, which increases the objectivity of the results.
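The pipeline described above can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper: the stopword list, the function names (`build_query`, `dice`, `rank_candidates`), the choice of the Dice coefficient as the co-occurrence measure, and the `example.org` URLs are all hypothetical. The language model variants are not shown, and no search engine is called; candidate pages are passed in directly.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "on", "is", "at"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def build_query(anchor_text, context_text, k=3):
    """Anchor terms plus the k most frequent context terms not in the anchor."""
    anchor_terms = list(dict.fromkeys(tokenize(anchor_text)))  # dedupe, keep order
    counts = Counter(t for t in tokenize(context_text) if t not in anchor_terms)
    # Descending frequency, then alphabetical, so ties resolve deterministically.
    extra = sorted(counts, key=lambda t: (-counts[t], t))[:k]
    return anchor_terms + extra


def dice(terms_a, terms_b):
    """Dice coefficient between two term sets (one possible co-occurrence measure)."""
    a, b = set(terms_a), set(terms_b)
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def rank_candidates(query_terms, candidates):
    """Score each candidate page text against the query; best score first."""
    scored = [(url, dice(query_terms, tokenize(text))) for url, text in candidates.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    context = (
        "This page lists tutorial resources for learning programming. "
        "Programming guides and programming exercises for beginners."
    )
    query = build_query("Python tutorial", context, k=2)
    # Hypothetical candidates standing in for search engine results.
    candidates = {
        "http://example.org/a": "Python programming tutorial for beginners",
        "http://example.org/b": "Cooking recipes",
    }
    for url, score in rank_candidates(query, candidates):
        print(f"{score:.2f}  {url}")
```

In a fuller version, `build_query` would feed a real search engine and the ranking step would compare each returned page against the anchor text or the cached page; the set-overlap score here only stands in for whichever measure is chosen.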