An exploration of axiomatic approaches to information retrieval
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Similarity measures for tracking information flow
Proceedings of the 14th ACM international conference on Information and knowledge management
A web-based kernel function for measuring the similarity of short text snippets
Proceedings of the 15th international conference on World Wide Web
Sentence Similarity Based on Semantic Nets and Corpus Statistics
IEEE Transactions on Knowledge and Data Engineering
Improving similarity measures for short segments of text
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Efficient linear text segmentation based on information retrieval techniques
Proceedings of the International Conference on Management of Emergent Digital EcoSystems
A comparative study of two short text semantic similarity measures
KES-AMSTA'08 Proceedings of the 2nd KES International conference on Agent and multi-agent systems: technologies and applications
Linking archives using document enrichment and term selection
TPDL'11 Proceedings of the 15th international conference on Theory and practice of digital libraries: research and advanced technology for digital libraries
Discovering links between political debates and media
ICWE'13 Proceedings of the 13th international conference on Web Engineering
Hi-index | 0.00 |
Collaboratively created online encyclopedias have become increasingly popular. Especially in terms of completeness they have begun to surpass their printed counterparts. Two German publishers of traditional encyclopedias have reacted to this challenge and decided to merge their corpora to create a single more complete encyclopedia. The crucial step in this merge process is the alignment of articles. We have developed a system to identify corresponding entries from different encyclopedic corpora. The base of our system is the alignment algorithm which incorporates various techniques developed in the field of information retrieval. We have evaluated the system on four real-world encyclopedias with a ground truth provided by domain experts. A combination of weighting and ranking techniques has been found to deliver a satisfying performance.