This chapter describes a process for retrieving units (subdocuments) of relevant information from XML documents. The Extensible Markup Language (XML) is regarded as a new standard for data representation and exchange on the Web, and it opens opportunities to develop a new generation of Information Retrieval Systems (IRS) that improve the querying of document bases on the Web. Our work focuses on end-users who, like the majority of end-users, have no expertise in the domain. The approach supports keyword-based searching, as in classical IRS, and integrates structured searching through the notion of search attributes. It is based on a method for indexing the leaves of the document tree, which enables content-oriented retrieval. The retrieved subdocuments are ranked according to their similarity to the user's query. The similarity measure is a compromise between two measures: exhaustiveness and specificity.
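The ranking idea can be illustrated with a minimal sketch. The abstract does not define the two measures, so the definitions below are assumptions made for illustration only: exhaustiveness is taken as the share of query terms that a subdocument covers, specificity as the share of the subdocument's terms that belong to the query, and the compromise as a weighted sum controlled by a hypothetical parameter `alpha`. The function names and the sample document are likewise hypothetical and are not the chapter's actual method.

```python
import re
import xml.etree.ElementTree as ET


def tokenize(text):
    """Lowercase a string and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", (text or "").lower())


def subtree_terms(element):
    """Collect the terms found in the text of every leaf under `element`.

    Only leaves (elements without child elements) are indexed, so text
    mixed in between child elements is ignored in this sketch.
    """
    terms = []
    for node in element.iter():
        if len(node) == 0:
            terms.extend(tokenize(node.text))
    return terms


def score(element, query_terms, alpha=0.5):
    """Assumed compromise: alpha * exhaustiveness + (1 - alpha) * specificity."""
    terms = subtree_terms(element)
    if not terms or not query_terms:
        return 0.0
    query = set(query_terms)
    # Assumption: exhaustiveness = fraction of query terms the subtree covers.
    exhaustiveness = len(query & set(terms)) / len(query)
    # Assumption: specificity = fraction of subtree terms that are query terms.
    specificity = sum(1 for term in terms if term in query) / len(terms)
    return alpha * exhaustiveness + (1 - alpha) * specificity


def rank_subdocuments(xml_text, query, alpha=0.5):
    """Return (score, tag) pairs for matching elements, best match first."""
    root = ET.fromstring(xml_text)
    query_terms = tokenize(query)
    scored = [(score(el, query_terms, alpha), el.tag) for el in root.iter()]
    matches = [pair for pair in scored if pair[0] > 0]
    return sorted(matches, key=lambda pair: -pair[0])


# Hypothetical sample document, used only to exercise the sketch.
SAMPLE = """<article>
  <title>XML retrieval</title>
  <section>
    <para>keyword search over XML documents</para>
    <para>ranking by similarity</para>
  </section>
</article>"""

if __name__ == "__main__":
    for value, tag in rank_subdocuments(SAMPLE, "xml search"):
        print(f"{tag}: {value:.3f}")
```

Under these assumed definitions, a small element that covers every query term outranks its larger ancestors: the ancestors are equally exhaustive but less specific, because they also contain unrelated text. Raising `alpha` favors broader subdocuments, and lowering it favors more focused ones.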