This paper discusses engineering document fragment mark-up supported by the use of the eXtensible Stylesheet Language Formatting Objects (XSL-FO). XSL-FO can be used to convert the native format representation of such documents as Word, Excel and PDF into XML. Once in XML, document fragments can be retrieved at will in response to a search query. The paper models the process of document fragment retrieval, based on the authors’ decomposition scheme approach, and addresses the issue of converting documents into XML. Additionally, the use of document templates is discussed as a means of ensuring that the transformed XML documents are compliant with the decomposition schemes. Automating the reformatting of documents into XML and using templates make a document-fragment approach to retrieval more resource efficient to implement, and so make its adoption in industry more practicable.
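As a rough illustration of retrieving fragments from a document once it is in XML, the following Python sketch searches a small marked-up document for fragments that match a query. It is only a sketch under stated assumptions: the element names (`document`, `fragment`, `heading`, `body`), the attributes, the sample content and the `find_fragments` helper are all hypothetical and are not taken from the authors' decomposition schemes, and simple substring matching stands in for whatever search mechanism the paper uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical decomposed document. Element and attribute names are
# illustrative assumptions, not the authors' decomposition scheme.
SAMPLE_XML = """
<document title="Pump maintenance manual">
  <fragment id="f1" type="section">
    <heading>Safety precautions</heading>
    <body>Isolate the pump from the power supply before servicing.</body>
  </fragment>
  <fragment id="f2" type="section">
    <heading>Bearing replacement</heading>
    <body>Remove the end cover and withdraw the worn bearing.</body>
  </fragment>
  <fragment id="f3" type="table">
    <heading>Torque settings</heading>
    <body>End cover bolts: 25 Nm. Bearing housing bolts: 40 Nm.</body>
  </fragment>
</document>
"""


def find_fragments(xml_text, query):
    """Return (id, heading) for each fragment whose text contains the query.

    Matching is a case-insensitive substring test over all text inside
    the fragment, which is an assumed stand-in for a real search engine.
    """
    root = ET.fromstring(xml_text)
    needle = query.lower()
    hits = []
    for fragment in root.iter("fragment"):
        text = " ".join(fragment.itertext()).lower()
        if needle in text:
            hits.append((fragment.get("id"), fragment.findtext("heading")))
    return hits


if __name__ == "__main__":
    # Only the matching fragments are returned, not the whole document.
    for frag_id, heading in find_fragments(SAMPLE_XML, "bearing"):
        print(frag_id, heading)
```

Because each fragment is an addressable element, a query returns only the relevant parts of the document rather than the whole file, which is the behaviour the abstract describes.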