In recent years, the vast amount of digitally available content has led to the creation of many topic-centered digital libraries. In the domain of chemistry, too, more and more digital collections are available, but complex query formulation still hampers their intuitive adoption. This is because information seeking in chemical documents focuses on chemical entities, for which current standard search relies on complex structures that are hard to extract from documents. Moreover, although simple keyword searches would often be sufficient, current collections cannot be indexed by Web search providers because chemical substance names are ambiguous. In this paper we present a framework for automatically generating metadata-enriched index pages for all documents in a given chemical collection. All information is linked to the respective documents, providing an easy-to-crawl metadata repository that promises to open up digital chemical libraries. Our experiments on indexing an open access journal show not only that the documents can be found with a simple Google search via the automatically created index pages, but also that this search is considerably better than fulltext indexing in terms of both precision/recall and performance. Finally, we compare our indexing against a classical structure search and find that keyword-based search can indeed solve at least some of the daily tasks in chemical workflows. Our framework thus promises to expose a large part of the currently still hidden chemical Web, making the techniques employed interesting for chemical information providers such as digital libraries and open access journals.