Representing the semantics of unstructured scientific publications will certainly facilitate access and search and, we hope, lead to new discoveries. However, current digital libraries are usually limited to classic flat structured metadata, even for scientific publications that potentially contain rich semantic metadata. In addition, how to search scientific literature through linked semantic metadata is an open problem. We have developed a semantic digital library, oreChem ChemxSeer, that models chemistry papers with semantic metadata. It stores and indexes metadata extracted from a chemistry paper repository, ChemxSeer, using "compound objects". We use the Open Archives Initiative Object Reuse and Exchange (OAI-ORE) standard (http://www.openarchives.org/ore/) to define a compound object that aggregates metadata fields related to a digital object. Aggregated metadata can be managed and retrieved easily as one unit, which improves ease of use and has the potential to improve the semantic interpretation of shared data. We show how metadata can be extracted from documents and aggregated using OAI-ORE. ORE objects are created on demand; thus, we are able to search for a set of linked metadata with one query. We were also able to model new types of metadata easily. For example, chemists are especially interested in finding information related to experiments in documents. We show how paragraphs containing experiment information in chemistry papers can be extracted and tagged based on a chemistry ontology with 470 classes, and then represented in ORE along with other document-related metadata. Our algorithm uses a classifier whose features are words that are typically used only to describe experiments, such as "apparatus", "prepare", etc. Using a dataset of documents from the Royal Society of Chemistry digital library, we show that our proposed method performs well in extracting experiment-related paragraphs from chemistry documents.
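As a rough illustration of the pipeline described above, the following Python sketch flags experiment-related paragraphs by cue words and groups them into a single aggregation record. It is a minimal sketch under stated assumptions, not the method used in oreChem ChemxSeer: a simple cue-word threshold stands in for the trained classifier, the cue list beyond "apparatus" and "prepare" is hypothetical, and the function names, the threshold value, and the example URI are invented for illustration. Only the `ore:Aggregation` and `ore:aggregates` terms come from the OAI-ORE vocabulary.

```python
import re

# "apparatus" and "prepare" are named in the abstract; "prepared" is an
# assumed addition for illustration only.
EXPERIMENT_CUES = {"apparatus", "prepare", "prepared"}


def cue_fraction(paragraph):
    """Return the fraction of tokens in the paragraph that are cue words."""
    tokens = re.findall(r"[a-z]+", paragraph.lower())
    if not tokens:
        return 0.0
    return sum(token in EXPERIMENT_CUES for token in tokens) / len(tokens)


def is_experiment(paragraph, threshold=0.05):
    """Threshold rule standing in for a trained classifier (assumption)."""
    return cue_fraction(paragraph) >= threshold


def build_aggregation(doc_uri, title, paragraphs):
    """Group the flagged paragraphs of one document into one record.

    The dictionary layout is hypothetical; it only borrows the
    ore:Aggregation and ore:aggregates terms from OAI-ORE.
    """
    experiment_ids = [
        f"{doc_uri}#para-{index}"
        for index, paragraph in enumerate(paragraphs)
        if is_experiment(paragraph)
    ]
    return {
        "@id": f"{doc_uri}#aggregation",
        "@type": "ore:Aggregation",
        "dc:title": title,
        "ore:aggregates": experiment_ids,
    }
```

Calling `build_aggregation` with a document URI, a title, and a list of paragraph strings returns one dictionary per document, so the flagged paragraphs and the document-level metadata can be retrieved together as a unit. A real system would serialize this as an ORE resource map rather than a plain dictionary.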