As the amount of digitized content grows, digital libraries can now provide ambient information rather than simple query results. This paper presents a content integration method that extracts semi-structured information from textbooks and integrates relevant content based on the extracted relations. We apply the method to a specific domain, traditional Chinese medicine (TCM), and run it on an operational public digital library, the China America Digital Academic Library (CADAL). Extraction focuses on core TCM entities (herbal medicines, prescriptions, medical masters, etc.) and their detailed properties. We then establish basic relations between entities using the domain ontology and resource metadata. In addition, latent relations, which describe semantic correlations, are discovered through an entity parameterization procedure. Finally, content is integrated according to the discovered relations. We present a system that demonstrates the encouraging improvement and the resulting library experience. Our method is a practical exploration of how to integrate digital resources and improve library services in a feasible way.