Health-related information structuring for the semantic web

Authors:
Mohammad Ali H. Eljinini
Affiliations:
Isra University Amman, Jordan
Venue:
Proceedings of the 2011 International Conference on Intelligent Semantic Web-Services and Applications
Year:
2011

Abstract

The World Wide Web has become an important medium for disseminating information on a wide range of topics, and a growing share of human knowledge is rapidly becoming available online. In the medical domain, the number of healthcare-related documents is already large and continues to grow at an exponential rate. Most information on the web is buried inside HTML documents, which are designed for human consumption. Automatically restructuring this information into machine-understandable form and making it available to web search agents would help bring the web to its full potential. In this work, we downloaded and carefully analyzed a set of 100 diabetes-related websites comprising over 12,000 HTML files. Our first aim is to learn the general structure of these websites, which would increase the efficiency of information extraction and structuring. Every website has a purpose, mainly to provide services or products (or both). Our study resulted in an ontology covering the general services and products that these websites offer. The main goal of this ontology is to guide the process of extracting and structuring information. We incorporated the Unified Medical Language System (UMLS) Semantic Network, which serves as an upper-level ontology for medicine. We used the MetaMap Transfer (MMTx) API, developed by the US National Library of Medicine (NLM), to map text to concepts from the UMLS Semantic Network. Pinpointing concepts in web pages provides an efficient way to determine the attributes and therefore facilitates more efficient extraction and restructuring of information. This paper describes the first part of our work and findings.
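
As an illustration of the concept-pinpointing step described in the abstract, the following is a minimal sketch in Python. It does not use MMTx or any UMLS resource: the lexicon `TOY_LEXICON`, the helper names `html_to_text` and `tag_concepts`, and the phrase-to-type assignments are all hypothetical stand-ins chosen for this example. The semantic type labels follow UMLS naming, but a real system would obtain them from the concept mapper rather than from a hand-written dictionary.

```python
import re
from html.parser import HTMLParser

# Hypothetical toy lexicon: phrase -> UMLS-style semantic type label.
# This is an assumption for illustration; it stands in for a real concept mapper.
TOY_LEXICON = {
    "diabetes": "Disease or Syndrome",
    "insulin": "Pharmacologic Substance",
    "blood glucose": "Laboratory or Test Result",
    "glucose meter": "Medical Device",
}


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the content of script and style tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Strip markup and collapse whitespace into single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def tag_concepts(text, lexicon=TOY_LEXICON):
    """Return (phrase, semantic_type, start_offset) tuples in text order.

    Longer phrases are matched first so that "blood glucose" is not
    split into smaller overlapping matches.
    """
    lowered = text.lower()
    taken = [False] * len(lowered)
    found = []
    for phrase in sorted(lexicon, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, lowered):
            span = range(match.start(), match.end())
            if not any(taken[i] for i in span):
                for i in span:
                    taken[i] = True
                found.append((phrase, lexicon[phrase], match.start()))
    return sorted(found, key=lambda item: item[2])


if __name__ == "__main__":
    page = (
        "<html><head><style>p{}</style></head><body>"
        "<p>Our glucose meter tracks blood glucose for people with diabetes.</p>"
        "</body></html>"
    )
    for phrase, semantic_type, offset in tag_concepts(html_to_text(page)):
        print(offset, phrase, semantic_type)
```

Under these assumptions, the semantic type attached to each matched phrase could serve as a hint about which ontology attribute a passage describes, for example a product versus a condition.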