In this paper, our goal is to mine biomedical data from hypertext documents, that is, to mine web content using data mining algorithms with the help of a biomedical ontology. We collect a number of documents using Google, preprocess the hypertext documents, and extract the text data. The next step is to identify the biomedical data. To decide whether a word is a biomedical entity, we use a biomedical database, the UMLS Metathesaurus; words are mapped to Metathesaurus entities by keyword query. The more often biomedical entities occur in a page, the more relevant we consider the page to be, so we can re-rank the documents to find the most important ones. We then test and analyse the performance of seven of the most popular classification algorithms by training them separately on the documents as ranked by Google and as ranked by our algorithm.
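The re-ranking step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `BIOMEDICAL_TERMS` set is a hypothetical stand-in for a keyword query against the UMLS Metathesaurus, the function names are invented for illustration, and the score is assumed to be a plain count of matching single-word tokens.

```python
import re
from collections import Counter

# Hypothetical stand-in for a UMLS Metathesaurus keyword lookup;
# a real system would query the Metathesaurus instead of a fixed set.
BIOMEDICAL_TERMS = {"insulin", "glucose", "diabetes", "protein"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def biomedical_score(text, terms=BIOMEDICAL_TERMS):
    """Count how many tokens in the text are known biomedical terms."""
    counts = Counter(tokenize(text))
    return sum(counts[term] for term in terms)


def rerank(documents, terms=BIOMEDICAL_TERMS):
    """Order documents by descending biomedical-term count.

    sorted() is stable, so documents with equal scores keep their
    original (e.g. search-engine) order.
    """
    return sorted(documents, key=lambda d: biomedical_score(d, terms), reverse=True)


if __name__ == "__main__":
    docs = [
        "Stock markets fell today.",
        "Insulin regulates glucose; insulin therapy treats diabetes.",
        "Glucose is a sugar.",
    ]
    for doc in rerank(docs):
        print(biomedical_score(doc), doc)
```

Multi-word entities and synonym mapping, which a Metathesaurus lookup would normally handle, are left out of this sketch to keep it short.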