The choice of features used to represent a domain has a profound effect on the quality of the model produced, yet few researchers have investigated the relationship between the features used to represent text and the quality of the final model. We explored this relationship for medical texts by comparing association rules based on features at three different semantic levels: (1) words, (2) manually assigned keywords, and (3) automatically selected medical concepts. Our preliminary findings indicate that bi-directional association rules based on concepts or keywords are more plausible and more useful than those based on word features. The concept and keyword representations also required 90% fewer features than the word representation. This drastic dimensionality reduction suggests that the approach is well suited to large corpora of medical text, such as parts of the Web.
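To make the idea of bi-directional association rules concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each document is a set of features (words, keywords, or concepts) and takes "bi-directional" to mean that both A -> B and B -> A meet a confidence threshold; that reading, the function name `bidirectional_rules`, the thresholds, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def bidirectional_rules(docs, min_support=0.5, min_confidence=0.8):
    """Return (a, b, support, conf_ab, conf_ba) for feature pairs where
    both a -> b and b -> a meet the thresholds (assumed definition)."""
    n = len(docs)
    item_counts = Counter()
    pair_counts = Counter()
    for features in docs:
        unique = sorted(set(features))
        item_counts.update(unique)                   # documents containing each feature
        pair_counts.update(combinations(unique, 2))  # documents containing each pair

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        conf_ab = count / item_counts[a]  # estimate of P(b | a)
        conf_ba = count / item_counts[b]  # estimate of P(a | b)
        if support >= min_support and min(conf_ab, conf_ba) >= min_confidence:
            rules.append((a, b, support, conf_ab, conf_ba))
    return sorted(rules)


# Hypothetical toy corpus: each document reduced to a set of concept features.
docs = [
    {"aspirin", "headache"},
    {"aspirin", "headache", "fever"},
    {"aspirin", "headache"},
    {"fever", "infection"},
]
rules = bidirectional_rules(docs)
print(rules)
```

The same function can be run on word-level and concept-level representations of one corpus; comparing the number of distinct features and the rules that survive in each case mirrors, in miniature, the kind of comparison described above.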