Concept frequency distribution in biomedical text summarization

Authors:
Lawrence H. Reeve;Hyoil Han;Saya V. Nagori;Jonathan C. Yang;Tamara A. Schwimmer;Ari D. Brooks
Affiliations:
Drexel University, Philadelphia, PA;Drexel University, Philadelphia, PA;Drexel University, Philadelphia, PA;Drexel University, Philadelphia, PA;Drexel University, Philadelphia, PA;Drexel University, Philadelphia, PA
Venue:
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Year:
2006

Abstract

Text summarization is a data reduction process. It enables users to reduce the amount of text that must be read while still assimilating the core information. The data reduction offered by text summarization is particularly useful in the biomedical domain, where physicians must continuously find clinical trial study information to incorporate into their patient treatment efforts. Such efforts are often hampered by the high volume of publications. Our contribution is two-fold: 1) to propose the frequency of domain concepts as a method to identify important sentences within a full text; and 2) to propose a novel frequency distribution model and algorithm for identifying important sentences based on term or concept frequency distribution. An evaluation of several existing summarization systems using biomedical texts is presented in order to determine a performance baseline. For domain concept comparison, a recent high-performing frequency-based algorithm using terms is adapted to use concepts and evaluated using both terms and concepts. It is shown that the use of concepts performs comparably to the use of terms for sentence selection. Our proposed frequency distribution model and algorithm outperform a state-of-the-art approach.
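
The general idea of frequency-based sentence selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' frequency distribution algorithm, whose details the abstract does not give; it is a generic scorer that ranks each sentence by the average corpus frequency of its units. The names `tokenize`, `summarize`, and the `units` parameter are assumptions made for illustration. Passing a different `units` function (for example, a hypothetical concept annotator that maps a sentence to domain concept identifiers) would switch the scorer from terms to concepts.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Split a sentence into lowercase alphanumeric terms (illustrative only)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def summarize(sentences, n=2, units=tokenize):
    """Pick the n sentences whose units are, on average, most frequent.

    `units` maps a sentence to a list of terms or concepts; it is a
    hypothetical hook, not part of any published system.
    """
    unit_lists = [units(s) for s in sentences]
    # Frequency of every unit across the whole text.
    counts = Counter(u for ul in unit_lists for u in ul)
    total = sum(counts.values())

    def score(ul):
        # Average relative frequency of the units in one sentence.
        if not ul or total == 0:
            return 0.0
        return sum(counts[u] for u in ul) / (len(ul) * total)

    # Rank sentence indices by score; ties keep their original order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(unit_lists[i]),
                    reverse=True)
    # Return the chosen sentences in document order.
    return [sentences[i] for i in sorted(ranked[:n])]


text = [
    "Tamoxifen reduces breast cancer recurrence.",
    "The weather was pleasant.",
    "Breast cancer patients received tamoxifen.",
]
print(summarize(text, n=2))
```

In this toy input the two sentences that share the repeated units "tamoxifen", "breast", and "cancer" outrank the unrelated one. Averaging over sentence length is a design choice made here to avoid favoring long sentences; other normalizations are equally plausible.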