Data representation and algorithms for biomedical informatics applications

Authors:
Lucila Ohno-Machado;Griffin M. Weber
Affiliations:
Harvard University;Harvard University
Venue:
Data representation and algorithms for biomedical informatics applications
Year:
2005

Citing 0
Cited 1

Representation in stochastic search for phylogenetic tree reconstruction

Journal of Biomedical Informatics - Special issue: Phylogenetic inferencing: Beyond biology


Abstract

Biomedical informatics is an emerging field at the intersection of computer science, biology, and medicine. The types of problems encountered in biomedical informatics are not new to computer science. These include representation of data, natural language processing, cluster analysis, and algorithm optimization. However, recently they have become far more important to the life sciences. This is a direct result of major technological advances in the biological sciences and medical research. New DNA sequencing tools can decode the entire genome of an organism, generating gigabytes of data that must be analyzed. Web portals provide physicians a means to communicate with each other, but the algorithms that push content to users must be fast and flexible enough to support the many roles a doctor might have. Digital libraries give scientists instant online access to thousands of journals, but advanced search techniques are required or else users can be swamped with information. This thesis explores five challenging computer science problems that all have applications in biomedical informatics: (1) performing efficient joins on set-valued data fields, (2) automatically assigning summary keywords to text-based documents, (3) using natural language processing to generate human-like dialog, (4) quickly finding the Hamming distance between sequences, and (5) improving stochastic hill-climbing algorithms for phylogenetic tree reconstruction by representing the data as an ordering of taxa.
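
As a point of reference for problem (4), the Hamming distance between two equal-length sequences is the number of positions at which they differ. The sketch below shows only that baseline definition as a linear scan. It is not the faster method the thesis develops, which the abstract does not describe. The function name `hamming_distance` and the sample DNA strings are illustrative assumptions.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ.

    Baseline O(n) scan, shown only to illustrate the definition;
    the name and signature are hypothetical, not taken from the thesis.
    """
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    # Compare the sequences position by position and count mismatches.
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # The two sequences differ at positions 2 and 5 (0-indexed).
    print(hamming_distance("GATTACA", "GACTATA"))  # prints 2
```

A scan like this touches every position of both sequences, so comparing many long sequences pairwise becomes expensive, which is presumably why the abstract lists speed as the challenge.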