Integrated term weighting, visualization, and user interface development for bioinformation retrieval

Authors:
Min Hong;Anis Karimpour-Fard;Steve Russell;Lawrence Hunter
Affiliations:
Bioinformatics, University of Colorado Health Sciences Center, Denver, CO;Bioinformatics, University of Colorado Health Sciences Center, Denver, CO;Bioinformatics, University of Colorado Health Sciences Center, Denver, CO;Bioinformatics, University of Colorado Health Sciences Center, Denver, CO
Venue:
AIS'04 Proceedings of the 13th international conference on AI, Simulation, and Planning in High Autonomy Systems
Year:
2004

Abstract

This project implements an integrated biological information website that classifies technical documents, learns about users' interests, and offers intuitive interactive visualization to navigate vast information spaces. The effective use of modern software engineering principles, system environments, and development approaches is demonstrated. Straightforward yet powerful document characterization strategies are illustrated, helpful visualization for effective knowledge transfer is shown, and current user interface methodologies are applied. A specific success of note is the collaboration of disparately skilled specialists to rapidly deliver a flexible integrated prototype that meets user acceptance and performance goals. The domain chosen for the demonstration is breast cancer, using a corpus of abstracts from publications obtained online from Medline. The terms in the abstracts are extracted using word stemming and a stop list, and are encoded in vectors. A TF-IDF technique is implemented to calculate similarity scores between a set of documents and a query. Polysemy and synonymy are explicitly addressed. Groups of related and useful documents are identified using interactive visual displays such as a spiral graph that represents the overall similarity of documents. K-means clustering of the similarities among a document set is used to display a 3-D relationship map. User identities are established and updated by observing the patterns of terms used in their queries, and from login site locations. Explicit considerations of changing user category profiles, site stakeholders, information modeling, and networked technologies are pointed out.
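
The term-vector and TF-IDF scoring steps mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact weighting formula, so the sketch assumes raw term frequency multiplied by log(N/df) and cosine similarity for the query-document score. The stop list, the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank`), and the sample texts in the usage note are all hypothetical, and word stemming is omitted for brevity.

```python
import math
from collections import Counter

# Illustrative stop list (an assumption; the paper's actual list is not given).
STOP_WORDS = {"the", "of", "in", "and", "a", "is", "for", "to"}


def tokenize(text):
    """Lowercase, strip surrounding punctuation, and drop stop words.

    Stemming, which the abstract mentions, is intentionally left out here.
    """
    tokens = (t.strip(".,;:()").lower() for t in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return (vectors, idf); each vector maps term -> tf * idf."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: count * idf[term] for term, count in Counter(toks).items()}
        for toks in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Score every document against the query; highest score first."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the corpus get weight 0 and do not contribute.
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    scores = [(cosine(q, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, reverse=True)
```

Calling `rank("BRCA1 mutation", docs)` on a small list of abstract-like strings returns `(score, document_index)` pairs in descending order of similarity. Documents that share no weighted term with the query score 0.0.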