Analysis of mammography reports using maximum variation sampling

Authors:
Robert M. Patton;Barbara Beckerman;Thomas E. Potok
Affiliations:
Oak Ridge National Laboratory, Oak Ridge, TN, USA;Oak Ridge National Laboratory, Oak Ridge, TN, USA;Oak Ridge National Laboratory, Oak Ridge, TN, USA
Venue:
Proceedings of the 10th annual conference companion on Genetic and evolutionary computation
Year:
2008


Abstract

A genetic algorithm (GA) was developed to implement a maximum variation sampling technique that derives a subset of data from a large dataset of unstructured mammography reports. Genetic algorithms are known to perform well on large search spaces and to scale readily with the size of the data set. In mammography, much effort has been expended to characterize findings in radiology reports. Existing computer-assisted technologies for mammography are based on machine-learning algorithms that must learn from a training set with known pathologies so that the algorithms can be further refined and their validity improved. In a large database of reports and corresponding images, automated tools are needed just to determine which data to include in the training set. This work presents preliminary results on using a GA to find abnormal reports without a training set. The underlying premise is that abnormal reports should contain unusual or rare words, which makes them very dissimilar to other reports. A genetic algorithm was developed to test this hypothesis, and preliminary results show that most abnormal reports in a test set are found and can be adequately differentiated.
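
The idea of maximum variation sampling with a GA can be illustrated with a minimal sketch. The abstract does not state the document representation, the fitness function, or the genetic operators, so everything below is an assumption made for illustration: reports are weighted with plain TF-IDF, dissimilarity is one minus cosine similarity, fitness is the sum of pairwise dissimilarities within a candidate sample, and the function names (`tf_idf_vectors`, `dissimilarity`, `fitness`, `max_variation_sample`) are hypothetical. It is not the authors' implementation.

```python
import math
import random
from collections import Counter
from itertools import combinations


def tf_idf_vectors(docs):
    """Turn each document into a sparse {term: weight} TF-IDF vector (assumed weighting)."""
    tokenized = [doc.lower().split() for doc in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Rare terms get a larger weight; a term present in every document gets 0.
        vectors.append({t: c * math.log(n / doc_freq[t]) for t, c in counts.items()})
    return vectors


def dissimilarity(a, b):
    """1 - cosine similarity of two sparse vectors (1.0 if either vector is all-zero)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def fitness(sample, dist):
    """Sum of pairwise dissimilarities among the sampled documents."""
    return sum(dist[i][j] for i, j in combinations(sample, 2))


def max_variation_sample(docs, sample_size, generations=200, pop_size=30,
                         mutation_rate=0.2, seed=0):
    """Evolve a set of `sample_size` document indices with high total dissimilarity."""
    rng = random.Random(seed)
    vectors = tf_idf_vectors(docs)
    n = len(docs)
    dist = [[dissimilarity(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    # Each individual is a list of distinct document indices.
    population = [rng.sample(range(n), sample_size) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda s: fitness(s, dist), reverse=True)
        survivors = population[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            pool = sorted(set(p1) | set(p2))
            child = rng.sample(pool, sample_size)  # crossover: draw from both parents
            if rng.random() < mutation_rate:
                # Mutation: swap one member for a document not yet in the sample.
                outside = [i for i in range(n) if i not in child]
                if outside:
                    child[rng.randrange(sample_size)] = rng.choice(outside)
            children.append(child)
        population = survivors + children
    best = max(population, key=lambda s: fitness(s, dist))
    return sorted(best)
```

Under these assumptions, reports that use rare vocabulary are far from the bulk of routine reports and therefore tend to be pulled into the high-fitness sample, which mirrors the premise stated in the abstract. For a real corpus the full pairwise matrix would be too large to precompute, so distances would have to be computed on demand or over a subsample.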