Information extraction for enhanced access to disease outbreak reports

Authors:
Ralph Grishman;Silja Huttunen;Roman Yangarber
Affiliations:
Computer Science Department, Courant Institute of Mathematical Sciences, New York University, New York, NY;Computer Science Department, Courant Institute of Mathematical Sciences, New York University, New York, NY;Computer Science Department, Courant Institute of Mathematical Sciences, New York University, New York, NY
Venue:
Journal of Biomedical Informatics - Special issue: Sublanguage
Year:
2002

Abstract

Document search is generally based on individual terms in the document. However, for collections within limited domains it is possible to provide more powerful access tools. This paper describes a system designed for collections of reports of infectious disease outbreaks. The system, Proteus-BIO, automatically creates a table of outbreaks, with each table entry linked to the document describing that outbreak; this makes it possible to use database operations such as selection and sorting to find relevant documents. Proteus-BIO consists of a Web crawler which gathers relevant documents; an information extraction engine which converts the individual outbreak events to a tabular database; and a database browser which provides access to the events and, through them, to the documents. The information extraction engine uses sets of patterns and word classes to extract the information about each event. Preparing these patterns and word classes has been a time-consuming manual operation in the past, but automated discovery tools now make this task significantly easier. A small study comparing the effectiveness of the tabular index with conventional Web search tools demonstrated that users can find substantially more documents in a given time period with Proteus-BIO.
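
The pipeline in the abstract (patterns and word classes turn text into a table of outbreaks, which is then queried by selection and sorting) can be illustrated with a minimal sketch. This is not the Proteus-BIO implementation: the regular expression stands in for its pattern sets, and the word class `DISEASES`, the `Outbreak` record, the `extract` function, and the sample documents are all hypothetical names and data invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical word class: disease names that the pattern may match.
DISEASES = ["cholera", "ebola", "measles"]

# Hypothetical pattern: "<disease> has killed|infected|sickened <N> people in <Location>".
# Only the disease name is matched case-insensitively, so the location
# group still requires capitalized words.
PATTERN = re.compile(
    r"(?P<disease>(?i:" + "|".join(re.escape(d) for d in DISEASES) + r"))"
    r"\s+has\s+(?P<verb>killed|infected|sickened)\s+"
    r"(?P<count>\d+)\s+people\s+in\s+"
    r"(?P<location>[A-Z][a-z]+(?: [A-Z][a-z]+)*)"
)


@dataclass
class Outbreak:
    disease: str
    location: str
    victims: int
    status: str  # "dead" or "sick"
    doc_id: str  # link back to the document describing the outbreak


def extract(docs: Dict[str, str]) -> List[Outbreak]:
    """Build one table row per pattern match, keeping the source document id."""
    rows = []
    for doc_id, text in docs.items():
        for m in PATTERN.finditer(text):
            rows.append(
                Outbreak(
                    disease=m.group("disease").lower(),
                    location=m.group("location"),
                    victims=int(m.group("count")),
                    status="dead" if m.group("verb") == "killed" else "sick",
                    doc_id=doc_id,
                )
            )
    return rows


# Invented sample documents, standing in for crawled reports.
docs = {
    "doc1": "Officials said Cholera has killed 12 people in Sierra Leone this month.",
    "doc2": "Ebola has infected 40 people in Uganda. Measles has sickened 7 people in Peru.",
    "doc3": "The ministry announced a new vaccination budget.",
}

table = extract(docs)

# Selection: only outbreaks with reported deaths.
deaths = [row for row in table if row.status == "dead"]

# Sorting: largest outbreaks first; each row leads back to its document.
by_size = sorted(table, key=lambda row: row.victims, reverse=True)

for row in by_size:
    print(row.doc_id, row.disease, row.location, row.victims, row.status)
```

Because each row keeps its `doc_id`, a query over the table doubles as a document search, which is the access path the abstract describes. A real system would need many patterns and much larger word classes than this single expression.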