This paper presents new Information Extraction scenarios that are linguistically and structurally more challenging than the traditional MUC scenarios. Traditional views on event structure and template design are not adequate for these more complex scenarios. The focus of this paper is to show the complexity of the scenarios and to propose a way to recover the structure of the event. First, we identify two structural factors that contribute to the complexity of scenarios: the scattering of events in text, and inclusion relationships between events. These factors make it difficult to represent the facts unambiguously. Then we propose a modular, hierarchical representation in which the information is split into atomic units represented by templates, and the inclusion relationships between the units are indicated by links. Lastly, we discuss how this representation may be recovered from text with the help of linguistic cues linking the events.
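
The modular, hierarchical representation described above can be illustrated with a minimal sketch. It is not the paper's implementation: the class names (`Template`, `EventStore`), the slot names, and the example values are all hypothetical and chosen only for illustration. Each template holds one atomic unit of information, and an optional `parent_id` plays the role of the inclusion link to the enclosing event.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Template:
    """One atomic unit of information (hypothetical structure)."""
    template_id: str
    slots: Dict[str, str]
    # Inclusion link: id of the event that contains this one, if any.
    parent_id: Optional[str] = None


@dataclass
class EventStore:
    """Holds templates and answers questions about inclusion links."""
    templates: Dict[str, Template] = field(default_factory=dict)

    def add(self, template: Template) -> None:
        self.templates[template.template_id] = template

    def children(self, template_id: str) -> List[Template]:
        """Templates directly included in the given event."""
        return [t for t in self.templates.values()
                if t.parent_id == template_id]

    def ancestors(self, template_id: str) -> List[str]:
        """Ids of enclosing events, from nearest to outermost."""
        chain: List[str] = []
        current = self.templates[template_id].parent_id
        while current is not None:
            chain.append(current)
            current = self.templates[current].parent_id
        return chain


# Illustrative values only: three nested events, each stored separately.
store = EventStore()
store.add(Template("e1", {"location": "country", "count": "100"}))
store.add(Template("e2", {"location": "city", "count": "40"}, parent_id="e1"))
store.add(Template("e3", {"location": "district", "count": "10"}, parent_id="e2"))

child_ids = [t.template_id for t in store.children("e1")]
enclosing = store.ancestors("e3")
```

Because each fact lives in its own template and the nesting is carried only by the links, a count reported for an inner event is never merged into, or confused with, the count of the event that includes it.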