Information extraction, data mining and joint inference

Authors:
Andrew McCallum
Affiliations:
University of Massachusetts, Amherst, MA
Venue:
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2006

Abstract

Although information extraction and data mining appear together in many applications, their interface in most current systems would better be described as serial juxtaposition than as tight integration. Information extraction populates slots in a database by identifying relevant subsequences of text, but is usually not aware of the emerging patterns and regularities in the database. Data mining methods begin from a populated database, and are often unaware of where the data came from, or its inherent uncertainties. The result is that the accuracy of both suffers, and accurate mining of complex text sources has been beyond reach.

In this talk I will describe work in probabilistic models that perform joint inference across multiple components of an information processing pipeline in order to avoid the brittle accumulation of errors. After briefly introducing conditional random fields, I will describe recent work in information extraction leveraging factorial state representations, entity resolution, and transfer learning, as well as scalable methods of inference and learning. I'll close with some recent work on probabilistic models for social network analysis, and a demonstration of Rexa.info, a new research paper search engine.