Information extraction techniques automatically create structured databases from unstructured data sources, such as the Web or newswire documents. Despite the successes of these systems, accuracy will always be imperfect. For many reasons, it is highly desirable to accurately estimate the confidence the system has in the correctness of each extracted field. The information extraction system we evaluate is based on a linear-chain conditional random field (CRF), a probabilistic model which has performed well on information extraction tasks because of its ability to capture arbitrary, overlapping features of the input in a Markov model. We implement several techniques to estimate the confidence of both extracted fields and entire multi-field records, obtaining an average precision of 98% for retrieving correct fields and 87% for multi-field records.
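The abstract does not say which confidence estimators were implemented, so the following is only an illustrative sketch of one way a linear-chain CRF can assign a confidence to an extracted field: run the forward recursion twice, once unconstrained and once restricted to label paths that agree with the field, and take the ratio of the two partition values. The function names (`forward_log_z`, `field_confidence`), the toy potentials, and the label encoding are all assumptions made for this example, not part of the evaluated system.

```python
import math

NEG_INF = float("-inf")


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_z(emit, trans, constraints=None):
    """Log partition value of a linear-chain model via the forward recursion.

    emit[t][y]   : log-potential of label y at position t (hypothetical values)
    trans[p][y]  : log-potential of moving from label p to label y
    constraints  : optional {position: label}; paths violating it get zero mass
    """
    constraints = constraints or {}
    n_labels = len(trans)

    def allowed(t, y):
        return t not in constraints or constraints[t] == y

    alpha = [emit[0][y] if allowed(0, y) else NEG_INF for y in range(n_labels)]
    for t in range(1, len(emit)):
        alpha = [
            emit[t][y] + logsumexp([alpha[p] + trans[p][y] for p in range(n_labels)])
            if allowed(t, y)
            else NEG_INF
            for y in range(n_labels)
        ]
    return logsumexp(alpha)


def field_confidence(emit, trans, constraints):
    """Probability mass of all label paths consistent with the given field labels."""
    constrained = forward_log_z(emit, trans, constraints)
    unconstrained = forward_log_z(emit, trans)
    return math.exp(constrained - unconstrained)


# Toy usage: 3 tokens, 2 labels (0 = OTHER, 1 = FIELD); the numbers are made up.
emit = [[0.5, 1.0], [2.0, 0.1], [0.3, 0.7]]
trans = [[0.2, -0.4], [-1.0, 0.6]]
# Confidence that token 0 is FIELD and token 1 is OTHER, whatever token 2 is.
confidence = field_confidence(emit, trans, {0: 1, 1: 0})
print(f"field confidence: {confidence:.3f}")
```

Under this sketch, a multi-field record could be scored by constraining every position covered by its fields at once; ranking fields or records by such scores is what an average-precision evaluation would then measure.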