Combining labeled and unlabeled data with co-training
COLT '98 Proceedings of the eleventh annual conference on Computational learning theory
An Algorithm that Learns What's in a Name
Machine Learning - Special issue on natural language learning
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
A maximum entropy approach to named entity recognition
Unsupervised word sense disambiguation rivaling supervised methods
ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
A bootstrapping approach to named entity classification using successive learners
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
A comparison of algorithms for maximum entropy parameter estimation
COLING-02 Proceedings of the 6th conference on Natural language learning - Volume 20
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Tagging of name records for genealogical data browsing
Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries
Syntax-based semi-supervised named entity tagging
ACLdemo '05 Proceedings of the ACL 2005 on Interactive poster and demonstration sessions
Named entity recognition in Vietnamese using classifier voting
ACM Transactions on Asian Language Information Processing (TALIP)
Confidence estimation for information extraction
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers
A simple semi-supervised algorithm for named entity recognition
SemiSupLearn '09 Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing
One class per named entity: exploiting unlabeled text for named entity recognition
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Semi-supervised learning for relation extraction in Vietnamese text
Proceedings of the Second Symposium on Information and Communication Technology
ICCCI'12 Proceedings of the 4th international conference on Computational Collective Intelligence: technologies and applications - Volume Part I
VNLP: an open source framework for Vietnamese natural language processing
Proceedings of the Fourth Symposium on Information and Communication Technology
Named entity recognition (NER) is the task of locating atomic elements in text and classifying them into predefined categories such as the names of persons, organizations and locations. Most existing NER systems are based on supervised learning, which often requires a large amount of labelled training data that is very time-consuming to build. To address this problem, we introduce a semi-supervised learning method for recognizing named entities in Vietnamese text that combines proper name coreference and named-ambiguity heuristics with a powerful sequential learning model, Conditional Random Fields. Our approach inherits the idea of Liao and Veeramachaneni [6] and extends it by using proper name coreference. After the model is first trained on a small, manually annotated data set, it extracts high-confidence named entities and finds low-confidence ones by using proper name coreference rules. The low-confidence named entities are added to the training set so that the model learns new context features. When applying the heuristics proposed by Liao and Veeramachaneni, the F-scores of the system for extracting "Person", "Location" and "Organization" entities are 83.36%, 69.53% and 65.71%, respectively. With our proposed heuristics, those values are 93.13%, 88.15% and 79.35%, respectively. These results show that our method improves the system's F-score by roughly 10 to 19 percentage points across the three entity types.
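The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The `Mention` record, the `is_coreferent` rule, and the 0.9 and 0.5 confidence thresholds are all hypothetical, and the confidences are assumed to come from some sequence model such as a CRF. The sketch only shows how a high-confidence entity can lend its label to a low-confidence mention of the same name, so that the mention can be added to the training set.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Mention:
    text: str          # surface string of the candidate entity
    label: str         # predicted label: "PER", "LOC", "ORG" or "O"
    confidence: float  # model confidence in [0, 1] (assumed to come from a CRF)
    sentence_id: int   # index of the sentence containing the mention


def is_coreferent(short: str, full: str) -> bool:
    """Toy proper-name coreference rule (an assumption, not the paper's rules):
    the names match if they are equal, or if the shorter name is a whole-token
    prefix or suffix of the longer one."""
    s, f = short.lower().split(), full.lower().split()
    if not s or len(s) > len(f):
        return False
    return f[:len(s)] == s or f[-len(s):] == s


def select_new_training_mentions(
    mentions: List[Mention], high: float = 0.9, low: float = 0.5
) -> List[Tuple[Mention, str]]:
    """Pair each low-confidence mention with the label of the first
    high-confidence entity it corefers with. Thresholds are hypothetical."""
    anchors = [m for m in mentions if m.label != "O" and m.confidence >= high]
    selected = []
    for m in mentions:
        if m.confidence >= low:
            continue  # only low-confidence mentions are candidates
        for a in anchors:
            if is_coreferent(m.text, a.text):
                selected.append((m, a.label))
                break
    return selected


# Hypothetical model output for one document.
mentions = [
    Mention("Nguyen Van An", "PER", 0.97, 0),
    Mention("An", "O", 0.35, 3),
    Mention("Ha Noi", "LOC", 0.95, 1),
    Mention("Ha Noi", "ORG", 0.40, 4),
    Mention("Hue", "LOC", 0.45, 5),
]
new_examples = select_new_training_mentions(mentions)
for mention, label in new_examples:
    print(mention.sentence_id, mention.text, label)
```

In this toy document, "An" inherits the "PER" label from "Nguyen Van An", and the low-confidence "Ha Noi" is relabelled "LOC". "Hue" is skipped because no high-confidence entity corefers with it. In a full bootstrapping loop, the selected sentences would be added to the training set and the model retrained.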