The recognition of Proper Nouns (PNs) is considered an important task in the area of Information Retrieval and Extraction. However, the high performance of most existing PN classifiers depends heavily on the availability of large dictionaries of domain-specific Proper Nouns, and on a certain amount of manual work for rule writing or manual tagging. Relying on an existing PN dictionary is not a heavy requirement, since such resources are often available on the web, but without manual updating its coverage of a domain corpus may be rather low. In this paper we propose a technique for automatically updating a PN dictionary through the cooperation of an inductive and a probabilistic classifier. Our experiments show that, whenever an existing PN dictionary allows the identification of 50% of the proper nouns within a corpus, our technique allows, without additional manual effort, the successful recognition of about 90% of the remaining 50%.
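The closing figures can be combined into a single overall recognition rate. The sketch below is only an illustration of that arithmetic, assuming the 90% figure is the fraction of dictionary-missed proper nouns that are recovered; the function name `overall_recall` and its parameters are hypothetical and do not come from the paper.

```python
def overall_recall(dictionary_coverage: float, recall_on_remainder: float) -> float:
    """Fraction of all proper nouns recognized in the corpus.

    dictionary_coverage: share of PNs already identified by the existing dictionary.
    recall_on_remainder: share of the remaining PNs recovered by the classifiers
                         (assumed interpretation of the reported 90%).
    """
    missed_by_dictionary = 1.0 - dictionary_coverage
    return dictionary_coverage + missed_by_dictionary * recall_on_remainder


# Figures reported in the abstract: 50% dictionary coverage, about 90% of the rest.
print(f"{overall_recall(0.5, 0.9):.0%}")
```

Under this reading, 50% coverage plus 90% of the remaining 50% corresponds to roughly 95% of the proper nouns in the corpus.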