The nature of statistical learning theory
An Algorithm that Learns What's in a Name
Machine Learning - Special issue on natural language learning
The entity-relationship model—toward a unified view of data
ACM Transactions on Database Systems (TODS) - Special issue: papers from the international conference on very large data bases: September 22–24, 1975, Framingham, MA
A Tutorial on Support Vector Machines for Pattern Recognition
Data Mining and Knowledge Discovery
Reducing multiclass to binary: a unifying approach for margin classifiers
The Journal of Machine Learning Research
EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
Use of support vector learning for chunk identification
ConLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
ACLdemo '04 Proceedings of the ACL 2004 on Interactive poster and demonstration sessions
Factorizing complex models: a case study in mention detection
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
CU-COMSEM: exploring rich features for unsupervised web personal name disambiguation
SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
In this paper, we describe an integrated approach to entity mention detection that yields a monolithic, almost language-independent system. It is optimal in the sense that all categorical constraints are considered simultaneously. The system is compact and easy to develop and maintain, since only a single set of features and classifiers needs to be designed and optimized. It is implemented using one-versus-all support vector machine (SVM) classifiers and a number of feature extractors at several linguistic levels. SVMs are well known for their ability to handle large sets of overlapping features with theoretically sound generalization properties. Data sparsity could be a concern, given the large number of classes and the relatively moderate size of the training data. However, we report results showing that the integrated system performs as well as a pipelined system that decomposes the problem into several smaller sub-tasks. We conduct all our experiments on ACE 2004 data, evaluate the systems using the ACE metrics, and report competitive performance.
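The one-versus-all SVM setup described in the abstract can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' implementation; the toy features, labels (e.g. "PER-NAM" for a named person mention), and examples are hypothetical stand-ins for the paper's feature extractors and ACE mention classes.

```python
# Hypothetical sketch: one-versus-all SVM classification of entity
# mentions from overlapping token-level features (not the authors' code).
from sklearn.feature_extraction import DictVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Toy feature dicts at several "linguistic levels" (surface form,
# capitalization, part of speech); labels combine entity type and
# mention level, e.g. "PER-NAM" = named person, "PER-PRO" = pronominal.
train_feats = [
    {"word": "john", "cap": True, "pos": "NNP"},
    {"word": "he", "cap": False, "pos": "PRP"},
    {"word": "city", "cap": False, "pos": "NN"},
    {"word": "mary", "cap": True, "pos": "NNP"},
]
train_labels = ["PER-NAM", "PER-PRO", "O", "PER-NAM"]

# Sparse binary encoding of the overlapping features.
vec = DictVectorizer()
X = vec.fit_transform(train_feats)

# One binary linear SVM per class; at prediction time the class whose
# classifier gives the largest margin wins.
clf = OneVsRestClassifier(LinearSVC())
clf.fit(X, train_labels)

test = vec.transform([{"word": "john", "cap": True, "pos": "NNP"}])
print(clf.predict(test)[0])
```

Because every class gets its own binary classifier, adding a new mention category only adds one more SVM over the same shared feature set, which is what keeps the integrated system compact.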