In this paper, we describe work in progress on the development of a Greek named entity recognizer. The system targets information extraction applications that require large-scale text processing. Speed of analysis, system robustness, and accuracy of results have been the basic guidelines for the system's design. Pattern matching techniques have been implemented on top of an existing automated pipeline for Greek text processing, and the resulting system relies on non-recursive regular expressions to capture different types of named entities. For development and testing purposes, we collected a corpus of financial texts from several web sources and manually annotated part of it. Overall precision and recall are 86% and 81%, respectively.
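As a rough illustration of the general technique named above, and not of the system's actual grammar, the following Python sketch matches two entity types with non-recursive regular expressions and scores the output against a gold annotation. The pattern definitions, the entity labels (`ORG`, `MONEY`), the sample sentence, and the helper names `find_entities` and `precision_recall` are all assumptions made for this example.

```python
import re

# Hypothetical patterns for illustration only; the paper's real rules are not shown.
# "Non-recursive" here means each pattern is assembled from a fixed set of
# sub-patterns with no self-reference, so it stays a plain finite-state regex.
CAP_WORD = r"[A-ZΑ-Ω][a-zά-ώ]+"          # capitalized Latin or Greek word
ORG = rf"(?:{CAP_WORD} )+(?:Bank|Group|Α\.Ε\.)"
MONEY = r"\d+(?:[.,]\d+)?(?: εκατ\.| δισ\.)? ευρώ"

PATTERNS = {"ORG": re.compile(ORG), "MONEY": re.compile(MONEY)}


def find_entities(text):
    """Return (start, end, label, surface) tuples sorted by start offset."""
    found = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            found.append((m.start(), m.end(), label, m.group()))
    return sorted(found)


def precision_recall(predicted, gold):
    """Exact-match precision and recall over sets of (label, surface) pairs."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    sample = "Η Alpha Bank ανακοίνωσε κέρδη 12,5 εκατ. ευρώ."
    print(find_entities(sample))
```

Exact-match scoring over (label, surface) pairs is only one possible convention; the abstract does not state how its precision and recall figures were computed.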