Information extraction (IE) pipelines analyze text in several stages. The algorithms in a pipeline determine both its effectiveness and its run-time efficiency. In real-world tasks, however, IE pipelines often fail to achieve acceptable run-times because they analyze too much task-irrelevant text. This raises two questions: 1) How much of a pipeline's "efficiency potential" depends on the scheduling of its algorithms? 2) Is it possible to devise a reliable method for constructing efficient IE pipelines? This paper addresses both questions. In particular, we show how to optimize the run-time efficiency of IE pipelines for a given set of algorithms. We evaluate pipelines for three algorithm sets on an industrially relevant task: the extraction of market forecasts from news articles. Using a system-independent measure, we demonstrate that efficiency gains of up to one order of magnitude are possible without compromising a pipeline's original effectiveness.
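To illustrate why scheduling alone can change run-time, the following minimal sketch compares the orderings of three filtering stages. It is not the method of this paper: it assumes that stages are independent, that each stage discards the sentences it finds irrelevant, and that any order is admissible. The stage names, per-sentence costs, and kept fractions are hypothetical, and the best order is found by brute-force enumeration of all permutations.

```python
from itertools import permutations

# Hypothetical stages: (name, cost per sentence, fraction of sentences kept).
# All numbers are illustrative assumptions, not measurements from the paper.
STAGES = [
    ("forecast_classifier", 5.0, 0.1),
    ("money_recognizer", 2.0, 0.2),
    ("time_recognizer", 1.0, 0.5),
]


def expected_cost(schedule):
    """Expected cost per input sentence when each stage filters its input."""
    total = 0.0
    remaining = 1.0  # fraction of sentences that still reach the next stage
    for _name, cost, kept in schedule:
        total += remaining * cost
        remaining *= kept
    return total


def best_schedule(stages):
    """Brute-force search over all orderings (assumes no dependencies)."""
    return min(permutations(stages), key=expected_cost)


if __name__ == "__main__":
    print("given order:", expected_cost(STAGES))
    best = best_schedule(STAGES)
    print("best order: ", [name for name, _, _ in best], expected_cost(best))
```

Under these assumptions, running cheap stages first reduces the expected cost because expensive stages only see the text that survives earlier filters. Real pipelines also have dependencies between algorithms, which restrict the admissible orderings and which this sketch ignores.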