Strategic business decision making involves the analysis of market forecasts. Today, the identification and aggregation of relevant market statements is done by human experts, often by analyzing documents from the World Wide Web. We present an efficient information extraction chain to automate this complex natural language processing task and show results for the identification part. Based on time and money extraction, we identify sentences that represent statements on revenue using support vector classification. We provide a corpus of German online news articles in which more than 2,000 such sentences are annotated by industry domain experts. On the test data, our statement identification algorithm achieves an overall precision of 0.86 and an overall recall of 0.87.
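As a rough illustration of the identification step, the sketch below keeps only sentences that contain both a time and a money expression and shows how precision and recall are computed from binary predictions. It is a minimal sketch under stated assumptions: the regular expressions, the function names `is_candidate` and `precision_recall`, and the example sentences are all hypothetical, and the support vector classifier that the approach applies to such sentences is not reproduced here.

```python
import re

# Hypothetical, deliberately simple patterns for German news text.
# The extractors used in the original work are not specified in this abstract.
MONEY_RE = re.compile(
    r"\b\d+(?:[.,]\d+)?\s*(?:Mio\.|Mrd\.|Millionen|Milliarden)?\s*"
    r"(?:Euro|EUR|€|Dollar|USD|\$)",
    re.IGNORECASE,
)
TIME_RE = re.compile(r"\b(?:19|20)\d{2}\b")  # four-digit years only


def is_candidate(sentence: str) -> bool:
    """Return True if the sentence has both a money and a time expression."""
    has_money = MONEY_RE.search(sentence) is not None
    has_time = TIME_RE.search(sentence) is not None
    return has_money and has_time


def precision_recall(predicted, gold):
    """Compute precision and recall for parallel lists of boolean labels."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)
    fp = sum(1 for p, g in pairs if p and not g)
    fn = sum(1 for p, g in pairs if not p and g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up example sentences, not taken from the annotated corpus.
    sentences = [
        "Der Umsatz soll 2015 auf 3,5 Mrd. Euro steigen.",
        "Das Unternehmen sitzt in Berlin.",
        "Der Gewinn lag bei 12 Millionen Euro.",
    ]
    for s in sentences:
        print(is_candidate(s), s)
```

In a fuller pipeline, the sentences that pass this filter would be turned into feature vectors and passed to a trained support vector classifier; the filter only narrows the candidate set and does not by itself decide whether a sentence is a revenue statement.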