Predicting risk from financial reports with regression

Authors:
Shimon Kogan; Dimitry Levin; Bryan R. Routledge; Jacob S. Sagi; Noah A. Smith
Affiliations:
University of Texas at Austin, Austin, TX; Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA; Vanderbilt University, Nashville, TN; Carnegie Mellon University, Pittsburgh, PA
Venue:
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2009

Citing 9

An application of least squares fit mapping to text information retrieval
SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval

Making large-scale support vector machine learning practical
Advances in kernel methods

Language models for financial news recommendation
Proceedings of the ninth international conference on Information and knowledge management

Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning

Recognizing text genres with simple metrics using discriminant analysis
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2

A Linear Least Squares Fit mapping method for information retrieval from natural language texts
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2

Thumbs up?: sentiment classification using machine learning techniques
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10

Reading the markets: forecasting public opinion of political candidates by news analysis
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1

Creating subjective and objective sentence classifiers from unannotated texts
CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing

Cited 8

Movie reviews and revenues: an experiment in text regression
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Hunting for the black swan: risk mining from text
ACLDemos '10 Proceedings of the ACL 2010 System Demonstrations

Stock price movement prediction using representative prototypes of financial reports
ACM Transactions on Management Information Systems (TMIS)

Predicting a scientific community's response to an article
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Textual predictors of bill survival in congressional committees
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Word salad: relating food prices and descriptions
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Risk ranking from financial reports
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval

Large-scale linear support vector regression
The Journal of Machine Learning Research

Abstract

We address a text regression problem: given a piece of text, predict a real-world continuous quantity associated with the text's meaning. In this work, the text is an SEC-mandated financial report published annually by a publicly-traded company, and the quantity to be predicted is volatility of stock returns, an empirical measure of financial risk. We apply well-known regression techniques to a large corpus of freely available financial reports, constructing regression models of volatility for the period following a report. Our models rival past volatility (a strong baseline) in predicting the target variable, and a single model that uses both can significantly outperform past volatility. Interestingly, our approach is more accurate for reports after the passage of the Sarbanes-Oxley Act of 2002, giving some evidence for the success of that legislation in making financial reports more informative.
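
To make the text regression setup in the abstract concrete, here is a minimal sketch in Python. It rests on several assumptions that the abstract does not state: volatility is taken as the log of the sample standard deviation of simple daily returns, reports are represented as log(1 + count) bag-of-words features over a small hand-picked vocabulary, and the regressor is a ridge model fit by batch gradient descent, standing in for whichever "well-known regression techniques" the authors applied. All function names and the toy data are hypothetical and made up for illustration.

```python
import math
from collections import Counter


def log_volatility(prices):
    """Log of the sample standard deviation of simple daily returns (assumed definition)."""
    returns = [(b - a) / a for a, b in zip(prices, prices[1:])]
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.log(math.sqrt(variance))


def bag_of_words(text, vocabulary):
    """Map a text to log(1 + count) features over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [math.log1p(counts[word]) for word in vocabulary]


def fit_ridge(features, targets, l2=0.01, learning_rate=0.05, epochs=2000):
    """Fit weights and a bias by batch gradient descent on mean squared error
    with an L2 penalty on the weights (the bias is not penalized)."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, targets):
            error = bias + sum(w * xi for w, xi in zip(weights, x)) - y
            grad_b += error / n
            for j in range(d):
                grad_w[j] += error * x[j] / n
        weights = [w - learning_rate * (g + l2 * w) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def predict(weights, bias, x):
    """Linear prediction of log volatility for one feature vector."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


# Toy, made-up data purely for illustration (not from the paper's corpus).
vocabulary = ["risk", "litigation", "growth"]
reports = ["risk risk litigation", "growth growth", "risk growth"]
price_histories = [[100, 110, 95, 108], [100, 101, 102, 103], [100, 104, 99, 103]]

features = [bag_of_words(report, vocabulary) for report in reports]
targets = [log_volatility(prices) for prices in price_histories]
weights, bias = fit_ridge(features, targets)
print(predict(weights, bias, bag_of_words("risk litigation", vocabulary)))
```

One plausible way to build the combined model mentioned in the abstract, again an assumption rather than the authors' stated method, is to append the log volatility of the preceding period to each report's feature vector before fitting, so that a single regressor sees both the text and the past-volatility baseline.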