BoosTexter: A Boosting-based System for Text Categorization
Machine Learning - Special issue on information retrieval
Data Mining: Concepts and Techniques
Data Mining: Concepts and Techniques
Multilabel Neural Networks with Applications to Functional Genomics and Text Categorization
IEEE Transactions on Knowledge and Data Engineering
ML-KNN: A lazy learning approach to multi-label learning
Pattern Recognition
Random k-Labelsets: An Ensemble Method for Multilabel Classification
ECML '07 Proceedings of the 18th European conference on Machine Learning
Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web
Management Science
Competition Among Virtual Communities and User Valuation: The Case of Investing-Related Communities
Information Systems Research
A quantitative stock prediction system based on financial news
Information Processing and Management: an International Journal
Learning multi-label alternating decision trees from texts and data
MLDM'03 Proceedings of the 3rd international conference on Machine learning and data mining in pattern recognition
Combine multi-valued attribute decomposition with multi-label learning
Expert Systems with Applications: An International Journal
Multi-label classification and extracting predicted class hierarchies
Pattern Recognition
Managing Data Quality Risk in Accounting Information Systems
Information Systems Research
This study develops, implements, and evaluates a multilabel text classification algorithm called multilabel categorical K-nearest neighbor (ML-CKNN). The algorithm is designed to automatically identify 25 types of risk factors, each with a specific meaning, reported in Section 1A of SEC Form 10-K. The idea behind ML-CKNN is to compute a categorical similarity score for each label from the K nearest neighbors within that label's category. ML-CKNN is tailored to extracting risk factors from 10-K filings. It classifies 74.94% of risk factors perfectly, that is, with every label correct, and classifies 98.75% of individual labels correctly. Moreover, ML-CKNN is empirically shown to outperform ML-KNN and other multilabel algorithms. The extracted risk factors could be valuable to empirical studies in accounting or finance.
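The per-label scoring idea can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: bag-of-words vectors, cosine similarity, the mean similarity of the K nearest neighbors carrying a label as that label's score, a fixed decision threshold, and the toy data and function names. It is not the paper's implementation.

```python
import math
from collections import Counter

def bow(text):
    # Assumption: plain whitespace-tokenized bag of words.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def categorical_scores(query, train, k):
    # For each label, collect similarities to the training texts that
    # carry that label, then average the k largest ones.
    # Assumption: the mean is the aggregation; the paper may differ.
    q = bow(query)
    sims_by_label = {}
    for text, labels in train:
        sim = cosine(q, bow(text))
        for label in labels:
            sims_by_label.setdefault(label, []).append(sim)
    scores = {}
    for label, sims in sims_by_label.items():
        top = sorted(sims, reverse=True)[:k]
        scores[label] = sum(top) / len(top)
    return scores

def predict(query, train, k=2, threshold=0.3):
    # Assign every label whose categorical score reaches the threshold
    # (the threshold value is hypothetical).
    scores = categorical_scores(query, train, k)
    return {label for label, score in scores.items() if score >= threshold}

# Hypothetical toy training set: (risk factor text, set of labels).
train = [
    ("interest rate increases may raise our borrowing costs", {"financing"}),
    ("we depend on debt financing and rising interest rate levels", {"financing"}),
    ("new regulation may increase compliance costs", {"regulation"}),
    ("changes in regulation and litigation could harm us", {"regulation", "litigation"}),
]
query = "rising interest rate levels may raise borrowing costs"

scores = categorical_scores(query, train, k=2)
predicted = predict(query, train, k=2, threshold=0.3)
```

On this toy data only the "financing" label clears the threshold. Because each label is scored against its own neighbors rather than against one shared neighborhood, a rare label is not crowded out by frequent ones, which is one plausible reading of "categorical" in the algorithm's name.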