Many document classification applications require that managers, client-facing employees, and the technical team understand the reasons behind data-driven classification decisions. Predictive models treat documents as data to be classified, and document data are characterized by very high dimensionality, often with tens of thousands to millions of variables (words). This high dimensionality makes the decisions of document classifiers very difficult to understand. This paper begins by extending the most relevant prior theoretical model of explanations for intelligent systems to account for some missing elements. The main theoretical contribution is the definition of a new sort of explanation: a minimal set of words (more generally, terms) such that removing all words in the set from the document changes the predicted class from the class of interest. We present an algorithm to find such explanations, as well as a framework to assess such an algorithm's performance. We demonstrate the value of the new approach with a case study from a real-world document classification task: classifying web pages as containing objectionable content, with the goal of allowing advertisers to choose not to have their ads appear on those pages. A second empirical demonstration, on news-story topic classification, shows the explanations to be concise and document-specific, and capable of providing understanding of the exact reasons for the classification decisions, of the workings of the classification models, and of the business application itself. We also illustrate how explaining the classifications of documents can help to improve data quality and model performance.
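The definition of an explanation given above can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not describe; it is a brute-force search, written under stated assumptions, that tries word subsets in order of increasing size and returns the first one whose removal changes the predicted class. The names `find_explanation`, `toy_predict`, `TOY_WEIGHTS`, and `TOY_BIAS` are hypothetical, and the toy linear scorer stands in for whatever classifier is actually in use.

```python
from itertools import combinations

# Hypothetical toy weights for a linear "objectionable content" scorer.
# These values are made up for illustration only.
TOY_WEIGHTS = {"adult": 2.0, "casino": 1.5, "news": -0.5, "weather": -0.2}
TOY_BIAS = -1.0


def toy_predict(words):
    """Return True if the set of words is scored as the class of interest."""
    return TOY_BIAS + sum(TOY_WEIGHTS.get(w, 0.0) for w in words) > 0


def find_explanation(doc_words, predict, max_size=3):
    """Brute-force search for a smallest set of words whose removal
    changes the predicted class away from the class of interest.

    Subsets are tried in order of increasing size, so the first hit has
    minimum size among the sizes searched. Returns None if the document
    is not in the class of interest or no set of at most max_size words
    changes the prediction.
    """
    words = frozenset(doc_words)
    if not predict(words):
        return None  # nothing to explain
    vocabulary = sorted(words)  # sorted for a deterministic search order
    for size in range(1, min(max_size, len(vocabulary)) + 1):
        for subset in combinations(vocabulary, size):
            if not predict(words - set(subset)):
                return set(subset)
    return None


explanation = find_explanation(["adult", "casino", "news", "weather"], toy_predict)
print(explanation)  # {'adult'}
```

With the toy weights, removing "adult" alone drops the score below zero, so the explanation is that single word. Exhaustive search grows combinatorially with document length, so a practical system would need pruning or a heuristic ordering of candidate words; the `max_size` cap here is only a guard for the sketch.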