We studied the impact of data aggregation on the performance of logistic regression in predicting the direction of the Dow Jones Industrial Average (DJIA) stock market index. Data aggregation is a common operation in business, science, engineering, medicine, and other fields; it supports statistical, financial, and sales and marketing analysis, particularly in the context of a data warehouse. We showed experimentally that, for this example, as long as aggregation does not shrink the sample size unduly, it does not significantly impair the performance of the logistic regression model. We also observed that aggregation-based models are simpler (less over-parameterized) than detail-based models. We used receiver operating characteristic (ROC) analysis to evaluate the robustness of these predictive models, taking the area under the ROC curve as a summary measure of a model's overall performance.
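The comparison described above can be illustrated with a minimal sketch. It is not the study's code or data: the series is synthetic, the single predictor, the 5-day averaging window, the train/test split, and every function name (`aggregate`, `auc`, `fit_logistic`, `predict`, `run_experiment`) are assumptions made for illustration. It fits a one-feature logistic regression on detail-level ("daily") data and on aggregated ("weekly") data, then scores each model by the area under the ROC curve.

```python
import math
import random


def aggregate(values, window):
    """Average non-overlapping windows; an incomplete trailing window is dropped."""
    n = len(values) // window
    return [sum(values[i * window:(i + 1) * window]) / window for i in range(n)]


def auc(labels, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(xs, ys, lr=0.1, epochs=200):
    """One-feature logistic regression fitted by batch gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = 1.0 / (1.0 + math.exp(-(w * x + b))) - y
            gw += err * x
            gb += err
        w -= lr * gw / len(xs)
        b -= lr * gb / len(xs)
    return w, b


def predict(model, xs):
    """Predicted probability of an upward move for each input."""
    w, b = model
    return [1.0 / (1.0 + math.exp(-(w * x + b))) for x in xs]


def run_experiment(seed=0, n_days=2000, window=5):
    """Return (AUC on detail data, AUC on aggregated data) for synthetic data."""
    rng = random.Random(seed)
    # Hypothetical predictor and latent "return"; not real DJIA data.
    x = [rng.gauss(0.0, 1.0) for _ in range(n_days)]
    r = [xi + 0.5 * rng.gauss(0.0, 1.0) for xi in x]

    results = []
    for feats, rets in ((x, r), (aggregate(x, window), aggregate(r, window))):
        labels = [1 if v > 0 else 0 for v in rets]  # direction: up (1) or down (0)
        half = len(feats) // 2                      # first half trains, second half tests
        model = fit_logistic(feats[:half], labels[:half])
        results.append(auc(labels[half:], predict(model, feats[half:])))
    return tuple(results)


if __name__ == "__main__":
    detail_auc, aggregated_auc = run_experiment()
    print(f"detail AUC: {detail_auc:.3f}  aggregated AUC: {aggregated_auc:.3f}")
```

Note that aggregating with a window of 5 leaves one fifth as many observations, which is the sample-size effect the abstract cautions about; enlarging `window` in this sketch shrinks the test set and makes the AUC estimate noisier.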