In itself, the continuous exponential growth in data warehouse size does not necessarily yield richer, finer-grained information, because processing capabilities do not grow at the same rate. Current state-of-the-art technologies require the user to strike a delicate balance between processing cost and information quality. We describe an industrial approach that leverages recent advances in automated processing and in the selection and indexing of relevant data and instances to dramatically improve our ability to turn huge volumes of raw data into useful information.