The Orange Customer Analysis Platform

Authors:
Raphaël Féraud;Marc Boullé;Fabrice Clérot;Françoise Fessant;Vincent Lemaire
Affiliations:
Orange Labs, Lannion;Orange Labs, Lannion;Orange Labs, Lannion;Orange Labs, Lannion;Orange Labs, Lannion
Venue:
ICDM'10 Proceedings of the 10th industrial conference on Advances in data mining: applications and theoretical aspects
Year:
2010

Citing 10

Random sampling with a reservoir
    ACM Transactions on Mathematical Software (TOMS)

C4.5: programs for machine learning

On the Optimality of the Simple Bayesian Classifier under Zero-One Loss
    Machine Learning - Special issue on learning with probabilistic representations

Similarity Search in High Dimensions via Hashing
    VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases

An introduction to variable and feature selection
    The Journal of Machine Learning Research

A Bayes Optimal Approach for Partitioning the Values of Categorical Attributes
    The Journal of Machine Learning Research

On biased reservoir sampling in the presence of stream evolution
    VLDB '06 Proceedings of the 32nd international conference on Very large data bases

MODL: A Bayes optimal discretization method for continuous attributes
    Machine Learning

Feature Extraction: Foundations and Applications (Studies in Fuzziness and Soft Computing)

Compression-Based Averaging of Selective Naive Bayes Classifiers
    The Journal of Machine Learning Research

Cited 1

Modelling complex data by learning which variable to construct
    DaWaK'10 Proceedings of the 12th international conference on Data warehousing and knowledge discovery

Abstract

The continuous exponential growth in data-warehouse size does not by itself yield richer, finer-grained information, because processing capabilities do not grow at the same rate. Current state-of-the-art technologies require the user to strike a delicate balance between processing cost and information quality. We describe an industrial approach that leverages recent advances in automated processing and in the selection and indexing of relevant data and instances to dramatically improve our ability to turn huge volumes of raw data into useful information.