Towards an automatic construction of Contextual Attribute-Value Taxonomies

Authors:
Dino Ienco;Yoann Pitarch;Pascal Poncelet;Maguelonne Teisseire
Affiliations:
IRSTEA, Montpellier, France and LIRMM, Montpellier, France;Aalborg University, Aalborg, Denmark;LIRMM, Montpellier, France;IRSTEA, Montpellier, France and LIRMM, Montpellier, France
Venue:
Proceedings of the 27th Annual ACM Symposium on Applied Computing
Year:
2012

Abstract

In many domains (e.g., data mining, data management, data warehousing), a hierarchical organization of attribute values can help the data analysis process. Nevertheless, such hierarchical knowledge is not always available, and even when it exists it may be inadequate or useless. Starting from this consideration, in this paper we tackle the problem of automatically defining data-driven taxonomies. To do this, we combine techniques from information theory and clustering to obtain a structured representation of the attribute values: the Contextual Attribute-Value Taxonomy (CAVT). The two main advantages of our method are that it is fully unsupervised (i.e., it requires no knowledge provided by an expert) and parameter-free. We evaluate the benefit of using CAVTs in the following two tasks: (i) multilevel multidimensional sequential pattern mining, in which hierarchies are used to exploit abstraction over the data, and (ii) table summarization, in which hierarchies are used to aggregate the data and supply a sketch of the original information to the user. To validate our approach, we use real-world datasets, on which we obtain appreciable results in both quantitative and qualitative evaluation.
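
To make the general idea of a data-driven taxonomy concrete, the following is a minimal illustrative sketch and not the CAVT algorithm itself, whose details the abstract does not give. It assumes that each value of a target attribute is described by the distribution of the values of one context attribute it co-occurs with, that two values are compared with the Jensen-Shannon divergence, and that the closest pair is merged repeatedly until one root remains. The function names, the toy rows, and the choice of divergence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def context_profiles(rows, target, context):
    """Estimate P(context value | target value) from co-occurrence counts."""
    counts = {}
    for row in rows:
        counts.setdefault(row[target], Counter())[row[context]] += 1
    return {
        value: {c: n / sum(cnt.values()) for c, n in cnt.items()}
        for value, cnt in counts.items()
    }


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse distributions."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        return sum(a[k] * log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def build_taxonomy(profiles):
    """Greedy bottom-up merging; returns a nested tuple (binary tree).

    Assumption: a merged node's profile is the size-weighted average of
    its children's profiles. No threshold or cluster count is needed.
    """
    clusters = {value: (profile, 1) for value, profile in profiles.items()}
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: js_divergence(clusters[pair[0]][0],
                                           clusters[pair[1]][0]),
        )
        (pa, wa), (pb, wb) = clusters.pop(a), clusters.pop(b)
        merged = {
            k: (wa * pa.get(k, 0.0) + wb * pb.get(k, 0.0)) / (wa + wb)
            for k in set(pa) | set(pb)
        }
        clusters[(a, b)] = (merged, wa + wb)
    return next(iter(clusters))


# Hypothetical toy data: drinks (target) and the buyer's age group (context).
rows = (
    [{"drink": "cola", "buyer": "minor"}] * 3
    + [{"drink": "cola", "buyer": "adult"}]
    + [{"drink": "lemonade", "buyer": "minor"}] * 2
    + [{"drink": "lemonade", "buyer": "adult"}]
    + [{"drink": "wine", "buyer": "adult"}] * 3
    + [{"drink": "beer", "buyer": "adult"}] * 4
)

profiles = context_profiles(rows, target="drink", context="buyer")
root = build_taxonomy(profiles)
print(root)
```

On this toy input the two drinks bought only by adults end up under one internal node and the two soft drinks under another. A real setting would likely use several context attributes and a more careful merging criterion; the sketch only shows how context distributions and clustering can yield a hierarchy without expert input or tuning parameters.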