This paper addresses the classification of data associated with multiple labels. It deals in particular with the case where a label has no antagonistic label and the absence of a label on a data item does not necessarily imply that the item cannot carry that label, as is the case, for example, with the substances in mineral exploration or the keywords of Web pages. We propose new clustering quality measures adapted to data associated with multiple labels. These measures rely on two main sources of information: the similarity between the data, as given by the clustering algorithm, and the distribution of the labels in the model after these labels have been projected onto the classification model. Their main area of application is clustering model selection. They can also be used to determine the stopping criterion when training a clustering algorithm. Experiments with the proposed measures in the field of documentary data analysis show that they significantly outperform the state of the art.
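The abstract does not give the formulas of the proposed measures, so the following is only a minimal sketch of the general idea of scoring a clustering from the distribution of labels projected onto its clusters. The function name `label_precision_recall`, the input layout, and the precision/recall-style definitions are all assumptions made for illustration and should not be read as the paper's definitions.

```python
from collections import defaultdict


def label_precision_recall(assignments, labels):
    """Illustrative label-based quality scores (hypothetical, not the paper's formulas).

    assignments: dict mapping item id -> cluster id
    labels:      dict mapping item id -> set of labels carried by that item
    Returns (mean precision, mean recall) over all (cluster, label) pairs
    where the label occurs at least once in the cluster.
    """
    # Group items by cluster.
    cluster_items = defaultdict(set)
    for item, cluster in assignments.items():
        cluster_items[cluster].add(item)

    # Group items by label over the whole dataset.
    label_items = defaultdict(set)
    for item, item_labels in labels.items():
        for label in item_labels:
            label_items[label].add(item)

    precisions, recalls = [], []
    for items in cluster_items.values():
        # Labels projected onto this cluster: those carried by at least one member.
        projected = set().union(*(labels.get(i, set()) for i in items))
        for label in projected:
            carriers = {i for i in items if label in labels.get(i, set())}
            # Share of the cluster's items that carry the label.
            precisions.append(len(carriers) / len(items))
            # Share of the label's items that fall in this cluster.
            recalls.append(len(carriers) / len(label_items[label]))

    if not precisions:
        return 0.0, 0.0
    return sum(precisions) / len(precisions), sum(recalls) / len(recalls)


# Toy data: three documents, two clusters, two labels (all values made up).
toy_assignments = {"d1": 0, "d2": 0, "d3": 1}
toy_labels = {"d1": {"a"}, "d2": {"a", "b"}, "d3": {"b"}}
toy_precision, toy_recall = label_precision_recall(toy_assignments, toy_labels)
```

Under these assumptions, model selection would amount to computing such scores for each candidate clustering and keeping the one with the best trade-off between the two values. Because a missing label is not treated as a negative here, only labels actually present in a cluster contribute to its score.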