We propose a prototype near-duplicate detection system for web-shop owners. Online businesses of this kind typically buy descriptions of their goods from copywriters. A copywriter may occasionally cheat and supply the owner with almost identical descriptions for different items. In this paper we demonstrate how Formal Concept Analysis (FCA) can be used to cluster descriptions quickly and reveal such duplicates in datasets from a real online perfume shop.
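To make the idea concrete, the following is a minimal sketch of how near-duplicate descriptions might be grouped with an FCA-style formal context. It is not the system described in the paper. The function names, the use of word shingles as attributes, the sample descriptions, and the `min_shared` threshold are all assumptions chosen for illustration. Documents play the role of objects, shingles the role of attributes, and each reported cluster is the extent of the formal concept generated by a pair of documents whose shared intent is large enough.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (hypothetical attribute choice)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def near_duplicate_clusters(docs, k=3, min_shared=3):
    """Group documents that share at least `min_shared` shingles.

    docs maps a document name to its text. The result is a set of
    frozensets of document names.
    """
    # Formal context: each object (document) is mapped to its attributes (shingles).
    context = {name: shingles(text, k) for name, text in docs.items()}
    clusters = set()
    for a, b in combinations(context, 2):
        # Intent of the concept generated by the pair: shingles common to both.
        intent = context[a] & context[b]
        if len(intent) < min_shared:
            continue
        # Extent: every document that contains all shingles of the intent.
        extent = frozenset(n for n, s in context.items() if intent <= s)
        clusters.add(extent)
    return clusters


# Made-up sample descriptions, not data from the paper.
docs = {
    "item1": "a fresh floral scent with notes of jasmine and rose",
    "item2": "a fresh floral scent with notes of jasmine and amber",
    "item3": "deep woody fragrance built around cedar and vetiver",
}
clusters = near_duplicate_clusters(docs)
```

On this sample, `item1` and `item2` share seven of their eight shingles and end up in one cluster, while `item3` shares none and is left out. Enumerating all pairs is quadratic in the number of documents, so a larger collection would need a dedicated concept or frequent closed itemset miner instead.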