This paper introduces a scalable method for feature extraction and navigation of large data sets by means of local clustering, where clusters are modeled as overlapping neighborhoods. Under the model, intra-cluster association and external differentiation are both assessed in terms of a natural confidence measure. Minor clusters can be identified even when they appear in the intersection of larger clusters. Scalability of local clustering derives from recent generic techniques for efficient approximate similarity search. The cluster overlap structure gives rise to a hierarchy that can be navigated and queried by users. Experimental results are provided for two large text databases.
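The idea of treating neighborhoods as candidate clusters and scoring them with a confidence measure can be illustrated with a minimal sketch. The abstract does not spell out the measure, so the sketch assumes the association-rule form conf(A, B) = |A ∩ B| / |A| and scores a neighborhood by averaging that value over the equal-size neighborhoods of its members. The function names (`k_nearest`, `confidence`, `self_confidence`) and the toy points are hypothetical, and the exact brute-force neighbor search stands in for the approximate similarity search that the paper relies on for scalability.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], including i itself."""
    order = sorted(range(len(points)), key=lambda j: dist(points[i], points[j]))
    return set(order[:k])


def confidence(a, b):
    """Assumed association-rule style confidence: |A & B| / |A|."""
    return len(a & b) / len(a)


def self_confidence(points, i, k):
    """Average confidence between the neighborhood of i and the
    equal-size neighborhoods of each of its members (assumed scoring)."""
    hood = k_nearest(points, i, k)
    return sum(confidence(hood, k_nearest(points, j, k)) for j in hood) / len(hood)


# Hypothetical toy data: two tight groups of three points and one outlier.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),   # group A
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),   # group B
    (9.0, 0.0),                           # outlier
]

tight = self_confidence(points, 0, 3)  # neighborhood centered inside group A
loose = self_confidence(points, 6, 3)  # neighborhood centered on the outlier
print(tight, loose)
```

In this toy setting the neighborhood centered inside a group is fully reciprocated by its members, while the neighborhood centered on the outlier is only partly shared by the group members it borrows, so it scores lower. A neighborhood with a high score would be a plausible cluster candidate under these assumptions.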