We propose a new algorithm capable of partitioning a set of documents or other samples based on an embedding in a high dimensional Euclidean space (i.e., in which every document is a vector of real numbers). The method is unusual in that it is divisive, as opposed to agglomerative, and operates by repeatedly splitting clusters into smaller clusters. The documents are assembled into a matrix which is very sparse. It is this sparsity that permits the algorithm to be very efficient. The performance of the method is illustrated with a set of text documents obtained from the World Wide Web. Some possible extensions are proposed for further investigation.
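
To make the divisive idea concrete, here is a minimal sketch of repeated splitting over sparse document vectors. The abstract does not say how a cluster is split, which cluster is split next, or when to stop, so everything specific below is an assumption made for illustration: documents are split by the sign of their projection onto the leading principal direction of the cluster (found by power iteration), the largest cluster is always split next, and splitting stops at a requested count `k`. The names `sparse_dot`, `principal_projections`, and `divisive_clusters` are hypothetical.

```python
import math

def sparse_dot(doc, dense):
    """Dot product of a sparse document {term_index: weight} with a dense list."""
    return sum(w * dense[t] for t, w in doc.items())

def principal_projections(docs, dim, iters=100):
    """Project each document onto the leading principal direction of the cluster.

    Power iteration on the covariance of the mean-centered documents. The
    centered matrix is never formed, so the work per iteration is proportional
    to the number of nonzero entries plus the dimension.
    """
    n = len(docs)
    mean = [0.0] * dim
    for doc in docs:
        for t, w in doc.items():
            mean[t] += w / n
    # Deterministic, non-uniform start vector (an arbitrary choice).
    u = [float(t + 1) for t in range(dim)]
    for _ in range(iters):
        mean_proj = sum(m * x for m, x in zip(mean, u))
        proj = [sparse_dot(doc, u) - mean_proj for doc in docs]
        # new_u = sum_j proj_j * (doc_j - mean), accumulated sparsely.
        total = sum(proj)
        new_u = [-total * m for m in mean]
        for doc, p in zip(docs, proj):
            for t, w in doc.items():
                new_u[t] += p * w
        norm = math.sqrt(sum(x * x for x in new_u))
        if norm == 0.0:  # all documents coincide; no direction to find
            break
        u = [x / norm for x in new_u]
    mean_proj = sum(m * x for m, x in zip(mean, u))
    return [sparse_dot(doc, u) - mean_proj for doc in docs]

def divisive_clusters(docs, dim, k):
    """Start from one cluster and split the largest one until k clusters exist."""
    clusters = [list(range(len(docs)))]
    while len(clusters) < k:
        clusters.sort(key=len)
        target = clusters.pop()  # largest cluster (assumed selection rule)
        if len(target) < 2:
            clusters.append(target)
            break
        proj = principal_projections([docs[i] for i in target], dim)
        left = [i for i, p in zip(target, proj) if p <= 0]
        right = [i for i, p in zip(target, proj) if p > 0]
        if not left or not right:  # split failed; stop rather than loop forever
            clusters.append(target)
            break
        clusters += [left, right]
    return clusters

# Toy data: three documents use terms 0-1, three use terms 2-3.
docs = [
    {0: 1.0, 1: 1.0}, {0: 1.0, 1: 0.5}, {0: 0.5, 1: 1.0},
    {2: 1.0, 3: 1.0}, {2: 1.0, 3: 0.5}, {2: 0.5, 3: 1.0},
]
clusters = divisive_clusters(docs, dim=4, k=2)
```

On this toy input the two topic groups end up in separate clusters. The sketch only ever touches the nonzero entries of each document plus one dense vector per cluster, which is one way the sparsity mentioned above can be exploited.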