A new algorithm for document clustering is introduced. Its underlying concept, the cover coefficient (CC), provides a means of estimating the number of clusters within a document database and relates indexing and clustering analytically. The CC concept is also used to identify the cluster seeds and to form clusters around these seeds. The complexity of the clustering process is shown to be very low. Retrieval experiments show that the information-retrieval effectiveness of the algorithm is comparable to that of a very demanding complete linkage clustering method that is known to have good retrieval performance. The experiments also show that the algorithm is 15.1 to 63.5 percent (47.5 percent on average) better than four other clustering algorithms in cluster-based information retrieval. The experiments validate the indexing-clustering relationships and the complexity of the algorithm, and show improvements in retrieval effectiveness. Two document databases are used in the experiments: TODS214 and INSPEC. The latter is a common database with 12,684 documents.
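The abstract does not give the formulas behind the CC concept, so the following is only a minimal sketch of how the cluster-count estimate is usually described in the cover-coefficient literature, not the authors' implementation. It assumes a document-by-term matrix D with no empty rows or columns, a cover coefficient c_ij = alpha_i * sum_k(d_ik * beta_k * d_jk), where alpha_i and beta_k are the reciprocals of the row and column sums, and an estimated cluster count equal to the sum of the diagonal entries c_ii. The function names and the toy matrix are hypothetical.

```python
def cover_coefficient_matrix(D):
    """Return C with C[i][j] = alpha_i * sum_k D[i][k] * beta_k * D[j][k].

    Assumed definition: alpha_i = 1 / (row sum i), beta_k = 1 / (column sum k).
    """
    m, n = len(D), len(D[0])
    row_sums = [sum(row) for row in D]
    col_sums = [sum(D[i][k] for i in range(m)) for k in range(n)]
    if any(s == 0 for s in row_sums) or any(s == 0 for s in col_sums):
        raise ValueError("every document and every term needs a nonzero entry")

    C = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            # Extent to which document i is covered by document j.
            covered = sum(D[i][k] * D[j][k] / col_sums[k] for k in range(n))
            C[i][j] = covered / row_sums[i]
    return C


def estimate_cluster_count(D):
    """Sum the diagonal of C (each document's coverage of itself)."""
    C = cover_coefficient_matrix(D)
    return sum(C[i][i] for i in range(len(D)))


# Hypothetical toy collection: two pairs of documents with disjoint vocabularies.
toy_D = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

print(estimate_cluster_count(toy_D))
```

On this toy matrix each document covers itself with coefficient 0.5, so the estimate is 2, matching the two visibly separate groups. Each row of C sums to 1 under the assumed definition, which is a quick sanity check on any implementation.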