A problem facing information retrieval on the web is how to cluster large numbers of web documents effectively. One approach is to cluster the documents using only the information in users' usage logs, not the content of the documents. In this paper, we present a recursive density-based clustering algorithm that adaptively changes its parameters. Our clustering algorithm, RDBC, is based on DBSCAN, a density-based algorithm with a proven ability to process very large datasets. Because DBSCAN does not require the number of clusters to be determined in advance and is linear in time complexity, it is particularly attractive for web page clustering. It can be shown that RDBC has the same time complexity as the DBSCAN algorithm. In addition, we prove both analytically and experimentally that our method yields clustering results that are superior to those of DBSCAN.
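
To make the usage-log idea concrete, the following is a minimal sketch of plain DBSCAN applied to pages described only by who visited them. It is not RDBC: the recursive parameter adjustment is not specified in this abstract, so it is left out. The page names, visitor sets, the Jaccard distance, and the values of `eps` and `min_pts` are all illustrative assumptions, and the neighbor search here is a naive quadratic scan rather than an indexed one.

```python
from collections import deque


def jaccard_distance(a, b):
    """Distance between two visitor sets: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def dbscan(items, eps, min_pts, dist):
    """Plain DBSCAN. Returns one label per item: 0, 1, ... for clusters, -1 for noise."""
    n = len(items)
    unvisited, noise = None, -1
    labels = [unvisited] * n

    def neighbors(i):
        # Naive O(n) scan per query; the point itself counts as a neighbor.
        return [j for j in range(n) if dist(items[i], items[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not unvisited:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = noise  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = deque(j for j in seeds if j != i)
        while queue:
            j = queue.popleft()
            if labels[j] == noise:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster += 1
    return labels


# Hypothetical usage log: page -> set of user ids that requested it.
page_visitors = {
    "a": {1, 2, 3},
    "b": {1, 2, 3, 4},
    "c": {2, 3, 4},
    "d": {7, 8, 9},
    "e": {7, 8, 9, 10},
    "f": {8, 9, 10},
    "g": {20},
}
pages = list(page_visitors)
labels = dbscan([page_visitors[p] for p in pages], eps=0.5, min_pts=3,
                dist=jaccard_distance)
clusters = dict(zip(pages, labels))
print(clusters)
```

In this toy log, pages that share most of their visitors end up in the same cluster, and a page seen by a single unrelated user is marked as noise. No page content is consulted at any point, and the number of clusters is not supplied in advance.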