WisColl: Collective wisdom based blog clustering

Authors:
Nitin Agarwal;Magdiel Galan;Huan Liu;Shankar Subramanya
Affiliations:
Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR 72204, United States;Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, United States;Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, United States;Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, United States
Venue:
Information Sciences: an International Journal
Year:
2010

Citing 16
Cited 8

Quantified Score

Hi-index	0.07

Abstract

The blogosphere is expanding at an unprecedented speed. A better understanding of the blogosphere can greatly facilitate the development of the Social Web to serve the needs of users, service providers, and advertisers. One important task in this process is clustering blog sites. Although a good number of traditional clustering methods exist, they are not designed to take into account the blogosphere's unique characteristics, so clustering blog sites presents new challenges. A prominent feature of the Social Web is that many enthusiastic bloggers voluntarily write, tag, and catalog their posts in order to reach the widest possible audience who will share their thoughts and appreciate their ideas. In the process, a new kind of collective wisdom is generated. We propose WisColl, which taps into this collective wisdom when clustering blog sites. In this paper, we study how clustering with collective wisdom can be achieved and compare it with a representative traditional clustering method. We present statistical and visual results, report findings, and suggest future work extending to many real-world applications.