Every piece of textual data is written to convey its authors' opinions on specific topics. Authors deliberately organize their writing and create links, i.e., references and acknowledgments, for better expression. It is therefore of interest to study texts together with their relations to understand the underlying topics and communities. Although there has been much work on data clustering and topic mining, existing methods are not applicable to community discovery on large document corpora for several reasons. First, few of them consider both textual attributes and relations. Second, scalability remains a significant issue for large-scale datasets. Third, most algorithms rely on a set of initial parameters that are hard to determine and tune. Motivated by these observations, this paper proposes a hierarchical community model that distinguishes community cores from affiliated members. We present our efforts to develop a scalable community discovery solution for large-scale document corpora. Our proposal quickly identifies potential cores as seeds of communities through relation analysis. To eliminate the influence of initial parameters, an innovative attribute-based core merge process is introduced so that the algorithm returns consistent communities regardless of the initial parameters. Experimental results suggest that the proposed method scales well with corpus size and feature dimensionality, with more than 15% improvement in topical precision compared with popular clustering techniques.
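The abstract gives only the outline of the method, so the following is a toy sketch of the three stages it names (core identification from relations, attribute-based core merging, and attachment of affiliated members), not the paper's algorithm. Everything concrete here is an assumption made for illustration: treating reciprocated links as the relation signal, bag-of-words cosine similarity as the attribute measure, the `threshold` value, and all function names.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def profile(docs, texts):
    """Aggregate bag-of-words profile of a group of documents."""
    counts = Counter()
    for d in docs:
        counts.update(texts[d].lower().split())
    return counts


def find_cores(links):
    """Relation analysis (assumed): cores are connected components of
    reciprocated links. `links` is a set of (source, target) pairs."""
    mutual = {}
    for src, dst in links:
        if src != dst and (dst, src) in links:
            mutual.setdefault(src, set()).add(dst)
            mutual.setdefault(dst, set()).add(src)
    seen, cores = set(), []
    for node in sorted(mutual):
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(mutual[n] - comp)
        seen |= comp
        cores.append(comp)
    return cores


def merge_cores(cores, texts, threshold=0.5):
    """Attribute-based merge (assumed): repeatedly fuse any two cores whose
    text profiles have cosine similarity >= threshold (hypothetical value)."""
    cores = [set(c) for c in cores]
    merged = True
    while merged:
        merged = False
        for i in range(len(cores)):
            for j in range(i + 1, len(cores)):
                sim = cosine(profile(cores[i], texts), profile(cores[j], texts))
                if sim >= threshold:
                    cores[i] |= cores.pop(j)
                    merged = True
                    break
            if merged:
                break
    return cores


def assign_members(cores, texts):
    """Attach each non-core document to the core with the most similar text."""
    communities = [{"core": set(c), "members": set()} for c in cores]
    in_core = set().union(*cores) if cores else set()
    for doc in texts:
        if doc in in_core or not communities:
            continue
        vec = profile([doc], texts)
        best = max(communities, key=lambda c: cosine(profile(c["core"], texts), vec))
        best["members"].add(doc)
    return communities


# Tiny made-up corpus: document id -> text, plus directed links between documents.
texts = {
    "a": "graph community detection",
    "b": "community detection graph clustering",
    "c": "graph clustering community",
    "d": "community graph detection",
    "e": "protein folding biology",
    "f": "protein biology folding",
    "g": "community detection survey",
    "h": "biology protein",
}
links = {("a", "b"), ("b", "a"), ("c", "d"), ("d", "c"),
         ("e", "f"), ("f", "e"), ("g", "a"), ("h", "e")}

communities = assign_members(merge_cores(find_cores(links), texts), texts)
```

On this made-up data the two graph-themed cores are fused by the merge step, leaving one graph community and one biology community, each with a single affiliated member. The sketch still depends on a similarity threshold, so it does not reproduce the parameter-insensitivity the abstract claims for the actual method.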