Several methods mine communities on the Web using hyperlinks. One well-known method is the max-flow based method proposed by Flake et al. It adopts a page-oriented framework: like other methods, including HITS and trawling, it uses a Web page as the unit of information. Recently, Asano et al. built a site-oriented framework, which uses a site as the unit of information, and showed experimentally that trawling on the site-oriented framework often outputs significantly better communities than trawling on the page-oriented framework. However, it has not been known whether the site-oriented framework is also effective for mining communities with the max-flow based method. In this paper, we first point out several problems of the max-flow based method, most of which stem from the page-oriented framework, and then propose solutions that exploit several advantages of the site-oriented framework. Computational experiments show that, compared with the original max-flow based method on the page-oriented framework, our max-flow based method on the site-oriented framework is very effective at mining communities related to the topics of the given pages.
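To make the idea concrete, the following is a minimal sketch of max-flow based community extraction, not the authors' implementation. It rests on several assumptions that the abstract does not state: hyperlinks are treated as undirected edges of capacity k (the number of seeds by default), a virtual source is joined to the seeds with infinite capacity, every non-seed node is joined to a virtual sink with capacity 1, and the community is the set of nodes on the source side of a minimum cut. The helper `collapse_to_sites` is a hypothetical simplification that treats a URL's host name as its site; the site definition used by Asano et al. may differ.

```python
from collections import defaultdict, deque
from urllib.parse import urlsplit


def collapse_to_sites(page_links):
    """Map page-level links to site-level links (assumption: site = host name)."""
    site_links = set()
    for u, v in page_links:
        su, sv = urlsplit(u).netloc, urlsplit(v).netloc
        if su != sv:  # links inside one site carry no inter-site information
            site_links.add((su, sv))
    return site_links


def min_cut_source_side(capacity, source, sink):
    """Run Edmonds-Karp and return the nodes still reachable from the source."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
            residual[v][u] += 0  # make sure the reverse edge exists

    def bfs():
        # Breadth-first search over edges with positive residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if sink not in parent:
            return set(parent)  # no augmenting path left: this is the source side
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[a][b] for a, b in path)
        for a, b in path:
            residual[a][b] -= bottleneck
            residual[b][a] += bottleneck


def extract_community(links, seeds, edge_capacity=None):
    """Return the source side of a minimum cut around the seed nodes."""
    seeds = set(seeds)
    k = len(seeds) if edge_capacity is None else edge_capacity
    source, sink = "_source_", "_sink_"  # hypothetical virtual node names
    capacity = defaultdict(dict)
    nodes = set()
    for u, v in links:
        if u == v:
            continue
        nodes.update((u, v))
        capacity[u][v] = k
        capacity[v][u] = k  # assumption: links are treated as undirected
    for s in seeds:
        capacity[source][s] = float("inf")
    for n in nodes - seeds:
        capacity[n][sink] = 1
    return min_cut_source_side(capacity, source, sink) - {source}


if __name__ == "__main__":
    toy_links = [("a", "b"), ("a", "c"), ("b", "c"),
                 ("c", "x"), ("x", "y"), ("x", "z"), ("y", "z")]
    print(sorted(extract_community(toy_links, {"a", "b"})))  # ['a', 'b', 'c']
```

Because `extract_community` only sees node identifiers, the same routine runs on either framework: pass page URLs directly for the page-oriented case, or pass the output of `collapse_to_sites` (with seed host names) for the site-oriented case.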