Cool Blog Identification Using Topic-Based Models

Authors:
Kritsada Sriphaew;Hiroya Takamura;Manabu Okumura
Venue:
WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
Year:
2008

Citing 5

Learning and Revising User Profiles: The Identification of Interesting Web Sites
Machine Learning - Special issue on multistrategy learning

Probabilistic latent semantic indexing
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval

Latent dirichlet allocation
The Journal of Machine Learning Research

How do users evaluate the credibility of Web sites?: a study with over 2,500 participants
Proceedings of the 2003 conference on Designing for user experiences

Syskill & webert: Identifying interesting web sites
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1

Cited 2

Cool Blog Classification from Positive and Unlabeled Examples
PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining

Credibility-inspired ranking for blog post retrieval
Information Retrieval

Abstract

Among the huge number of blogs on the internet, only some are considered to have great content and to be worth exploring. We call such blogs cool blogs and attempt to identify them. To solve the cool blog identification problem, we make three assumptions about cool blogs: (1) cool blogs tend to have definite topics, (2) cool blogs tend to have a sufficient number of blog entries, and (3) cool blogs tend to have a certain level of topic consistency among their blog entries. Corresponding to these assumptions, we respectively extract a mixture of topic probabilities using a topic model, exploit the number of blog entries of each blog, and calculate the topic consistency among blog entries using distance functions over topic probabilities. We show the benefits of the proposed assumptions through these features. A feature unification model is also presented to achieve the highest effectiveness. Experimental results on Japanese blog data show that applying the proposed assumptions improves the classification results.
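
The three feature types named in the abstract can be illustrated with a minimal sketch. The abstract does not say which distance functions or aggregation the authors use, so everything here is an assumption: the Jensen-Shannon divergence stands in for "distance functions over topic probabilities", the blog-level topic mixture is taken as the mean of per-entry topic vectors, and the names `blog_features` and `entry_topics` are hypothetical. The per-entry topic probabilities are assumed to come from some topic model fitted beforehand.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def blog_features(entry_topics):
    """Compute illustrative features for one blog.

    entry_topics: list of topic-probability vectors, one per blog entry
    (hypothetical input, e.g. the output of a topic model).
    """
    n = len(entry_topics)
    k = len(entry_topics[0])
    # Assumption (1): topic mixture of the blog, here the mean over entries.
    mixture = [sum(entry[t] for entry in entry_topics) / n for t in range(k)]
    # Assumption (3): topic consistency, measured here as the mean divergence
    # of each entry from the blog mixture (lower means more consistent).
    inconsistency = sum(js_divergence(entry, mixture) for entry in entry_topics) / n
    return {
        "topic_mixture": mixture,
        "num_entries": n,  # Assumption (2): number of blog entries.
        "topic_inconsistency": inconsistency,
    }


# Made-up topic vectors for two blogs with three entries each.
focused = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.85, 0.15, 0.0]]
scattered = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.9], [0.1, 0.8, 0.1]]

focused_features = blog_features(focused)
scattered_features = blog_features(scattered)
```

With these made-up inputs, the blog whose entries stay on one topic gets a lower `topic_inconsistency` than the blog whose entries jump between topics. The resulting feature values could then be concatenated into a single vector for a classifier, which is one simple reading of feature unification and not necessarily the model the paper proposes.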