Mining the blogosphere for top news stories identification

Authors:
Yeha Lee;Hun-young Jung;Woosang Song;Jong-Hyeok Lee
Affiliations:
POSTECH, San 31 Hyoja-dong, Nam-gu, Pohang 790-784, Republic of Korea (all four authors)
Venue:
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Year:
2010

Abstract

The analysis of query logs from blog search engines shows that news-related queries occupy a significant portion of the logs. This raises an interesting research question: can the blogosphere be used to identify important news stories? In this paper, we present novel approaches to identifying important news story headlines from the blogosphere for a given day. The proposed system consists of two components based on the language model framework: the query likelihood and the news headline prior. For the query likelihood, we propose several approaches to estimating the query language model and the news headline language model. We also suggest several criteria for evaluating the news headline prior, that is, the prior belief about the importance or newsworthiness of a news headline for a given day. Experimental results show that our system significantly outperforms a baseline system. Specifically, the proposed approach gives 2.62% and 10.19% further increases in MAP and P@5, respectively, over the best-performing result of the TREC'09 Top Stories Identification Task.
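To make the two-component scoring concrete, the following is a minimal sketch of ranking headlines by a log prior plus a query-likelihood term. It is not the authors' implementation: the abstract does not specify the estimators, so the whitespace tokenizer, the Dirichlet smoothing with parameter `mu`, the cross-entropy form of the likelihood, and the function names `unigram_model`, `score_headline`, and `rank_headlines` are all assumptions made for illustration. The prior values are supplied by the caller and stand in for whatever newsworthiness criteria are used.

```python
import math
from collections import Counter


def unigram_model(texts):
    """Maximum-likelihood unigram distribution over whitespace tokens (assumed tokenizer)."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def score_headline(headline, day_model, background, prior, mu=10.0):
    """Return log P(H) + sum_w P(w | day) * log P(w | H).

    P(w | H) is a Dirichlet-smoothed headline language model; the smoothing
    choice and the value of mu are assumptions, not taken from the paper.
    """
    h_counts = Counter(headline.lower().split())
    h_len = sum(h_counts.values())
    score = math.log(prior)
    for word, p_day in day_model.items():
        p_bg = background.get(word, 1e-9)  # small floor for unseen words
        p_h = (h_counts[word] + mu * p_bg) / (h_len + mu)
        score += p_day * math.log(p_h)
    return score


def rank_headlines(headlines, day_posts, all_posts, priors, mu=10.0):
    """Rank headlines for one day, highest score first."""
    day_model = unigram_model(day_posts)     # query model from that day's blog posts
    background = unigram_model(all_posts)    # collection model used for smoothing
    scored = [
        (score_headline(h, day_model, background, priors[h], mu), h)
        for h in headlines
    ]
    return [h for _, h in sorted(scored, reverse=True)]


# Toy data, invented purely for illustration.
day_posts = ["earthquake hits the city", "city shaken by earthquake"]
all_posts = day_posts + ["stock market closes higher today", "the market was quiet"]
headlines = ["Earthquake hits coastal city", "Stock market closes higher"]

uniform = {h: 0.5 for h in headlines}
ranking_uniform = rank_headlines(headlines, day_posts, all_posts, uniform)

skewed = {headlines[0]: 0.1, headlines[1]: 0.9}
ranking_skewed = rank_headlines(headlines, day_posts, all_posts, skewed)
```

With a uniform prior the ranking is driven by the likelihood term alone, so the headline that shares vocabulary with the day's posts comes first. A sufficiently skewed prior can override that ordering, which is why the choice of prior criteria matters as a separate component.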