In this paper we focus on the problem of ranking news stories within their historical context by exploiting their content similarity. We observe that news stories evolve and thus have to be ranked in a time- and query-dependent manner. We do this in two steps. First, the mining step discovers metastories, which are meaningful groups of similar stories that occur at arbitrary points in time. Second, the ranking step uses well-known measures of content similarity to construct implicit links among all metastories, and uses these links to rank the metastories that overlap the time interval provided in a user query. We use real data from conventional and social media sources (weblogs) to study the impact of different meta-aggregation techniques and similarity measures on the final ranking. We evaluate the framework using both objective and subjective criteria, and discuss the choice of clustering method and similarity measure that leads to the best ranking results.
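
The ranking step can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes metastories have already been mined, represents each one as a bag of words, uses cosine similarity as the content measure, and scores nodes with a weighted PageRank-style iteration over the implicit links. The function name `rank_metastories`, the record fields (`id`, `start`, `end`, `text`), and the `threshold` and `damping` parameters are all hypothetical, and only the time interval of the query is modeled here.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_metastories(metastories, query_start, query_end,
                     threshold=0.1, damping=0.85, iterations=50):
    """Rank metastories that overlap [query_start, query_end].

    Each metastory is a dict with hypothetical keys:
    "id", "start", "end" (comparable timestamps) and "text".
    """
    n = len(metastories)
    if n == 0:
        return []

    # Bag-of-words vectors (an assumed, deliberately simple representation).
    vectors = [Counter(m["text"].lower().split()) for m in metastories]

    # Implicit links among ALL metastories, weighted by content similarity.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine(vectors[i], vectors[j])
            if sim >= threshold:
                weights[i][j] = weights[j][i] = sim

    # Weighted PageRank-style power iteration (an assumed scoring scheme).
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = [(1.0 - damping) / n] * n
        for i in range(n):
            out_weight = sum(weights[i])
            if out_weight == 0.0:
                # Unlinked node: spread its score uniformly.
                for j in range(n):
                    updated[j] += damping * scores[i] / n
            else:
                for j in range(n):
                    if weights[i][j]:
                        updated[j] += damping * scores[i] * weights[i][j] / out_weight
        scores = updated

    # Keep only the metastories whose time span overlaps the query interval.
    hits = [(m["id"], scores[i]) for i, m in enumerate(metastories)
            if m["start"] <= query_end and m["end"] >= query_start]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Links are built over the full set of metastories and the time filter is applied only at the end, so a metastory inside the query interval can still gain score from similar metastories outside it. This mirrors the abstract's wording ("implicit links among all metastories"), but the exact scoring scheme is an assumption.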