Calculating content recency based on timestamped and non-timestamped sources for supporting page quality estimation

Authors:
Adam Jatowt;Yukiko Kawai;Katsumi Tanaka
Affiliations:
Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto, Japan and Microsoft IJARC Fellow;Kyoto Sangyo University, Motoyama, Kamigamo, Kita-Ku, Kyoto, Japan;Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto, Japan
Venue:
Proceedings of the 2011 ACM Symposium on Applied Computing
Year:
2011

Abstract

The web is characterized by low publishing barriers, so its content varies widely in quality and credibility. Web searchers often find it difficult to locate high-quality content among returned search results. In this paper, we propose evaluating the extent to which search results contain recent information related to user queries. Our approach corroborates search results with query-related information obtained from timestamped and non-timestamped sources: it uses news articles collected from online news archives and also employs a simple search index mining process to find terms representing fresh topics. As another contribution, we show how the proposed approach can be used to estimate the focus time of web pages, that is, the time periods to which the content of pages refers. We demonstrate a proof-of-concept system that evaluates and visualizes, in real time, the freshness levels and focus time of web search results.
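
The abstract gives no formulas, so the following is only an illustrative sketch of the general idea of corroborating a page against timestamped news text, not the authors' method. It assumes query-related news articles are already grouped by period labels that sort chronologically (for example "2010-11"), and it substitutes plain term-frequency cosine similarity for whatever weighting the paper actually uses. The names `tokenize`, `cosine`, `period_profiles`, and `score_page` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def period_profiles(news):
    """Merge (period_label, text) pairs into one term profile per period."""
    profiles = {}
    for period, text in news:
        profiles.setdefault(period, Counter()).update(tokenize(text))
    return profiles


def score_page(page_text, news):
    """Return (freshness, focus_period, similarities) for one page.

    Assumption: freshness is the similarity to the latest period, and the
    focus time is the period whose news vocabulary the page matches best.
    """
    profiles = period_profiles(news)
    if not profiles:
        return 0.0, None, {}
    page = Counter(tokenize(page_text))
    sims = {p: cosine(page, prof) for p, prof in profiles.items()}
    latest = max(sims)                 # labels are assumed to sort by time
    focus = max(sims, key=sims.get)    # best-matching period
    return sims[latest], focus, sims


# Made-up sample data, for illustration only.
news = [
    ("2010-09", "election campaign polls candidates"),
    ("2010-11", "election results winner announced recount"),
]
fresh_new, focus_new, _ = score_page("winner announced after election results", news)
fresh_old, focus_old, _ = score_page("candidates campaign polls ahead of the election", news)
print(focus_new, focus_old, fresh_new > fresh_old)
```

In this toy setup, a page that shares vocabulary with the latest news period receives a higher freshness score, while a page that matches an earlier period is assigned that earlier period as its focus time. Term mining from a search index (the non-timestamped source mentioned in the abstract) is not modeled here.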