A study of automation from seed URL generation to focused web archive development: the CTRnet context

Authors:
Seungwon Yang; Kiran Chitturi; Gregory Wilson; Mohamed Magdy; Edward A. Fox
Affiliations:
Virginia Tech, Blacksburg, VA, USA (all authors)
Venue:
Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries
Year:
2012

Citing 1
Cited 0

The utility of tweeted URLs for web search

Proceedings of the 19th international conference on World wide web


Abstract

In emergencies and disasters, massive amounts of web resources are generated and shared. Because those resources change rapidly, it is important to start archiving them as soon as a disaster occurs. This led us to develop a prototype system that constructs archives with minimal human intervention, using seed URLs extracted from tweet collections. We present the details of the prototype and evaluate it on five tweet collections that had been developed in advance. We also identify five categories of non-relevant files and conclude with a discussion of findings from the evaluation.
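
The abstract does not describe how seed URLs are obtained from the tweets, so the following is only a minimal sketch of one plausible approach, not the prototype's actual method. It assumes tweets are available as plain strings, and it assumes that a URL shared in several tweets makes a better crawl seed than one shared once. The function name `extract_seed_urls`, the `min_count` threshold, and the regular expression are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Assumed pattern: an http(s) URL runs until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_seed_urls(tweets, min_count=2):
    """Return URLs found in at least `min_count` tweets, most frequent first.

    Hypothetical helper: the frequency threshold is an illustrative
    heuristic, not something stated in the abstract.
    """
    counts = Counter()
    for text in tweets:
        # Strip trailing punctuation and count each URL at most once per tweet.
        urls = {u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(text)}
        counts.update(urls)
    # Sort by descending count, then alphabetically so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, n in ranked if n >= min_count]


if __name__ == "__main__":
    sample = [
        "Flood update http://example.org/a.",
        "See http://example.org/a and http://example.org/b",
        "RT http://example.org/a",
        "no links here",
    ]
    print(extract_seed_urls(sample))
```

A real pipeline would probably also need to expand shortened URLs and filter out non-relevant pages before crawling; both steps are omitted here.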