Crawling rich internet applications: the state of the art

  • Authors:
  • Suryakant Choudhary;Mustafa Emre Dincturk;Seyed M. Mirtaheri;Ali Moosavi;Gregor von Bochmann;Guy-Vincent Jourdan;Iosif Viorel Onut

  • Affiliations:
  • EECS, University of Ottawa, Ottawa, Ontario, Canada;EECS, University of Ottawa, Ottawa, Ontario, Canada;EECS, University of Ottawa, Ottawa, Ontario, Canada;EECS, University of Ottawa, Ottawa, Ontario, Canada;EECS, University of Ottawa, Ottawa, Ontario, Canada and Fellow of IBM Canada CAS Research, Markham, Ontario, Canada;EECS, University of Ottawa, Ottawa, Ontario, Canada and Fellow of IBM Canada CAS Research, Markham, Ontario, Canada;R&D, IBM®® Security AppScan® Enterprise, Ottawa, Ontario, Canada and IBM Canada Software Lab, Canada

  • Venue:
  • CASCON '12 Proceedings of the 2012 Conference of the Center for Advanced Studies on Collaborative Research
  • Year:
  • 2012

Abstract

Web applications have come a long way, both in their adoption as a means of providing information and services and in the technologies used to develop them. With the emergence of richer and more advanced technologies such as AJAX, web applications have become more interactive, responsive, and user-friendly. These applications, often called Rich Internet Applications (RIAs), differ from traditional web applications in two ways: (1) dynamic manipulation of client-side state and (2) asynchronous communication with the server. At the same time, however, these techniques have introduced new challenges. One important challenge is the difficulty of automatically crawling these new applications. Without crawling, RIAs can be neither indexed nor tested automatically, and traditional crawlers are not able to handle these newer technologies. This paper surveys the research on the problem of crawling RIAs and provides experimental results that compare existing crawling strategies. In addition, we outline some future directions for research on crawling RIAs.