A strategy for efficient crawling of rich internet applications

  • Authors:
  • Kamara Benjamin; Gregor Von Bochmann; Mustafa Emre Dincturk; Guy-Vincent Jourdan; Iosif Viorel Onut

  • Affiliations:
  • SITE, University of Ottawa, Ottawa, ON, Canada; SITE, University of Ottawa, Ottawa, ON, Canada; SITE, University of Ottawa, Ottawa, ON, Canada; SITE, University of Ottawa, Ottawa, ON, Canada; Research and Development, IBM, Rational, AppScan Enterprise, IBM, Ottawa, ON, Canada

  • Venue:
  • ICWE'11: Proceedings of the 11th International Conference on Web Engineering
  • Year:
  • 2011

Abstract

New web application development technologies such as Ajax, Flex, or Silverlight result in so-called Rich Internet Applications (RIAs), which provide enhanced responsiveness but introduce crawling challenges that traditional crawlers cannot address. This paper describes a novel crawling technique for RIAs. The technique first generates an optimal crawling strategy for an anticipated model of the crawled RIA, aiming to discover new states as quickly as possible. As the strategy is executed, if the discovered portion of the application's actual model deviates from the anticipated model, both the anticipated model and the strategy are updated to conform to the actual model. We compare the performance of our technique with that of a number of existing techniques, as well as with depth-first and breadth-first crawling, on a set of Ajax test applications. The results show that our technique performs better, often with a faster rate of state discovery.
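
The anticipate, execute, and correct loop described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the toy application graph, the `crawl` function, the dictionary form of the anticipated model, the greedy ranking rule, and the cost of 1 per reset are all assumptions made here for illustration. The crawler prefers events that the anticipated model predicts will reach an unseen state, and it overwrites a prediction whenever the observed transition deviates from it.

```python
from collections import deque


def crawl(app, initial, anticipated):
    """Toy model-based crawl (illustrative only, not the paper's method).

    app:         hidden application, state -> {event: next_state}
    anticipated: assumed model, (state, event) -> predicted next state
    Returns (states in discovery order, total cost, number of deviations).
    """
    known = {}                 # (state, event) -> observed next state
    seen = [initial]           # states in order of discovery
    model = dict(anticipated)  # working copy, corrected on deviation
    current, cost, deviations = initial, 0, 0

    def distance(src, dst):
        """Number of already-known transitions from src to dst, or None."""
        queue, dist = deque([src]), {src: 0}
        while queue:
            s = queue.popleft()
            if s == dst:
                return dist[s]
            for (a, _), b in known.items():
                if a == s and b not in dist:
                    dist[b] = dist[s] + 1
                    queue.append(b)
        return None

    def rank(pair):
        state, _ = pair
        # False (sorts first) when the model predicts an unseen target
        # or has no prediction at all.
        predicted_known = model.get(pair) in seen
        d = distance(current, state)
        if d is None:
            # Not reachable from here: assume a reset costs 1, then walk.
            d = 1 + distance(initial, state)
        return (predicted_known, d, pair)

    while True:
        pending = [(s, e) for s in seen for e in sorted(app[s])
                   if (s, e) not in known]
        if not pending:
            break
        _, d, (state, event) = min(rank(p) for p in pending)
        cost += d + 1                       # travel plus the new event
        actual = app[state][event]
        if model.get((state, event)) != actual:
            deviations += 1
            model[(state, event)] = actual  # conform the model to reality
        known[(state, event)] = actual
        if actual not in seen:
            seen.append(actual)
        current = actual
    return seen, cost, deviations


# Hypothetical three-state application; the model wrongly expects s2 --a--> s3.
app = {
    "s0": {"a": "s1", "b": "s0"},
    "s1": {"a": "s2", "b": "s0"},
    "s2": {"a": "s2", "b": "s0"},
}
anticipated = {
    ("s0", "a"): "s1", ("s0", "b"): "s0",
    ("s1", "a"): "s2", ("s1", "b"): "s0",
    ("s2", "a"): "s3", ("s2", "b"): "s0",
}
seen, cost, deviations = crawl(app, "s0", anticipated)
print(seen, cost, deviations)
```

In this toy run the events predicted to reach unseen states are executed first, so every state is found before the remaining transitions are filled in, and the single wrong prediction is corrected when it is observed.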