AJAX Crawl: Making AJAX Applications Searchable

Authors:
Cristian Duda;Gianni Frey;Donald Kossmann;Reto Matter;Chong Zhou
Venue:
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Year:
2009

Abstract

Search engines such as Google and Yahoo! are the prevalent way to search the Web. Search over dynamic client-side Web pages is, however, either nonexistent or far from perfect, and it is not addressed by existing work, for example on the Deep Web. This is a real impediment, since AJAX and Rich Internet Applications are already very common on the Web. AJAX applications are composed of states that the user can see, but the search engine cannot, and that the user changes through client-side events. Current search engines either ignore AJAX applications or produce false negatives. The reason is that crawling client-side code is a difficult problem that cannot be solved naively by invoking user events. The challenges are the lack of caching, the detection of duplicate states, very granular events, reducing the number of AJAX calls, and infinite event invocation. This paper sets the stage for this new search challenge and proposes a solution: it shows how an AJAX Web application can be crawled at the granularity of its application states. A model of AJAX Web sites is presented. An AJAX crawler and optimizations for caching and duplicate elimination are defined. Finally, the gain in search result quality and the corresponding performance cost are evaluated on YouTube, a real AJAX application.
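
To make the idea of crawling at the granularity of application states more concrete, the following is a minimal sketch in Python. It is not the paper's algorithm, which the abstract does not spell out. It only illustrates three of the listed challenges under simplifying assumptions: states are plain strings standing in for a serialized page, duplicate states are detected by hashing their content, server responses are cached so repeated AJAX calls are avoided, and a hard limit on the number of states guards against infinite event invocation. All names (`state_id`, `crawl`, `CachedServer`, `events_of`, `fire`, `PAGES`) are hypothetical, and the toy comment pager is invented for illustration.

```python
import hashlib
from collections import deque


def state_id(content: str) -> str:
    """Identify a state by a hash of its content, so duplicate states collapse."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


class CachedServer:
    """Wraps a (hypothetical) server fetch function and caches its responses."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = {}
        self.calls = 0  # number of real (uncached) server calls

    def get(self, request):
        if request not in self._cache:
            self.calls += 1
            self._cache[request] = self._fetch(request)
        return self._cache[request]


def crawl(initial, events_of, fire, max_states=100):
    """Breadth-first crawl over application states.

    initial    -- content of the initial state
    events_of  -- function: content -> iterable of event names
    fire       -- function: (content, event) -> content after the event
    max_states -- assumed bound that stops unbounded event invocation
    Returns (states, transitions).
    """
    start = state_id(initial)
    states = {start: initial}
    transitions = []
    queue = deque([start])
    while queue:
        src = queue.popleft()
        for event in events_of(states[src]):
            target = fire(states[src], event)
            dst = state_id(target)
            if dst not in states:
                if len(states) >= max_states:
                    continue  # bound reached: do not record new states
                states[dst] = target
                queue.append(dst)
            transitions.append((src, event, dst))
    return states, transitions


# Toy application (an assumption for illustration): a three-page comment pager.
PAGES = {1: "first comments", 2: "more comments", 3: "last comments"}
server = CachedServer(lambda page: PAGES[page])


def render(page):
    return f"page {page}: {server.get(page)}"


def events_of(content):
    return ["next", "prev"]


def fire(content, event):
    page = int(content.split(":")[0].split()[1])
    page = min(page + 1, 3) if event == "next" else max(page - 1, 1)
    return render(page)


states, transitions = crawl(render(1), events_of, fire)
print(len(states), len(transitions), server.calls)
```

In this toy run the crawler invokes six events but discovers only three distinct states, and the cache keeps the number of real server calls at three. A real crawler would need a browser or JavaScript engine to fire events and a more robust notion of state equality than an exact content hash.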