AJAXSearch: crawling, indexing and searching web 2.0 applications

Authors:
Cristian Duda;Gianni Frey;Donald Kossmann;Chong Zhou
Affiliations:
ETH Zurich, Switzerland;ETH Zurich, Switzerland;ETH Zurich, Switzerland;Honghuan Univ. of Science and Technology, China
Venue:
Proceedings of the VLDB Endowment
Year:
2008

Abstract

Search engines such as Google and Yahoo! dominate Web search. Support for searching dynamic pages, however, is either nonexistent or far from perfect. AJAX and Rich Internet Applications are prominent examples of such pages. They are increasingly common on the Web (YouTube, Amazon, GMail, Yahoo! Mail) and on mobile devices, and they offer the user a high degree of interactivity by seamlessly loading content from the server without refreshing the page. Current search engines cannot index AJAX applications correctly: because they do not understand the application logic that loads content dynamically, they produce false positives and false negatives. Crawling an AJAX application is a difficult problem. Since content changes as the user invokes events on the page, a crawler must identify the different application states generated by the client-side logic. This demo sets the stage for this new type of search and shows that a search engine for AJAX can be built. Compared with traditional search engines, the challenges include: automatically identifying states by triggering events, crawling application states efficiently, avoiding the invocation of a potentially very large number of events, scaling with the number of events, eliminating duplicate states, presenting and aggregating results, and ranking. The demo presents the AJAX search engine (crawler, indexer, and query processor) applied to a real application, and showcases the challenges and solutions.
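
The idea of identifying application states by triggering events and eliminating duplicate states can be illustrated with a small sketch. This is not the AJAXSearch implementation: the function names (`state_id`, `crawl_states`), the toy application, and the choice of a content hash as the duplicate test are all assumptions made for illustration. A real crawler would drive a browser or JavaScript engine where this sketch calls the hypothetical `fire` callback.

```python
from collections import deque
from hashlib import sha256


def state_id(dom_text):
    """Fingerprint a state by hashing its whitespace-normalized content.

    Assumption: two states with identical text are treated as duplicates.
    """
    normalized = " ".join(dom_text.split())
    return sha256(normalized.encode("utf-8")).hexdigest()


def crawl_states(initial_dom, events_for, fire, max_states=1000):
    """Breadth-first exploration of application states.

    events_for(dom) -> iterable of event names available in that state
    fire(dom, event) -> content of the page after triggering the event
    Both callbacks are hypothetical stand-ins for a real browser driver.
    Returns (states, transitions).
    """
    states = {state_id(initial_dom): initial_dom}
    transitions = []
    queue = deque([initial_dom])
    while queue:
        dom = queue.popleft()
        src = state_id(dom)
        for event in events_for(dom):
            new_dom = fire(dom, event)
            dst = state_id(new_dom)
            transitions.append((src, event, dst))
            # Duplicate elimination: only unseen states are stored and
            # explored further, up to the max_states cap.
            if dst not in states and len(states) < max_states:
                states[dst] = new_dom
                queue.append(new_dom)
    return states, transitions


# Toy application (hypothetical): three comment pages linked by next/prev.
PAGES = ["comments page 1", "comments page 2", "comments page 3"]


def toy_events(dom):
    i = PAGES.index(dom)
    events = []
    if i + 1 < len(PAGES):
        events.append("next")
    if i > 0:
        events.append("prev")
    return events


def toy_fire(dom, event):
    i = PAGES.index(dom)
    return PAGES[i + 1] if event == "next" else PAGES[i - 1]


states, transitions = crawl_states(PAGES[0], toy_events, toy_fire)
```

In this toy run the crawler fires every available event in every state, yet each page is stored only once, because a state reached again through `prev` hashes to an identifier that has already been seen. The stored states could then be indexed individually, with the recorded transitions describing how to reach each one.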