For increasingly sophisticated use cases, end users often need to extract, combine, and aggregate information from many web pages, often dynamically generated, across multiple websites. Current search engines do not focus on combining information from several web pages to answer the user's overall information need. Semantic Web and Linked Data approaches usually take a static view of the data and rely on the cooperation of data providers. In this paper, we present a novel approach that enables end users to easily extract data from web pages while they browse, store it locally in their browser, and structure, integrate, and search it. We propose Datalog rules for integrating and searching the extracted data. We show how cleaning steps and integration rules can be reused to accelerate the cleaning and integration of extracted data. The proposed approach is implemented as a browser plugin. We present its implementation details and report on our evaluation of the plugin with respect to user experience and savings in browsing time.
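To make the idea of Datalog rules over extracted data more concrete, the following is a minimal sketch in Python, not the plugin's actual implementation. It assumes extracted records are stored as tuples of the form (predicate, arguments...), that variables are strings starting with "?", and that rules are evaluated naively bottom-up until no new facts appear. The predicates site_a_hotel, site_b_price, offer, and in_vienna, as well as all data values, are hypothetical and chosen only for illustration.

```python
def is_var(term):
    # Assumption: variables are strings prefixed with "?"; everything else is a constant.
    return isinstance(term, str) and term.startswith("?")


def match(atom, fact, binding):
    """Unify one body atom with one fact; return the extended binding or None."""
    if len(atom) != len(fact) or atom[0] != fact[0]:
        return None
    binding = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            # Bind the variable, or check consistency with an earlier binding.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def apply_rule(head, body, facts):
    """Join the body atoms against the facts and instantiate the head."""
    bindings = [{}]
    for atom in body:
        extended = []
        for b in bindings:
            for fact in facts:
                b2 = match(atom, fact, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return {
        (head[0],) + tuple(b[t] if is_var(t) else t for t in head[1:])
        for b in bindings
    }


def evaluate(rules, facts):
    """Naive bottom-up evaluation: apply all rules until a fixpoint is reached."""
    facts = set(facts)
    while True:
        derived = set()
        for head, body in rules:
            derived |= apply_rule(head, body, facts)
        if derived <= facts:
            return facts
        facts |= derived


# Hypothetical facts extracted from two different websites.
facts = {
    ("site_a_hotel", "Hotel Aurora", "Vienna"),
    ("site_a_hotel", "Pension Birke", "Graz"),
    ("site_b_price", "Hotel Aurora", 120),
    ("site_b_price", "Pension Birke", 80),
}

rules = [
    # Integration rule: offer(N, C, P) :- site_a_hotel(N, C), site_b_price(N, P).
    (("offer", "?name", "?city", "?price"),
     [("site_a_hotel", "?name", "?city"), ("site_b_price", "?name", "?price")]),
    # Search rule: in_vienna(N, P) :- offer(N, "Vienna", P).
    (("in_vienna", "?name", "?price"),
     [("offer", "?name", "Vienna", "?price")]),
]

result = evaluate(rules, facts)
offers = sorted(f for f in result if f[0] == "offer")
in_vienna = sorted(f for f in result if f[0] == "in_vienna")
```

In this sketch the first rule plays the role of an integration rule, joining records from two sources on the hotel name, and the second acts as a search over the integrated relation. Because rules are plain data, the same rule list could be applied again to a later set of extracted facts, which is one way to picture the reuse of integration rules described above.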