Recently, the Web has been rapidly "deepened" by many searchable online databases, whose data are hidden behind query forms. For modeling and integrating Web databases, the very first challenge is to understand what a query interface says, that is, what query capabilities a source supports. Such automatic extraction of interface semantics is challenging because query forms are created autonomously. Our approach builds on the observation that, across myriad sources, query forms seem to reveal some "concerted structure" by sharing common building blocks. Following this insight, we hypothesize the existence of a hidden syntax that guides the creation of query interfaces, even across different sources. This hypothesis effectively transforms query interfaces into a visual language with a non-prescribed grammar and, thus, their semantic understanding into a parsing problem. Such a paradigm enables principled solutions both for declaratively representing common patterns, by a derived grammar, and for systematically interpreting query forms, by a global parsing mechanism. To realize this paradigm, we must address the challenges of a hypothetical syntax: it must be derived, and it is secondary to the input. At the heart of our form extractor, we thus develop a 2P grammar and a best-effort parser, which together realize a parsing mechanism for a hypothetical syntax. Our experiments show the promise of this approach: it achieves above 85% accuracy for extracting query conditions across random sources.
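
To make the idea of treating a query form as a visual language more concrete, the following is a minimal Python sketch of best-effort parsing over form tokens. It is not the 2P grammar or the parser described above. The token kinds, the two layout patterns (a label to the left of a widget, a label above a widget), the rule that prefers the left label, and all names such as `Token`, `Condition`, and `best_effort_parse` are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Token:
    """A visual element of a form (hypothetical, simplified model)."""
    kind: str        # "text", "textbox", or "select"
    row: int         # visual row in the rendered form
    col: int         # left-to-right position within the row
    value: str = ""  # label text for "text" tokens


@dataclass(frozen=True)
class Condition:
    """A query condition: an attribute label bound to an input widget."""
    attribute: str
    widget: str


def left_label(tokens: List[Token], widget: Token, used: Set[Token]) -> Optional[Token]:
    """Pattern 1 (assumed): nearest unused text to the left on the same row."""
    found = [t for t in tokens
             if t.kind == "text" and t.row == widget.row
             and t.col < widget.col and t not in used]
    return max(found, key=lambda t: t.col, default=None)


def above_label(tokens: List[Token], widget: Token, used: Set[Token]) -> Optional[Token]:
    """Pattern 2 (assumed): nearest unused text above in the same column."""
    found = [t for t in tokens
             if t.kind == "text" and t.col == widget.col
             and t.row < widget.row and t not in used]
    return max(found, key=lambda t: t.row, default=None)


def best_effort_parse(tokens: List[Token]) -> Tuple[List[Condition], List[Token]]:
    """Group tokens into conditions; unmatched tokens are returned, not errors."""
    used: Set[Token] = set()
    conditions: List[Condition] = []
    widgets = sorted((t for t in tokens if t.kind != "text"),
                     key=lambda t: (t.row, t.col))
    for widget in widgets:
        # Assumed preference: a left label wins over a label above.
        label = left_label(tokens, widget, used) or above_label(tokens, widget, used)
        if label is None:
            continue  # best effort: skip what no pattern explains
        used.update({label, widget})
        conditions.append(Condition(label.value, widget.kind))
    leftovers = [t for t in tokens if t not in used]
    return conditions, leftovers


# A made-up form: two labeled text boxes, one select labeled from above,
# and one stray text box that no pattern can interpret.
tokens = [
    Token("text", 0, 0, "Author"), Token("textbox", 0, 1),
    Token("text", 1, 0, "Title"), Token("textbox", 1, 1),
    Token("text", 2, 1, "Format"), Token("select", 3, 1),
    Token("textbox", 5, 3),
]
conditions, leftovers = best_effort_parse(tokens)
for c in conditions:
    print(c.attribute, "->", c.widget)
```

The point of the sketch is the failure mode: a form that does not fully conform to the assumed patterns still yields the conditions that can be recognized, and the unexplained tokens are reported separately instead of aborting the parse.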