The Lixto project: exploring new frontiers of Web data extraction
BNCOD'06 Proceedings of the 23rd British National Conference on Databases, conference on Flexible and Efficient Information Handling
We present the Lixto project, which is both a research project in database theory and a commercial enterprise that develops Web data extraction (wrapping) and Web service definition software. We discuss the project's main motivations and ideas, in particular the use of a logic-based framework for wrapping. We then present theoretical results on monadic datalog over trees and on Elog, its close relative, which is used as the internal wrapper language in the Lixto system. These results characterize both the expressive power and the complexity of these languages. We describe the visual wrapper specification process in Lixto and various practical aspects of wrapping. We discuss work on the complexity of query languages for trees that was inspired by our theoretical study of logic-based languages for wrapping. We then return to the practice of wrapping and to the Lixto Transformation Server, which allows for streaming integration of data extracted from Web pages, a natural requirement in complex services based on Web wrapping. Finally, we discuss industrial applications of Lixto and point to open problems for future study.