To facilitate effective search on the World Wide Web, meta search engines have been developed that do not search the Web themselves but instead query existing search engines to find the required information. Meta search engines use wrappers to extract information from the result pages that those search engines return. We present an approach that creates such wrappers automatically by means of an incremental grammar induction algorithm, which uses an adaptation of the string edit distance. Our method performs well: it is quick, it can be used for several types of result pages, and it requires a minimal amount of user interaction.
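For readers unfamiliar with the string edit distance mentioned above, the following is a minimal sketch of the standard (Levenshtein) distance, applied to token sequences rather than characters. It is not the adaptation used in the paper, whose details the abstract does not give. The function name `edit_distance`, the unit costs, and the HTML-like tokens in the usage note are illustrative assumptions.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Standard Levenshtein distance between two sequences.

    Assumption: every insertion, deletion, and substitution costs 1.
    The paper's adapted distance may use different operations or costs.
    """
    # prev[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty sequence
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


# Hypothetical token sequences from two result-page items.
item_a = ["<li>", "<a>", "TEXT", "</a>", "</li>"]
item_b = ["<li>", "<b>", "TEXT", "</b>", "</li>"]
distance = edit_distance(item_a, item_b)
```

In this usage, the two hypothetical items differ only in one pair of tags, so a small distance would suggest that they follow the same layout pattern. A wrapper induction method could exploit that kind of similarity when generalizing over result items.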