In this paper, we describe an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a few example objects from the user and to use them to extract new objects from new pages or texts. To extract the new objects, we introduce a bottom-up extraction strategy and demonstrate through experimentation that it works effectively with distinct Web sources, even when the user provides only a few examples.
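
The abstract gives no algorithmic detail, so the following is only a minimal sketch of what example-based, bottom-up extraction could look like, not the method described in the paper. It assumes that each attribute value is delimited by a short, fixed textual context, and that a repeated attribute marks the start of a new object. All function names (`learn_contexts`, `extract_atoms`, `assemble_objects`), the `width` parameter, and the sample pages are hypothetical.

```python
import re

def learn_contexts(sample_text, example, width=3):
    """Record the characters just before and after each example value.

    Assumption: a fixed-width context is enough to recognise an attribute.
    """
    contexts = {}
    for attr, value in example.items():
        start = sample_text.find(value)
        if start < 0:
            continue  # the example value does not occur in the sample; skip it
        end = start + len(value)
        contexts[attr] = (sample_text[max(0, start - width):start],
                          sample_text[end:end + width])
    return contexts

def extract_atoms(text, contexts):
    """Bottom step: find atomic attribute values, ordered by position."""
    atoms = []
    for attr, (prefix, suffix) in contexts.items():
        pattern = re.escape(prefix) + r"(.+?)" + re.escape(suffix)
        for match in re.finditer(pattern, text):
            atoms.append((match.start(1), attr, match.group(1)))
    return sorted(atoms)

def assemble_objects(atoms):
    """Upward step: group neighbouring atoms into objects.

    Assumption: seeing an attribute again means a new object has started.
    """
    objects, current = [], {}
    for _, attr, value in atoms:
        if attr in current:
            objects.append(current)
            current = {}
        current[attr] = value
    if current:
        objects.append(current)
    return objects

# Hypothetical data: one user-supplied example, then a new page to extract from.
sample = "<li><b>Dune</b> by <i>Frank Herbert</i></li>"
example = {"title": "Dune", "author": "Frank Herbert"}
new_page = ("<li><b>Neuromancer</b> by <i>William Gibson</i></li>"
            "<li><b>Ubik</b> by <i>Philip K. Dick</i></li>")

contexts = learn_contexts(sample, example)
books = assemble_objects(extract_atoms(new_page, contexts))
print(books)
```

The sketch finds atomic values first and only then assembles them into objects, which is the sense in which it is bottom-up. A single example suffices here only because the sample markup is regular; real Web sources would need more robust context patterns.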