Currently, the Web contains a large amount of interesting data implicitly available on pages at various sites, including digital libraries and on-line stores. Researchers regard these data-rich pages as "data containers," because they contain useful, semistructured data. Such data is not readily available through conventional Web search tools, however, as it is typically identifiable only indirectly through visual clues such as colors, fonts, bullets, and indentations. Further, the underlying flexibility of both the content and format creates structural variations and irregularities that challenge traditional data management systems. Even though data structuring standards such as XML are likely to gain in popularity, that fact does not address the existing (and still growing) volume of semistructured Web data available, for instance, on HTML pages.