The Debye Environment for Web Data Management

Authors:
Alberto H. F. Laender;Altigran S. da Silva;Paolo B. Golgher;Berthier Ribeiro-Neto;Irna M. R. Evangelista-Filha;Karine V. Magalhães
Venue:
IEEE Internet Computing
Year:
2002

Abstract

Currently, the Web contains a large amount of interesting data implicitly available on pages at various sites, including digital libraries and on-line stores. Researchers regard these data-rich pages as "data containers," because they contain useful, semistructured data. Such data is not readily available through conventional Web search tools, however, as it is typically identifiable only indirectly through visual clues such as colors, fonts, bullets, and indentations. Further, the underlying flexibility of both the content and format creates structural variations and irregularities that challenge traditional data management systems. Even though data structuring standards such as XML are likely to gain in popularity, that fact does not address the existing (and still growing) volume of semistructured Web data available, for instance, on HTML pages.