A huge amount of data is available today on the Internet and on the private intranets of many companies. This data is structured in a multitude of ways. At one extreme we find data coming from traditional relational or object-oriented databases, with a completely known structure. At the other extreme we have data that is fully unstructured, such as images, sounds, and raw text. But most of the data falls somewhere in between these two extremes, for a variety of reasons: the data may be structured, but the structure is not known to the user; the user may know the structure but choose to ignore it, for browsing purposes; the structure may be implicit, as in formatted text, and not as rigid and regular as in traditional databases; the data may be in non-traditional formats, such as the ASN.1 exchange format; or the schema of the data may be huge and change often, so that we may prefer to ignore it. Several researchers have recently worked on problems related to data fitting this description, and have coined the term semistructured data for it. Two recent tutorials [Abi97, Bun97] contain an excellent introduction to semistructured data and a comprehensive bibliography on this new research topic.