Web data management

Authors:
Michael J. Cafarella; Alon Y. Halevy
Affiliations:
University of Michigan, Ann Arbor, MI, USA; Google, Inc., Mountain View, CA, USA
Venue:
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Year:
2011

Abstract

Web Data Management (WDM) refers to a body of work concerned with leveraging the large collections of structured data that can be extracted from the Web. Over the past few years, several research and commercial efforts have explored these collections with the goal of improving Web search and developing mechanisms for surfacing different kinds of search answers. This work has leveraged (1) collections of structured data such as HTML tables, lists and forms; (2) recent ontologies and knowledge bases created by crowd-sourcing, such as Wikipedia and its derivatives, DBpedia, YAGO and Freebase; and (3) the collection of text documents on the Web, from which facts can be extracted in a domain-independent fashion. The promise of this line of work rests on the observation that new kinds of results can be obtained by leveraging a huge collection of independently created fragments of data, typically in ways that are wholly unrelated to the authors' original intent. For example, we might use many database schemas to compute a schema thesaurus, or examine many spreadsheets of scientific data to reveal the aggregate practice of an entire scientific field. As such, WDM is tightly linked to Web-enabled collaboration, even (or especially) if the collaborators are unwitting ones. We will cover the key techniques, principles and insights obtained so far in the area of Web Data Management.
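
The schema-thesaurus example mentioned in the abstract can be illustrated with a small sketch. This is not the method presented in the tutorial. It assumes one simple heuristic: two attribute names are candidate synonyms when they never appear in the same schema yet appear alongside similar sets of other attributes. The function name `build_thesaurus`, the `min_score` threshold, and the toy schemas are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def build_thesaurus(schemas, min_score=0.5):
    """Suggest attribute synonyms from a corpus of schemas (illustrative heuristic)."""
    # context[a][c] counts the schemas in which attribute a appears with attribute c.
    context = defaultdict(Counter)
    # together[(a, b)] counts the schemas containing both a and b (a < b).
    together = Counter()
    for schema in schemas:
        attrs = sorted(set(schema))
        for a, b in combinations(attrs, 2):
            context[a][b] += 1
            context[b][a] += 1
            together[(a, b)] += 1

    def cosine(u, v):
        # Cosine similarity between two sparse co-occurrence vectors.
        dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
        norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
            sum(x * x for x in v.values())
        )
        return dot / norm if norm else 0.0

    thesaurus = defaultdict(list)
    for a, b in combinations(sorted(context), 2):
        if together[(a, b)] > 0:
            # Assumption: names used side by side in one schema are not synonyms.
            continue
        score = cosine(context[a], context[b])
        if score >= min_score:
            thesaurus[a].append((b, score))
            thesaurus[b].append((a, score))
    return dict(thesaurus)


# Toy corpus of independently authored schemas (made up for this sketch).
schemas = [
    ["name", "address", "phone"],
    ["name", "address", "telephone"],
    ["name", "address", "email", "phone"],
    ["name", "address", "email", "telephone"],
    ["make", "model", "year"],
]

for attribute, candidates in build_thesaurus(schemas).items():
    print(attribute, candidates)
```

On this toy corpus the only pair suggested is `phone` and `telephone`. The heuristic is deliberately crude: attributes that are merely interchangeable in a schema, rather than synonymous, can also score highly, so a realistic system would need a much larger corpus and additional evidence.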