Discovering interesting information with advances in web technology

  • Authors:
  • Richi Nayak;Pierre Senellart;Fabian M. Suchanek;Aparna S. Varde

  • Affiliations:
  • Queensland University of Technology, Brisbane, Australia;Institut Mines-Télécom / Télécom ParisTech / CNRS LTCI, Paris, France;Max Planck Institute for Informatics, Saarbrücken, Germany;Montclair State University, NJ, USA

  • Venue:
  • ACM SIGKDD Explorations Newsletter
  • Year:
  • 2013

Abstract

The Web is a steadily evolving resource comprising much more than mere HTML pages. With its ever-growing data sources in a variety of formats, it provides great potential for knowledge discovery. In this article, we shed light on some interesting phenomena of the Web: the deep Web, which surfaces database records as Web pages; the Semantic Web, which defines meaningful data exchange formats; XML, which has established itself as a lingua franca for Web data exchange; and domain-specific markup languages, which are designed based on XML syntax with the goal of preserving semantics in targeted domains. We detail these four developments in Web technology, and explain how they can be used for data mining. Our goal is to show that all these areas can be as useful for knowledge discovery as the HTML-based part of the Web.