Database researchers: plumbers or thinkers?

  • Authors: Gerhard Weikum
  • Affiliations: Max Planck Institute for Informatics, Saarbrücken, Germany
  • Venue: Proceedings of the 14th International Conference on Extending Database Technology
  • Year: 2011

Abstract

DB researchers have traditionally focused on engine-centered issues such as indexing, query processing, and transactions. Data mining has broadened the community's viewpoint towards algorithmic and statistical issues. However, DB research has always tended to shy away from seemingly elusive long-term challenges with an AI flavor. On the other hand, the current explosion of digital content in enterprises and on the Internet is mostly caused by user-created information such as text, tags, photos, and videos, and not by a growing number of well-designed databases of the traditional kind. In this situation, I question the traditional skepticism of DB researchers towards "AI-complete" problems and the DB community's reluctance to embark on seemingly non-DB-ish grand challenges. Big questions that I see as great opportunities for DB research, too, include: 1) automatic extraction of relational facts from natural-language text and multimodal contexts [4, 6, 21], 2) automatic disambiguation of named-entity mentions and general phrases in text and speech [10, 11], 3) large-scale gathering of factual-knowledge candidates and their reconciliation into comprehensive knowledge bases [1, 2, 8, 13, 19], 4) reasoning on uncertain hypotheses, for knowledge discovery and semantic search [9, 14, 16, 17, 20], 5) deep and real-time question answering, e.g., to enable computers to win quiz game shows [7], 6) machine-reading of scientific publications and fictional literature, to enable corpus-wide analyses and help researchers in science and the humanities develop hypotheses and quickly focus on the most relevant issues [3, 5]. I believe that successfully tackling these topics requires efficient data-centric algorithms, scalable methods and architectures, and system-level thinking, virtues that are richly available in the DB research community.
Moreover, I would encourage our community to look across the fence and become more engaged with the exciting challenges outside the traditionally narrow boundaries of the DB realm. I will illustrate these points with examples from my own research on knowledge management [12, 15, 18, 19]. Breakthroughs will require long-term stamina. In the meantime, steady incremental progress is better than not embarking on these important problems at all.