P-top-k queries in a probabilistic framework from information extraction models

  • Authors:
  • Ming He; Yong-Ping Du

  • Venue:
  • Computers & Mathematics with Applications
  • Year:
  • 2011

Abstract

Many applications today, such as information extraction (IE), data integration, sensor and RFID networks, and scientific experiments, need to manage uncertain data. Top-k queries are a natural and useful tool for analyzing uncertain data in these applications. In this paper, we study the problem of answering top-k queries in a probabilistic framework built on a state-of-the-art statistical IE model, semi-conditional random fields (semi-CRFs), in the setting of probabilistic databases that treat statistical models as first-class data objects. We investigate how to rank the answers to probabilistic database queries, and we present an efficient algorithm that finds the best approximating parameters in this framework so that the top-k ranked results can be retrieved quickly. An empirical study on real data sets demonstrates the effectiveness of probabilistic top-k queries and the efficiency of our method.
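
To make the notion of a probabilistic top-k query concrete, the following minimal Python sketch ranks candidate answers by their total probability and returns the k most likely ones. It is an illustration only, not the algorithm from the paper: the function name `p_top_k`, the `(answer, probability)` input format, and the sample values are all assumptions made for this example, and the probabilities are supplied by hand rather than derived from a semi-CRF.

```python
import heapq
from collections import defaultdict


def p_top_k(candidates, k):
    """Return the k answers with the highest total probability.

    `candidates` is an iterable of (answer, probability) pairs, one per
    hypothetical extraction. The same answer may appear several times,
    in which case its probabilities are summed before ranking.
    """
    totals = defaultdict(float)
    for answer, prob in candidates:
        totals[answer] += prob
    # nlargest returns the items sorted from most to least probable.
    return heapq.nlargest(k, totals.items(), key=lambda item: item[1])


# Hypothetical extraction results (values chosen for illustration only).
candidates = [
    ("Boston", 0.5),
    ("Austin", 0.25),
    ("Boston", 0.125),
    ("Denver", 0.125),
]
top2 = p_top_k(candidates, 2)
print(top2)
```

Summing the probabilities of duplicate answers before ranking reflects the idea that an answer supported by several uncertain extractions should rank by its combined likelihood; a real system would obtain those probabilities from the underlying statistical model.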