Getting critical categories of a data set

  • Authors:
  • Cheqing Jin; Yizhen Zhang; Aoying Zhou

  • Affiliations:
  • Shanghai Key Laboratory of Trustworthy Computing, Software Engineering Institute, East China Normal University, China (all authors)

  • Venue:
  • WAIM'11 Proceedings of the 12th international conference on Web-age information management
  • Year:
  • 2011

Abstract

Ranking queries, which are widely used in various applications, are a fundamental kind of query in database management. Most existing work on ranking queries focuses on retrieving the top-k highest-scoring tuples from a data set. This paper instead focuses on retrieving the top-k critical categories from a data set, where each category is a data item in a nominal attribute or a combination of data items from more than one nominal attribute. To describe each category precisely, we represent it by a data distribution derived from the score attribute, so that the set of all categories can be treated as a probabilistic data set. We devise a novel method for this problem. Theoretical analysis and experimental results show the effectiveness and efficiency of the proposed method.
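
As a rough illustration of the data model described in the abstract, the Python sketch below groups tuples into categories (single nominal items and combinations across attributes) and attaches to each category an empirical distribution over the score attribute. The abstract does not describe the ranking method itself, so the ranking step here uses expected score purely as a stand-in assumption; it is not the paper's algorithm. The function names, attribute names, and sample rows are all hypothetical.

```python
from collections import Counter, defaultdict
from fractions import Fraction
from itertools import combinations


def category_distributions(rows, nominal_attrs, score_attr):
    """Map each category to an empirical score distribution {score: probability}."""
    counts = defaultdict(Counter)
    for row in rows:
        # A category is one data item of a nominal attribute, or a combination
        # of data items taken from more than one nominal attribute.
        for size in range(1, len(nominal_attrs) + 1):
            for attrs in combinations(nominal_attrs, size):
                category = tuple((a, row[a]) for a in attrs)
                counts[category][row[score_attr]] += 1
    dists = {}
    for category, counter in counts.items():
        total = sum(counter.values())
        # Exact fractions keep the probabilities summing to exactly 1.
        dists[category] = {s: Fraction(c, total) for s, c in counter.items()}
    return dists


def top_k_by_expected_score(dists, k):
    """Rank categories by expected score (assumption: a stand-in criterion only)."""
    expected = {c: sum(s * p for s, p in d.items()) for c, d in dists.items()}
    # Ties are broken by the category tuple to keep the order deterministic.
    return sorted(expected, key=lambda c: (-expected[c], c))[:k]


# Hypothetical sample data: two nominal attributes and one score attribute.
rows = [
    {"brand": "A", "region": "east", "score": 9},
    {"brand": "A", "region": "west", "score": 2},
    {"brand": "B", "region": "east", "score": 5},
    {"brand": "B", "region": "east", "score": 3},
]
dists = category_distributions(rows, ["brand", "region"], "score")
top2 = top_k_by_expected_score(dists, 2)
```

Because every category carries a full distribution rather than a single value, other criteria defined on probabilistic data could replace the expected-score ranking without changing how the distributions are built.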