Extracting dimensions for OLAP on multidimensional text databases

  • Authors:
  • Chao Zhang;Xinjun Wang;Zhaohui Peng

  • Affiliations:
  • School of Computer Science and Technology, Shandong University, Jinan, China;School of Computer Science and Technology, Shandong University, Jinan, China;School of Computer Science and Technology, Shandong University, Jinan, China

  • Venue:
  • WISM'11 Proceedings of the 2011 international conference on Web information systems and mining - Volume Part II
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

With the amount of textual information massively growing in various kinds of business systems and Internet, there are increasingly demands for analyzing both structured data and unstructured text data. Online Analysis Processing (OLAP) is effective for analyzing and mining structured data. However, while handling with unstructured data, it is powerless. After working on several information integration and data analysis applications, we have realized the defect of OLAP on text data analysis and use technical ways to handle this issue. In this paper, we propose a semi-supervised algorithm to extract dimensions and their members from textual information for the purpose of analyzing a huge set of textual data. We use straightforward measures to express analysis results. Experiment result shows that the extracting algorithm is valid and our approach has a high scalability and flexibility.