Extraction of interesting financial information from heterogeneous XML-Based data

  • Authors:
  • Juryon Paik;Young Ik Eom;Ung Mo Kim

  • Affiliations:
  • Department of Computer Engineering, Sungkyunkwan University, Gyeonggi-do, Republic of Korea;Department of Computer Engineering, Sungkyunkwan University, Gyeonggi-do, Republic of Korea;Department of Computer Engineering, Sungkyunkwan University, Gyeonggi-do, Republic of Korea

  • Venue:
  • ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part IV
  • Year:
  • 2006

Abstract

XML is becoming the main language for exchanging financial information between businesses over the Internet. As more banks and financial institutions move to electronic information exchange and reporting, the financial world is flooded with information. Given the sheer amount of financial information stored, presented, and exchanged using XML-based standards, the ability to extract interesting knowledge from these data sources, in order to better understand customer buying and selling behavior and upward and downward trends in the stock market, is increasingly important and desirable. Hence, there is growing demand for efficient methods of discovering valuable information in large collections of XML-based data. One of the most popular approaches to finding useful information is to mine frequently occurring tree patterns. In this paper, we propose a novel algorithm, FIXiT, for efficiently extracting maximal frequent subtrees from a set of XML-based documents. The main contributions of our algorithm are that (1) it classifies the available financial XML standards, such as FIXML, FpML, and XBRL, among others, with respect to their specifications, and (2) it does not need to perform tree join operations while generating maximal frequent subtrees.
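
To make the notion of a "maximal frequent" pattern concrete, the following is a minimal sketch and not the FIXiT algorithm, whose details are not given in the abstract. As a simplifying assumption it mines root-to-node tag paths rather than general subtrees: a path is frequent if it occurs in at least `min_support` documents, and maximal if no longer frequent path extends it. The function names, the sample documents, and their tag names are hypothetical and are not taken from FIXML, FpML, or XBRL.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def label_paths(xml_text):
    """Return the set of root-to-node tag paths that occur in one document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + (child.tag,)))
    return paths


def maximal_frequent_paths(docs, min_support):
    """Paths found in at least min_support documents that no frequent path extends."""
    support = Counter()
    for doc in docs:
        # A set is used per document, so each document counts once per path.
        support.update(label_paths(doc))
    frequent = {p for p, n in support.items() if n >= min_support}
    return sorted(
        p for p in frequent
        if not any(len(q) > len(p) and q[:len(p)] == p for q in frequent)
    )


# Hypothetical sample documents (tag names are illustrative only).
DOCS = [
    "<order><instrument><symbol/></instrument><price/></order>",
    "<order><instrument><symbol/><isin/></instrument></order>",
    "<order><price/><quantity/></order>",
]

if __name__ == "__main__":
    for path in maximal_frequent_paths(DOCS, min_support=2):
        print("/".join(path))
```

With `min_support=2`, the sketch reports `order/instrument/symbol` and `order/price`; the shorter paths `order` and `order/instrument` are frequent but not maximal, because a longer frequent path extends each of them. Paths keep the sketch short; general subtree patterns would also capture sibling structure, which this simplification ignores.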