Generalized regression model for sequence matching and clustering

  • Authors:
  • Hansheng Lei; Venu Govindaraju

  • Affiliations:
  • University of Texas at Brownsville, Department of Computer Science and Computer Information Systems, 78520, Brownsville, TX, USA; The State University of New York at Buffalo, Computer Science and Engineering Department, 78520, Amherst, NY, USA

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2007

Abstract

Linear relations have been found to be valuable in rule discovery for stocks, for example: if stock X goes up by a, stock Y will go down by b. Traditional linear regression models the linear relation between two sequences faithfully. However, if a user needs to cluster stocks into groups whose sequences have high linearity or similarity with each other, comparing sequences pair by pair is prohibitively expensive. In this paper, we present the generalized regression model (GRM), which matches the linearity of multiple sequences at a time. GRM also gives strong heuristic support for graceful and efficient clustering. In experiments on stocks in the NASDAQ market, GRM efficiently mined interesting clusters of stock trends.
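
The pairwise baseline that the abstract contrasts with GRM can be illustrated with a minimal sketch. This is not GRM itself, whose formulation the abstract does not give; it is ordinary least-squares regression applied to every pair of sequences, using the coefficient of determination (R²) as an assumed measure of linearity. The function names `linear_fit` and `pairwise_r2` and the toy price series are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def linear_fit(x, y):
    """Least-squares fit of y on x; returns (slope, intercept, r2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined fit quality")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r2


def pairwise_r2(sequences):
    """Regress every pair of named sequences: n * (n - 1) / 2 fits in total."""
    return {
        (a, b): linear_fit(sequences[a], sequences[b])[2]
        for a, b in combinations(sorted(sequences), 2)
    }


# Hypothetical toy series, made up for illustration (not data from the paper).
prices = {
    "X": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Y": [10.0, 8.0, 6.0, 4.0, 2.0],  # Y = 12 - 2 * X: falls 2 per unit rise in X
    "Z": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly linear in X
}
scores = pairwise_r2(prices)
```

Here the fit of Y on X recovers the rule "if X goes up by 1, Y goes down by 2" with a perfect R², while Z scores low against both. The number of fits grows quadratically with the number of sequences, which is the cost the abstract describes as prohibitive for clustering a whole market.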