Mining Impact-Targeted Activity Patterns in Imbalanced Data

  • Authors:
  • Longbing Cao;Yanchang Zhao;Chengqi Zhang

  • Affiliations:
  • IEEE;IEEE;IEEE

  • Venue:
  • IEEE Transactions on Knowledge and Data Engineering
  • Year:
  • 2008

Abstract

Impact-targeted activities are rare but can have a significant impact on society; for example, isolated terrorist activities may lead to a disastrous event that threatens national security. Similar issues arise in many other areas. It is therefore important to identify such activities before they cause significant impact. However, mining impact-targeted activity patterns is challenging because the underlying data is imbalanced. This paper develops techniques for discovering such activity patterns. First, the complexities of mining imbalanced impact-targeted activities are analyzed. We then discuss strategies for constructing impact-targeted activity sequences. Algorithms are developed to mine frequent positive-impact (P → T) and negative-impact (P → $\bar{T}$) oriented activity patterns, sequential impact-contrasted activity patterns (P is frequently associated with both P → T and P → $\bar{T}$ in separated data sets), and sequential impact-reversed activity patterns (both P → T and PQ → $\bar{T}$ are frequent). Activity impact modelling is also studied to quantify pattern impact on business outcomes. Social security debt-related activity data is used to test the proposed approaches. The outcomes show that the approaches are promising for intelligence and security informatics (ISI) applications that identify impact-targeted activity patterns in imbalanced data.
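
The pattern types named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that activity sequences have already been split into a target set (outcome T) and a non-target set (outcome $\bar{T}$), that a pattern matches a sequence when it occurs as an ordered, not necessarily contiguous, subsequence, and that support is measured as a fraction within each separated set. The function names, the toy data, and the threshold `min_sup` are all hypothetical.

```python
from typing import Sequence, Tuple

Pattern = Tuple[str, ...]


def contains(seq: Sequence[str], pattern: Pattern) -> bool:
    """True if `pattern` occurs in `seq` as an ordered subsequence."""
    it = iter(seq)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(sequences: Sequence[Sequence[str]], pattern: Pattern) -> float:
    """Fraction of sequences in one (separated) data set that contain `pattern`."""
    if not sequences:
        return 0.0
    return sum(contains(s, pattern) for s in sequences) / len(sequences)


def is_impact_contrasted(target, nontarget, p: Pattern, min_sup: float) -> bool:
    """Assumed reading: P is frequent in both the T and the not-T data sets."""
    return support(target, p) >= min_sup and support(nontarget, p) >= min_sup


def is_impact_reversed(target, nontarget, p: Pattern, q: Pattern,
                       min_sup: float) -> bool:
    """Assumed reading: P -> T and PQ -> not-T are both frequent."""
    pq = tuple(p) + tuple(q)
    return support(target, p) >= min_sup and support(nontarget, pq) >= min_sup


# Toy activity sequences (hypothetical activity codes).
target_seqs = [("a", "b", "c"), ("a", "b"), ("b", "a", "d")]
nontarget_seqs = [("a", "b", "d"), ("a", "b", "d", "c"), ("c", "d")]

contrasted = is_impact_contrasted(target_seqs, nontarget_seqs, ("a", "b"), 0.5)
reversed_ = is_impact_reversed(target_seqs, nontarget_seqs, ("a", "b"), ("d",), 0.5)
print(contrasted, reversed_)
```

In this toy data the pattern (a, b) appears in two of three sequences in each set, and appending d yields a pattern that appears only in the non-target set, so adding activity d reverses the associated outcome under the assumed definitions.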