Mine your own business, mine others' news!

Authors:
Quang-Khai Pham;Regis Saint-Paul;Boualem Benatallah;Noureddine Mouaddib;Guillaume Raschia
Affiliations:
University of New South Wales, Sydney, NSW, Australia and LINA at University of Nantes, Nantes, France;University of New South Wales, Sydney, NSW, Australia;University of New South Wales, Sydney, NSW, Australia;LINA at University of Nantes, Nantes, France;LINA at University of Nantes, Nantes, France
Venue:
EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Year:
2008

Abstract

Major media companies such as The Financial Times, the Wall Street Journal or Reuters generate huge amounts of textual news data on a daily basis. Mining frequent patterns in this mass of information is critical for knowledge workers such as financial analysts, stock traders or economists. Using existing frequent pattern mining (FPM) algorithms to analyze news data is difficult because of the size and lack of structure of free-text news content. In this article, we demonstrate a comprehensive Streaming TEmporAl Data (STEAD) analysis framework for mining frequent patterns in financial news. In this demonstration, we show how the mining task is supported by a Time-Aware Content Summarization (TACS) algorithm. The summarization generates a concise representation of a large volume of data by taking into account the expert's particular interests while preserving the temporal information about news arrival, which is essential for FPM algorithms. We tested the whole framework on a set of news data from Reuters.