Mining temporal patterns in popularity of web items

Authors:
Woong-Kee Loh; Sandeep Mane; Jaideep Srivastava
Affiliations:
Department of Multimedia, Sungkyul University, Republic of Korea; Department of Computer Science & Engineering, University of Minnesota, USA; Department of Computer Science & Engineering, University of Minnesota, USA
Venue:
Information Sciences: an International Journal
Year:
2011

Abstract

Huge numbers of web items (e.g., images, keywords, and web pages) are being made available on the Web. The popularity of such web items changes continuously over time, and mining temporal patterns in that popularity is an important problem with uses in several Web applications. For example, temporal patterns in the popularity of web search keywords help web search enterprises predict which keywords will become popular, enabling them to make pricing decisions when marketing search keywords to advertisers. However, the presence of millions of web items makes it difficult to scale up previous techniques for this problem. This paper proposes an efficient method for mining temporal patterns in the popularity of web items. We treat the popularity of web items as time series and propose a novel measure, the gap measure, to quantify the dissimilarity between the popularity of two web items. To reduce the computational overhead of this measure, we present an efficient method based on the Discrete Fourier Transform (DFT). We assume that the popularity of web items is not necessarily periodic. To find clusters of web items with similar popularity trends, we show the limitations of traditional clustering approaches and propose a scalable, efficient, density-based clustering algorithm that uses the gap measure. Our experiments using the popularity trends of web search keywords obtained from the Google Trends web site illustrate the scalability and usefulness of the proposed approach in real-world applications.
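
The abstract does not define the gap measure, so the following is only a rough sketch of the general idea of comparing popularity time series in the frequency domain. It uses plain Euclidean distance as a stand-in, not the authors' measure. With a unitary DFT, the distance between two series is the same in the time and frequency domains (Parseval's theorem), and keeping only the first k coefficients gives a cheaper lower bound. The function names and the weekly counts are hypothetical and chosen for illustration.

```python
import cmath
import math


def dft(series):
    """Naive O(n^2) unitary DFT; scaling by 1/sqrt(n) preserves Euclidean norms."""
    n = len(series)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                    for t, x in enumerate(series))
        for f in range(n)
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length real or complex sequences."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))


def truncated_dft_distance(a, b, k):
    """Distance over the first k DFT coefficients: a lower bound on euclidean(a, b)."""
    return euclidean(dft(a)[:k], dft(b)[:k])


# Hypothetical weekly popularity counts for two keywords (made-up numbers,
# not data from the paper or from Google Trends).
keyword_a = [10, 12, 15, 30, 55, 30, 15, 12]
keyword_b = [11, 12, 14, 28, 60, 33, 14, 11]

exact = euclidean(keyword_a, keyword_b)            # time-domain distance
full = euclidean(dft(keyword_a), dft(keyword_b))   # same value via Parseval
bound = truncated_dft_distance(keyword_a, keyword_b, k=3)  # never exceeds exact
```

Because the truncated distance never overestimates the exact one, it can be used to discard clearly dissimilar pairs before computing the full distance. A production version would replace the naive transform with an FFT.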