Huge amounts of various web items (e.g., images, keywords, and web pages) are being made available on the Web. The popularity of such web items changes continuously over time, and mining temporal patterns in that popularity is an important problem for several Web applications. For example, temporal patterns in the popularity of web search keywords help web search enterprises predict future popular keywords, enabling them to make pricing decisions when marketing search keywords to advertisers. However, the presence of millions of web items makes it difficult to scale up previous techniques for this problem. This paper proposes an efficient method for mining temporal patterns in the popularity of web items. We treat the popularity of web items as time series and propose a novel measure, the gap measure, to quantify the dissimilarity between the popularity of two web items. To reduce the computational overhead of this measure, we present an efficient method using the Discrete Fourier Transform (DFT). We assume that the popularity of web items is not necessarily periodic. For clustering web items with similar popularity trends, we show the limitations of traditional clustering approaches and propose a scalable, efficient, density-based clustering algorithm using the gap measure. Our experiments using the popularity trends of web search keywords obtained from the Google Trends web site illustrate the scalability and usefulness of the proposed approach in real-world applications.
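The abstract does not define the gap measure, so the following is only a minimal sketch of the general idea behind DFT-based speedups, not the paper's method. As a stand-in dissimilarity it assumes the squared Euclidean distance between one series and every circular shift of another, which a naive loop computes in O(n^2) and the DFT reduces to O(n log n). The function names `fft` and `circular_sq_distances` are hypothetical, and the input length is assumed to be a power of two.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def circular_sq_distances(a, b):
    """Squared Euclidean distance between a and every circular shift of b.

    d[s] = sum_t (a[t] - b[(t + s) % n])**2
         = sum(a**2) + sum(b**2) - 2 * c[s],
    where c[s] = sum_t a[t] * b[(t + s) % n] is the circular
    cross-correlation, obtained as IDFT(conj(DFT(a)) * DFT(b)).
    Assumption: a and b are real-valued, equal length, power-of-two length.
    """
    n = len(a)
    fa = fft([complex(v) for v in a])
    fb = fft([complex(v) for v in b])
    prod = [x.conjugate() * y for x, y in zip(fa, fb)]
    # The inverse transform above is unnormalized, so divide by n here.
    c = [v.real / n for v in fft(prod, inverse=True)]
    energy = sum(v * v for v in a) + sum(v * v for v in b)
    return [energy - 2 * cs for cs in c]


if __name__ == "__main__":
    # Two toy "popularity" series; b is a rotated by one position.
    a = [1.0, 2.0, 3.0, 4.0]
    b = [2.0, 3.0, 4.0, 1.0]
    print(circular_sq_distances(a, b))
```

Taking the minimum of the returned list gives a shift-tolerant distance between the two series. Whether the paper's gap measure has this form is not stated in the abstract; the sketch only illustrates why moving to the frequency domain can cut the cost of comparing two time series.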