Space-efficient online computation of quantile summaries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Finding Frequent Items in Data Streams
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Finding Repeated Elements
Medians and beyond: new aggregation techniques for sensor networks
SenSys '04 Proceedings of the 2nd international conference on Embedded networked sensor systems
Approximate counts and quantiles over sliding windows
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
A simpler and more efficient deterministic scheme for finding frequent items over sliding windows
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Methods for finding frequent items in data streams
The VLDB Journal — The International Journal on Very Large Data Bases
Efficient computation of frequent and top-k elements in data streams
ICDT'05 Proceedings of the 10th international conference on Database Theory
Hi-index | 0.03 |
We study the problem of efficient discovery of trending phrases from high-volume text streams -- be they sequences of Twitter messages, email messages, news articles, or other time-stamped text documents. Most existing approaches return top-k trending phrases. But, this approach neither guarantees that the top-k phrases returned are all trending, nor that all trending phrases are returned. In addition, the value of k is difficult to set and is indifferent to stream dynamics. Hence, we propose an approach that identifies all the trending phrases in a stream and is flexible to the changing stream properties.