Web 2.0 streams, such as blog postings, micro-blogging tweets, and RSS feeds from online communities, offer a wealth of up-to-date news about real-world events and societal discussion. From a user's perspective, it is becoming increasingly hard to get a decent overview of recent events, given the massive volume of information that flows in continuously. Ideally, a system would continuously assemble recent information, ranked by its current social impact and weighted by the user's personal interests. In this work, we develop methods to meet these requirements. The presented approach continuously tracks the most popular tags attached to incoming items and, based on these tags, constructs a dynamic top-k query. By continuously evaluating this query on the incoming stream, we retrieve the currently hottest items. These items are then fed into an engine that re-ranks them with respect to user-specified interests, given in the form of term-based topic descriptions. Given the high rate of incoming data and the immense number of user profiles, this calls for high-performance algorithms that efficiently retrieve hot documents and subsequently personalize them based on user profiles. We present a combined solution that builds on our prior work on information filtering and show how it can be used together with the new method for continuously determining the hottest documents. To demonstrate the suitability of our approach, we conduct a performance evaluation on a real-world dataset obtained from a weblog crawl.
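
To make the pipeline described in the abstract more concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a count-based sliding window, a "hotness" score defined as the sum of the window frequencies of an item's hot tags, and personalization by simple term overlap with a profile. The class `HotItemTracker`, the function `personalize`, and all parameter names are hypothetical and chosen for illustration only; the abstract does not specify the actual scoring functions or index structures.

```python
from collections import Counter, deque
import heapq


class HotItemTracker:
    """Toy sliding-window tracker (hypothetical; not the paper's method)."""

    def __init__(self, window_size, num_hot_tags):
        self.window = deque()            # recent items, oldest first
        self.window_size = window_size   # assumed count-based window
        self.num_hot_tags = num_hot_tags
        self.tag_counts = Counter()      # tag frequencies inside the window

    def add(self, item_id, tags, text):
        """Insert a new item; expire the oldest one if the window overflows."""
        tags = frozenset(tags)
        self.window.append((item_id, tags, text))
        self.tag_counts.update(tags)
        if len(self.window) > self.window_size:
            _, old_tags, _ = self.window.popleft()
            for tag in old_tags:
                self.tag_counts[tag] -= 1
                if self.tag_counts[tag] == 0:
                    del self.tag_counts[tag]

    def hot_query(self):
        """The dynamic query: currently most popular tags with their counts."""
        return dict(self.tag_counts.most_common(self.num_hot_tags))

    def hottest(self, k):
        """Evaluate the dynamic query over the window and return the top-k items."""
        query = self.hot_query()
        scored = [
            (item_id, sum(query.get(tag, 0) for tag in tags), text)
            for item_id, tags, text in self.window
        ]
        return heapq.nlargest(k, scored, key=lambda entry: entry[1])


def personalize(hot_items, profile_terms):
    """Re-rank hot items by term overlap with a user profile (assumed scoring)."""
    profile = {term.lower() for term in profile_terms}

    def overlap(entry):
        return len(set(entry[2].lower().split()) & profile)

    # Profile overlap first, hotness as the tie-breaker.
    return sorted(hot_items, key=lambda e: (overlap(e), e[1]), reverse=True)


tracker = HotItemTracker(window_size=4, num_hot_tags=2)
tracker.add("a", ["election", "politics"], "election results announced")
tracker.add("b", ["election", "sports"], "football team comments on election")
tracker.add("c", ["cooking"], "pasta recipe")
tracker.add("d", ["election", "politics"], "politics debate tonight")
ranked_for_user = personalize(tracker.hottest(3), {"football"})
```

In this sketch the hot-tag query is recomputed from scratch on every call, which would not scale to the stream rates and profile counts the abstract has in mind; it only illustrates the order of the three steps (tag tracking, top-k retrieval, per-user re-ranking).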