Identifying time periods with a burst of activity related to a topic is an important problem in analyzing time-stamped documents. In this paper, we propose an approach to extract a hot spot of a given topic in a time-stamped document set. Topics can be basic, consisting of a simple list of keywords, or complex. Complex topics are built from basic topics using the logical operators AND, OR, and NOT. We introduce a presence measure of a topic, based on fuzzy set theory, to compute the amount of information related to the topic in the document set. Each interval in the time period covered by the document set is associated with a numeric value that we call the discrepancy score. A high discrepancy score indicates that the documents inside the time interval are more focused on the topic than those outside it. A hot spot of a given topic is defined as a time interval with the highest discrepancy score. We first describe a naive implementation for extracting hot spots. We then construct an algorithm called EHE (Efficient Hot Spot Extraction), which uses several strategies to improve performance. We also introduce the notion of a topic DAG to support efficient computation of the presence measures of complex topics. The proposed approach is illustrated by several experiments on a subset of the TDT-Pilot Corpus and on a DBLP conference data set. The experiments show that the proposed EHE algorithm significantly outperforms the naive one and that the extracted hot spots of the given topics are meaningful.
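The abstract does not give the formulas, so the following is only a minimal sketch of the naive approach under stated assumptions, not the paper's definitions. It assumes that the presence of a basic topic in a document is the fraction of its keywords that occur in the document, that AND, OR, and NOT are the standard fuzzy min, max, and complement, and that the discrepancy score of an interval is the mean presence inside the interval minus the mean presence outside it. The names `Basic`, `And`, `Or`, `Not`, `presence`, and `naive_hot_spot` are hypothetical. The `cache` argument only loosely mirrors the idea behind a topic DAG: a subtopic shared by several complex topics is evaluated once per document.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple, Union


@dataclass(frozen=True)
class Basic:
    keywords: Tuple[str, ...]  # a basic topic is a list of keywords


@dataclass(frozen=True)
class And:
    left: "Topic"
    right: "Topic"


@dataclass(frozen=True)
class Or:
    left: "Topic"
    right: "Topic"


@dataclass(frozen=True)
class Not:
    child: "Topic"


Topic = Union[Basic, And, Or, Not]


def presence(topic: Topic, words: Set[str],
             cache: Optional[Dict[Topic, float]] = None) -> float:
    """Fuzzy presence of a topic in one document, in [0, 1].

    ASSUMPTION: keyword fraction for basic topics and min/max/complement
    for AND/OR/NOT; the paper's actual measure may differ.
    """
    if cache is None:
        cache = {}
    if topic in cache:  # shared subtopics are computed only once
        return cache[topic]
    if isinstance(topic, Basic):
        hits = sum(1 for k in topic.keywords if k in words)
        value = hits / len(topic.keywords) if topic.keywords else 0.0
    elif isinstance(topic, And):
        value = min(presence(topic.left, words, cache),
                    presence(topic.right, words, cache))
    elif isinstance(topic, Or):
        value = max(presence(topic.left, words, cache),
                    presence(topic.right, words, cache))
    elif isinstance(topic, Not):
        value = 1.0 - presence(topic.child, words, cache)
    else:
        raise TypeError(f"unknown topic type: {type(topic).__name__}")
    cache[topic] = value
    return value


def naive_hot_spot(scores: List[float]) -> Optional[Tuple[int, int, float]]:
    """Try every interval [i, j] (inclusive) and keep the best one.

    scores[t] is the presence of the topic at time step t.
    ASSUMPTION: discrepancy = mean inside - mean outside.
    Returns (start, end, discrepancy), or None if no interval has an outside.
    """
    n = len(scores)
    prefix = [0.0]
    for s in scores:
        prefix.append(prefix[-1] + s)  # prefix sums give O(1) interval sums
    total = prefix[-1]
    best: Optional[Tuple[int, int, float]] = None
    for i in range(n):
        for j in range(i, n):
            length = j - i + 1
            if length == n:
                continue  # the whole period has no "outside" to compare with
            inside = prefix[j + 1] - prefix[i]
            score = inside / length - (total - inside) / (n - length)
            if best is None or score > best[2]:
                best = (i, j, score)
    return best


if __name__ == "__main__":
    topic = And(Basic(("election", "vote")), Not(Basic(("sports",))))
    print(presence(topic, {"election", "results"}))
    print(naive_hot_spot([0.1, 0.2, 0.9, 0.8, 0.1]))
```

This enumeration examines a quadratic number of intervals in the number of time steps, which is the kind of cost an efficient algorithm would aim to reduce; the sketch makes no attempt to reproduce the strategies used by EHE.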