In this work, we study a new text mining problem: discovering named entities whose mention counts exhibit temporally correlated bursts across multiple multilingual Web news streams. Mining such entities has many interesting and important applications, such as identifying latent events that attract the attention of online media in different countries and extracting valuable linguistic knowledge in the form of transliterations. While mining "bursty" terms in a single text stream has been studied before, detecting terms with temporally correlated bursts in multilingual Web streams raises two new challenges: (i) the bursts of correlated terms in different streams may differ in intensity by orders of magnitude, and (ii) the bursts of correlated terms may be separated by time gaps. We propose a two-stage method for mining items with temporally correlated bursts from multiple data streams that addresses both challenges. In the first stage, the temporal behavior of each entity is normalized by modeling it with a Markov-modulated Poisson process. In the second stage, a dynamic programming algorithm discovers correlated bursts of different items, which may be separated by time gaps. We evaluated our method on the task of discovering transliterations of named entities from multilingual Web news streams. Experimental results indicate that our method not only effectively discovers named entities with correlated bursts in multilingual Web news streams, but also outperforms two state-of-the-art baseline methods for unsupervised discovery of transliterations in static text collections.
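The two-stage idea can be illustrated with a minimal Python sketch. The abstract does not give the model parameters, the fitting procedure, or the dynamic programming recurrence, so everything below is an assumption made for illustration: a two-state Poisson hidden Markov model with hand-picked rates stands in for the Markov-modulated Poisson process, and a banded longest-common-subsequence recurrence stands in for the correlation algorithm. The names `burst_states`, `correlated_burst_score`, `stay_prob`, and `max_gap` are hypothetical.

```python
import math


def poisson_logpmf(k, rate):
    # log P(K = k) for a Poisson distribution with the given rate (> 0)
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def burst_states(counts, base_rate, burst_rate, stay_prob=0.9):
    """Stage 1 (assumed form): Viterbi decoding of a two-state Poisson HMM.

    Returns one label per time step: 0 = normal, 1 = bursty. Because the
    output is binary, streams with very different volumes become comparable.
    The rates are supplied by hand here, not fitted to the data.
    """
    rates = (base_rate, burst_rate)
    log_stay, log_switch = math.log(stay_prob), math.log(1 - stay_prob)
    # score[s] = best log-probability of any state path ending in state s
    score = [math.log(0.5) + poisson_logpmf(counts[0], r) for r in rates]
    back = []
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[p] + (log_stay if p == s else log_switch)
                     for p in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + poisson_logpmf(k, rates[s]))
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def correlated_burst_score(a, b, max_gap=2):
    """Stage 2 (assumed form): banded LCS over two binary burst sequences.

    Counts the largest order-preserving set of burst time steps in `a` that
    can be paired with burst time steps in `b` at most `max_gap` steps away,
    normalized by the larger number of burst steps.
    """
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
            if a[i - 1] == 1 and b[j - 1] == 1 and abs(i - j) <= max_gap:
                dp[i][j] = max(dp[i][j], dp[i - 1][j - 1] + 1)
    longest = max(sum(a), sum(b))
    return dp[n][m] / longest if longest else 0.0


# Made-up daily mention counts for one entity in two streams: the second
# stream is roughly ten times larger and its burst starts one day later.
stream_a = [1, 0, 2, 1, 9, 11, 10, 1, 0, 1, 2, 1]
stream_b = [10, 12, 9, 11, 10, 95, 110, 100, 12, 9, 10, 11]
states_a = burst_states(stream_a, base_rate=1.0, burst_rate=10.0)
states_b = burst_states(stream_b, base_rate=10.0, burst_rate=100.0)
print(states_a, states_b, correlated_burst_score(states_a, states_b))
```

In this toy setup, reducing each stream to binary burst labels is what handles the difference in magnitude, and the `max_gap` band is what tolerates a time shift between bursts. A fuller treatment would estimate the rates and transition probabilities from the data instead of fixing them.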