Topic detection in large, noisy data collections such as social media must address both scalability and accuracy. KeyGraph is an efficient method that improves on current solutions by exploiting keyword co-occurrence. We show that KeyGraph achieves accuracy comparable to state-of-the-art approaches on small, well-annotated collections, and that it can successfully filter irrelevant documents and identify events in large, noisy social media collections. An extensive evaluation using Amazon's Mechanical Turk demonstrated KeyGraph's improved accuracy and high precision, as well as runtime performance superior to that of other solutions.
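To make the idea of keyword co-occurrence concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each document has already been reduced to a list of keywords, links two keywords when they appear together in at least `min_cooccur` documents, and then groups linked keywords with plain connected components. The connected-components step is a deliberately simple stand-in for whatever graph clustering a real system would use. The function names, the `min_cooccur` threshold, and the toy documents are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence_graph(docs, min_cooccur=2):
    """Return {keyword: set(neighbors)} for pairs sharing >= min_cooccur documents."""
    pair_counts = Counter()
    for doc in docs:
        keywords = sorted(set(doc))  # count each keyword once per document
        pair_counts.update(combinations(keywords, 2))
    graph = defaultdict(set)
    for (a, b), n in pair_counts.items():
        if n >= min_cooccur:  # weakly linked (likely noisy) pairs are dropped
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Group keywords into components (assumed stand-in for community detection)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return components


# Toy, made-up documents for illustration only.
docs = [
    ["earthquake", "japan", "tsunami"],
    ["earthquake", "tsunami", "warning"],
    ["election", "vote", "senate"],
    ["election", "vote", "poll"],
    ["lunch", "coffee"],
]
graph = build_cooccurrence_graph(docs, min_cooccur=2)
topics = connected_components(graph)
```

In this toy run, keywords that co-occur only once (such as "lunch" and "coffee") never enter the graph, which illustrates in miniature how a co-occurrence threshold can filter out unrelated content before topics are formed.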