Extracting significant time varying features from text

Authors:
Russell Swan; James Allan
Affiliations:
Center for Intelligent Information Retrieval, Department of Computer Science, University of Massachusetts, Amherst, Massachusetts (both authors)
Venue:
Proceedings of the eighth international conference on Information and knowledge management
Year:
1999

Abstract

We propose a simple statistical model for the frequency of occurrence of features in a stream of text. Adopting this model allows us to use classical significance tests to filter the stream for interesting events. We tested the model by building a system and running it on a news corpus. In a subjective evaluation, the system worked remarkably well: almost all of the groups of identified tokens corresponded to news stories and were appropriately placed in time. A preliminary objective evaluation of the system's quality showed both some of the weaknesses and the power of our approach.
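
The abstract does not say which significance test is applied, so the following is only a minimal sketch of the general idea, assuming a Pearson chi-square test on a 2x2 contingency table (documents inside versus outside a time window, with versus without the feature). The function names, the windowing scheme, and the threshold `alpha` are hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for the table
        [[a, b],
         [c, d]]
    Returns 0.0 when any row or column total is zero.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def p_value_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def significant_windows(docs_with_feature, docs_total, alpha=0.01):
    """Return indices of time windows where the feature is significantly
    over-represented relative to the rest of the stream.

    docs_with_feature[i]: documents in window i that contain the feature
    docs_total[i]:        all documents in window i
    alpha:                assumed significance threshold (illustrative)
    """
    total_with = sum(docs_with_feature)
    total_all = sum(docs_total)
    flagged = []
    for i, (k, n) in enumerate(zip(docs_with_feature, docs_total)):
        a = k                        # in window, feature present
        b = n - k                    # in window, feature absent
        c = total_with - k           # outside window, feature present
        d = (total_all - n) - c      # outside window, feature absent
        stat = chi_square_2x2(a, b, c, d)
        # Only bursts are of interest here, so ignore under-representation.
        over = n > 0 and total_all > 0 and k / n > total_with / total_all
        if over and p_value_1df(stat) < alpha:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Five equally sized windows; the feature spikes in the third one.
    print(significant_windows([2, 3, 40, 2, 3], [100] * 5))
```

With the toy counts above, only the third window (index 2) is flagged. Testing many features and windows at once would call for a stricter threshold or a multiple-comparison correction, which this sketch leaves out.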