Text processing for complex domains such as terrorism is complicated by the difficulty of reliably distinguishing relevant texts from irrelevant ones. We have discovered a simple and effective filter, the Relevancy Signatures Algorithm, and demonstrated its performance in the domain of terrorist event descriptions. The Relevancy Signatures Algorithm is based on the natural language processing technique of selective concept extraction and relies on text representations that reflect predictable patterns of linguistic context.

This paper describes text classification experiments conducted in the domain of terrorism using the MUC-3 text corpus. A customized dictionary of about 6,000 words provides the lexical knowledge base needed to discriminate relevant texts, and the CIRCUS sentence analyzer generates relevancy signatures as an effortless side effect of its normal sentence analysis. Although we suspect that the training base available to us from the MUC-3 corpus may not be large enough to provide optimal training, we nevertheless attained relevancy discriminations at significant levels of recall (ranging from 11% to 47%) with 100% precision in half of our test runs.
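To make the reported trade-off between recall and precision concrete, the following is a minimal sketch of a signature-based relevancy filter. It is not the CIRCUS implementation. It assumes, for illustration only, that a signature is a (word, concept) pair, that a signature is kept when it occurs at least `min_count` times in training and at least `min_rate` of those occurrences are in relevant texts, and that a text is classified as relevant when it contains any kept signature. The function names, thresholds, and toy data are all hypothetical.

```python
from collections import defaultdict


def train_signatures(training_texts, min_count=2, min_rate=0.9):
    """Select signatures that reliably co-occur with relevant texts.

    training_texts: list of (signatures, is_relevant) pairs, where
    signatures is a list of (word, concept) tuples (an assumed format).
    min_count and min_rate are hypothetical thresholds.
    """
    seen = defaultdict(int)      # texts containing each signature
    relevant = defaultdict(int)  # relevant texts containing each signature
    for signatures, is_relevant in training_texts:
        for sig in set(signatures):  # count each signature once per text
            seen[sig] += 1
            if is_relevant:
                relevant[sig] += 1
    return {
        sig
        for sig, n in seen.items()
        if n >= min_count and relevant[sig] / n >= min_rate
    }


def classify(signatures, relevancy_signatures):
    """Label a text relevant if it contains any selected signature."""
    return any(sig in relevancy_signatures for sig in signatures)


def precision_recall(predictions, labels):
    """Return (precision, recall); None where a value is undefined."""
    tp = sum(1 for p, l in zip(predictions, labels) if p and l)
    fp = sum(1 for p, l in zip(predictions, labels) if p and not l)
    fn = sum(1 for p, l in zip(predictions, labels) if not p and l)
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return precision, recall


# Hypothetical toy data: signatures as (word, concept) pairs.
BOMBED = ("bombed", "bombing")
KILLED = ("killed", "murder")
DEAD = ("dead", "death")

training = [
    ([BOMBED, DEAD], True),
    ([BOMBED], True),
    ([DEAD], False),
    ([KILLED, DEAD], True),
    ([KILLED], False),
]
test = [
    ([BOMBED], True),
    ([KILLED, DEAD], True),
    ([DEAD], False),
    ([], False),
]

selected = train_signatures(training)
predictions = [classify(sigs, selected) for sigs, _ in test]
precision, recall = precision_recall(predictions, [label for _, label in test])
print(selected, precision, recall)
```

Under these assumptions, a strict `min_rate` keeps only signatures that never misfire in training. This favors precision and lets relevant texts without such a signature go undetected, which is the kind of partial recall at perfect precision described above.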