Recall, precision, and the F-measure are used to score the performance of the systems participating in MUC-4. A difference in scores between any two systems may be due to chance, or it may reflect a significant difference between the two systems. To assess whether a difference can be attributed to chance, statistical hypothesis testing is used. The testing method is a computationally intensive one known as approximate randomization. This paper discusses the method and the statistical significance of the results for the two MUC-4 test sets, TST3 and TST4.
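As an illustration of the general idea behind approximate randomization, the sketch below compares two systems on paired per-message results. It is a minimal sketch under stated assumptions, not the procedure used in the paper: each message is assumed to be summarized by a hypothetical `(correct, possible, actual)` count triple, partial credit is ignored, and the function names, the number of shuffles, and the seed are chosen only for this example.

```python
import random


def f_measure(counts, beta=1.0):
    """F-measure over a list of per-message (correct, possible, actual) triples.

    Assumption: partial matches are ignored, so recall = correct / possible
    and precision = correct / actual, pooled over all messages.
    """
    correct = sum(c for c, _, _ in counts)
    possible = sum(p for _, p, _ in counts)
    actual = sum(a for _, _, a in counts)
    recall = correct / possible if possible else 0.0
    precision = correct / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (b2 + 1) * precision * recall / (b2 * precision + recall)


def approximate_randomization(sys_a, sys_b, metric=f_measure,
                              shuffles=9999, seed=0):
    """Estimate how often a score gap at least as large as the observed one
    arises when the two systems' outputs are exchanged at random.

    sys_a and sys_b are equal-length lists of per-message triples, aligned
    so that position i refers to the same message in both lists.
    """
    rng = random.Random(seed)
    observed = abs(metric(sys_a) - metric(sys_b))
    at_least_as_large = 0
    for _ in range(shuffles):
        pseudo_a, pseudo_b = [], []
        for a, b in zip(sys_a, sys_b):
            # Swap the two systems' results for this message half the time.
            if rng.random() < 0.5:
                a, b = b, a
            pseudo_a.append(a)
            pseudo_b.append(b)
        if abs(metric(pseudo_a) - metric(pseudo_b)) >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_large + 1) / (shuffles + 1)
```

A small returned value suggests that the observed difference would rarely arise from random relabeling of the two systems' outputs. The same function can be applied to recall or precision by passing a different `metric`.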