SRI International: description of the TACITUS system as used for MUC-3
MUC3 '91 Proceedings of the 3rd conference on Message understanding
Partial parsing via finite-state cascades
Natural Language Engineering
Using decision trees for coreference resolution
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
The system that SRI used for the MUC-4 evaluation represents a significant departure from system architectures that have been employed in the past. In MUC-2 and MUC-3, SRI used the TACITUS text processing system [1], which was based on the DIALOGIC parser and grammar and an abductive reasoner for Horn-clause logic. For MUC-4, SRI designed a new system called FASTUS (a permutation of the initial letters in Finite State Automata-based Text Understanding System), which we feel represents a significant advance in the state of the art of text processing. The system shares certain modules with the earlier TACITUS system, namely modules for text preprocessing and standardization, spelling correction, Hispanic name recognition, and the core lexicon. However, the DIALOGIC system and abductive reasoner, which were the heart and soul of the previous system, were replaced by a system whose architecture is based on cascaded finite-state automata. Using this system, we were able to achieve a significant level of performance on the MUC-4 task with less than one month devoted to domain-specific development. In addition, the system is extremely fast: it processes texts at a rate of approximately 3,200 words per minute, measured in CPU time on a Sun SPARC-2 processor. (Measured according to elapsed real time, the system is about 50% slower, but the observed time depends on the particular hardware configuration involved.)
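To make the phrase "cascaded finite-state automata" concrete, the following is a minimal two-stage sketch of the general idea, not the FASTUS grammar itself. The lexicon, the one-letter tag codes, the patterns `PHRASE` and `EVENT`, and the function names are all hypothetical and chosen only for illustration. Stage 1 runs a regular pattern over word tags to find noun and verb groups; stage 2 runs a second regular pattern over the output of stage 1.

```python
import re

# Hypothetical toy lexicon: word -> one-letter part-of-speech code.
# (Illustrative only; not taken from the FASTUS or TACITUS lexicon.)
LEXICON = {
    "the": "D", "a": "D",                      # determiners
    "armed": "A", "local": "A",                # adjectives
    "guerrillas": "N", "embassy": "N",         # nouns
    "attacked": "V", "destroyed": "V",         # main verbs
    "have": "X",                               # auxiliary
}

# Stage 1: a regular pattern over the tag string. Each match is one phrase.
PHRASE = re.compile(r"(?P<NG>D?A*N+)|(?P<VG>X?V)")

# Stage 2: a regular pattern over the phrase labels produced by stage 1.
EVENT = re.compile(r"NGVGNG")


def tag(words):
    """Map each word to a one-letter tag; unknown words become 'O'."""
    return "".join(LEXICON.get(w.lower(), "O") for w in words)


def chunk(words):
    """Stage 1: group words into (label, text) noun and verb groups."""
    tags = tag(words)  # one character per word, so offsets are word indices
    return [
        (m.lastgroup, " ".join(words[m.start():m.end()]))
        for m in PHRASE.finditer(tags)
    ]


def extract(words):
    """Stage 2: find noun-group / verb-group / noun-group sequences."""
    phrases = chunk(words)
    labels = "".join(label for label, _ in phrases)  # two chars per phrase
    events = []
    for m in EVENT.finditer(labels):
        i = m.start() // 2  # convert character offset to phrase index
        events.append({
            "agent": phrases[i][1],
            "action": phrases[i + 1][1],
            "target": phrases[i + 2][1],
        })
    return events


if __name__ == "__main__":
    sentence = "The armed guerrillas have attacked the embassy".split()
    print(chunk(sentence))
    print(extract(sentence))
```

Each stage only scans its input left to right with a regular pattern and hands a shorter, more abstract sequence to the next stage, which is one plausible reason an architecture of this kind can be fast compared with full parsing followed by logical inference.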