Knowledge extraction from texts by SINTESI

Authors:
Fabio Ciravegna; Paolo Campia; Alberto Colognese
Affiliations:
Centro Ricerche Fiat, Orbassano (To), Italy; Centro Ricerche Fiat, Orbassano (To), Italy; Centro Ricerche Fiat, Orbassano (To), Italy
Venue:
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 4
Year:
1992

Abstract

In this paper we present SINTESI, a system for knowledge extraction from Italian input, currently under development at our research centre. It is applied to short descriptive diagnostic texts in order to summarise their technical content and to build a knowledge base of faults. These texts often involve complex linguistic constructions such as conjunction, negation, ellipsis and anaphora. Extragrammaticalities and implicit knowledge are also frequent, especially because the texts are written in a sublanguage. SINTESI extracts the diagnostic information by performing a full text analysis; it is based on a semantics-driven approach complemented by a general syntactic module, and it is able to cope with the complexity of the (sub)language while maintaining both accuracy and robustness. So far the system has been tested on about 1,000 texts and by a few users; in the near future it will be used by dozens of users every day.