LearningPinocchio: adaptive information extraction for real world applications

Authors:
F. Ciravegna; A. Lavelli
Affiliations:
Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP Sheffield, UK; e-mail: F.Ciravegna@dcs.shef.ac.uk
ITC-irst Centro per la Ricerca Scientifica e Tecnologica, via Sommarive 18, 38050 Povo (TN), Italy; e-mail: lavelli@itc.it
Venue:
Natural Language Engineering
Year:
2004



Abstract

The new frontier of research on Information Extraction (IE) from texts is portability without any knowledge of Natural Language Processing. The market potential is, in principle, very large, provided that a suitable, easy-to-use and effective methodology is available. In this paper we describe LearningPinocchio, a system for adaptive Information Extraction from texts that has had good commercial and scientific success. Real-world applications have been built, and evaluation licenses have been released to external companies for application development. We outline the basic algorithm behind the system and present a number of applications developed with LearningPinocchio. We then report on an evaluation performed by an independent company. Finally, we discuss the general suitability of this IE technology for real-world applications and draw some conclusions.