Information extraction with automatic knowledge expansion

Authors:
Hanmin Jung;Eunji Yi;Dongseok Kim;Gary Geunbae Lee
Affiliations:
Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31, Hyoja-dong, Nam-gu, Pohang, Kyungbuk 790-784, South Korea (all authors)
Venue:
Information Processing and Management: an International Journal
Year:
2005

Abstract

POSIE (POSTECH Information Extraction System) is an information extraction system that uses multiple learning strategies, i.e., SmL, user-oriented learning, and separate-context learning, in a question answering framework. POSIE replaces laborious annotation with automatic instance extraction by SmL from structured Web documents, and places the user at the end of the user-oriented learning cycle. Treating information extraction as question answering simplifies the extraction procedures for a set of slots. We bring techniques verified in the question answering framework, such as domain knowledge and instance rules, to the information extraction problem. To incrementally improve extraction performance, user-oriented learning followed by separate-context learning produces context rules and generalizes them in both the learning and extraction phases. Experiments on the "continuing education" domain show an initial F1-measure of 0.477 and recall of 0.748 with no user training. As the number of training documents grows, however, the F1-measure exceeds 0.75 with recall of 0.772. We also obtain an F-measure of about 0.9 for five of the seven slots in the "job offering" domain.
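
The abstract reports F1 and recall but not precision. Assuming the F1-measure is the usual balanced harmonic mean, F1 = 2PR / (P + R), which the abstract does not state explicitly, the implied precision can be recovered as P = F1·R / (2R − F1). The sketch below is only an illustration of that arithmetic; the function name `implied_precision` is hypothetical and is not part of POSIE.

```python
def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, assuming the balanced F1 definition."""
    denom = 2 * recall - f1
    if denom <= 0:
        # F1 can never reach 2R, so such inputs are inconsistent.
        raise ValueError("F1 must be smaller than 2 * recall")
    return f1 * recall / denom


# Figures quoted in the abstract for the "continuing education" domain.
# The second F1 is a lower bound ("beyond 0.75"), so the implied
# precision for that row is a lower bound as well.
reported = [
    ("no user training", 0.477, 0.748),
    ("larger training set", 0.75, 0.772),
]

for label, f1, recall in reported:
    p = implied_precision(f1, recall)
    print(f"{label}: recall={recall:.3f}, F1={f1:.3f}, implied precision={p:.3f}")
```

Under this assumption, the untrained system's precision works out to roughly 0.35, and the trained system's to at least about 0.73, which suggests that most of the F1 gain comes from precision, since recall moves only from 0.748 to 0.772.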