A new approach for improving field association term dictionary using passage retrieval

Authors:
Kazuhiro Morita;El-Sayed Atlam;Elmarhomy Ghada;Masao Fuketa;Jun-ichi Aoe
Affiliations:
Department of Information Science and Intelligent Systems, University of Tokushima, Tokushima, Japan;Department of Information Science and Intelligent Systems, University of Tokushima, Tokushima, Japan;Department of Information Science and Intelligent Systems, University of Tokushima, Tokushima, Japan;Department of Information Science and Intelligent Systems, University of Tokushima, Tokushima, Japan;Department of Information Science and Intelligent Systems, University of Tokushima, Tokushima, Japan
Venue:
KES'06 Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part II
Year:
2006

Abstract

Large collections of full-text documents are now commonly used in automated information retrieval. Readers generally identify the subject of a text when they notice specific terms, called Field Association (FA) terms, in that text. Previous research has shown that passage-level evidence can improve retrieval results by dividing documents into coherent units, each corresponding to a subtopic. However, many current approaches extract FA term candidates from whole documents when building an FA term dictionary automatically. This paper proposes a method for automatically building a new FA term dictionary from documents after applying passage retrieval. A WWW search engine is used to extract FA term candidates from passage document corpora. The new FA term candidates in each field are then automatically compared with a previously determined FA term dictionary. Finally, new FA terms selected from the extracted candidates are appended automatically to the existing FA term dictionary. Experimental results show that the new technique, using passage documents, automatically appends about 15% more FA terms from the term candidates to the existing FA term dictionary than the old method. Moreover, recall and precision improved significantly, by 20% and 32% respectively, over the traditional method. The proposed method was applied to 38,372 articles from a large tagged corpus.
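
The dictionary-update loop described in the abstract (extract candidates from passages, compare them with the existing dictionary, append the new ones) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the paper extracts candidates with a WWW search engine, whereas this sketch substitutes simple per-field frequency counting. The fixed-size word window used as a passage, the function names, and the `min_count` and `min_share` thresholds are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def split_into_passages(text, size=50):
    """Split text into consecutive windows of `size` words (an assumed passage model)."""
    words = text.lower().split()
    return [words[i:i + size] for i in range(0, len(words), size)]


def extract_candidates(passages_by_field, min_count=2, min_share=0.8):
    """Return {field: terms} for terms concentrated in a single field.

    Stand-in for the paper's search-engine step: a term is a candidate when it
    occurs at least `min_count` times in a field and that field accounts for at
    least `min_share` of its occurrences (both thresholds are hypothetical).
    """
    counts = defaultdict(Counter)  # field -> term -> frequency
    for field, passages in passages_by_field.items():
        for passage in passages:
            counts[field].update(passage)
    totals = Counter()  # term -> frequency over all fields
    for field_counts in counts.values():
        totals.update(field_counts)
    candidates = defaultdict(set)
    for field, field_counts in counts.items():
        for term, n in field_counts.items():
            if n >= min_count and n / totals[term] >= min_share:
                candidates[field].add(term)
    return candidates


def update_dictionary(dictionary, candidates):
    """Append candidates missing from the dictionary; return the terms added per field."""
    added = {}
    for field, terms in candidates.items():
        new_terms = terms - dictionary.get(field, set())
        dictionary.setdefault(field, set()).update(new_terms)
        added[field] = new_terms
    return added


# Toy data, invented for illustration only.
docs = {
    "sports": "the pitcher threw the ball and the pitcher won the inning",
    "politics": "the senator won the vote and the senator left",
}
passages = {field: split_into_passages(text, size=5) for field, text in docs.items()}
fa_dictionary = {"sports": {"pitcher"}, "politics": set()}
added = update_dictionary(fa_dictionary, extract_candidates(passages))
```

In this toy run, "pitcher" is already in the dictionary and is skipped, "senator" is appended to the politics field, and a word such as "the" is rejected because it is spread across both fields.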