Documents similarity measurement using field association terms

Authors:
El-Sayed Atlam;M. Fuketa;K. Morita;Jun-ichi Aoe
Affiliations:
Department of Information Science and Intelligent Systems, University of Tokushima, Tokushima 770-8506, Japan;Department of Information Science and Intelligent Systems, University of Tokushima, Tokushima 770-8506, Japan;Department of Information Science and Intelligent Systems, University of Tokushima, Tokushima 770-8506, Japan;Department of Information Science and Intelligent Systems, University of Tokushima, Tokushima 770-8506, Japan
Venue:
Information Processing and Management: an International Journal
Year:
2003

Abstract

Conventional approaches to text analysis and information retrieval that measure document similarity by considering all of the information in texts are relatively inefficient for processing large text collections in heterogeneous subject areas. This paper outlines a new text manipulation system, FA-Sim, that is useful for retrieving information in large heterogeneous texts and for recognizing content similarity in text excerpts. FA-Sim is based on flexible text matching procedures carried out in various contexts and at various field ranks. FA-Sim measures text similarity by using specific field association (FA) terms instead of comparing all text information. Similarity measurement with FA-Sim is faster, and the measured similarity higher, than with two other analysis methods. As a result, Recall and Precision improved significantly, by 39% and 37% respectively, over these two traditional methods.
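
The general idea of comparing texts through FA terms only, rather than through every word, can be illustrated with a minimal sketch. This is not the FA-Sim algorithm from the paper: the tiny `FA_TERMS` dictionary, the field names, the function names, and the use of cosine similarity over per-field counts are all assumptions made for illustration, and contexts and field ranks are not modeled.

```python
from collections import Counter
from math import sqrt

# Hypothetical FA-term dictionary (term -> field); illustrative only,
# not taken from the paper.
FA_TERMS = {
    "pitcher": "baseball",
    "catcher": "baseball",
    "goalkeeper": "soccer",
    "election": "politics",
    "senate": "politics",
}


def fa_profile(text, fa_terms=FA_TERMS):
    """Count FA-term occurrences per field, ignoring every other word."""
    words = (w.strip(".,;:!?") for w in text.lower().split())
    return Counter(fa_terms[w] for w in words if w in fa_terms)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text with no FA terms has no field profile
    return dot / (norm_a * norm_b)


def fa_similarity(text1, text2):
    """Compare two texts by their FA-term field profiles only."""
    return cosine(fa_profile(text1), fa_profile(text2))


doc_a = "The pitcher threw to the catcher."
doc_b = "A pitcher won the game."
doc_c = "The election and the senate vote."
print(fa_similarity(doc_a, doc_b))
print(fa_similarity(doc_a, doc_c))
```

Because only dictionary terms are counted, each text reduces to a short field profile, which is one plausible reason such an approach could be cheaper than comparing all words.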