Similarity of documents based on the vector sequence model

  • Authors: Akihiro Yamamoto; Akira Ogiso
  • Affiliations: Graduate School of Informatics, Kyoto University, Kyoto, Japan; Mitsubishi Motors Corporation
  • Venue: IHI'04: Proceedings of the 2004 International Conference on Intuitive Human Interfaces for Organizing and Accessing Intellectual Assets
  • Year: 2004

Abstract

In this paper we propose a new method for searching natural language documents. The method is based on the vector-sequence model, in which every document is transformed into a sequence of document vectors. The model is intended to clarify the dynamics of keyword usage within each document. To find similar documents in the model, we formalize the Length-Based Refinement (LBR for short) of a sequence of documents. A document management system based on LBRs requires users to give a query in the form of a document, but lets them search documents in a way quite different from keyword-based search. By developing the system, we try to show that the search mechanism based on LBRs can be regarded as a type of intuitive access to documents.
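
As a rough illustration of the idea of turning a document into a sequence of document vectors, the following Python sketch may help. It is not the authors' construction: the abstract does not say how a document is segmented or how the vectors are weighted, so the sketch assumes fixed-size word windows and raw term counts. The function name `document_to_vector_sequence` and the `window` parameter are hypothetical, and LBR itself is not implemented because the abstract does not define it.

```python
from collections import Counter


def document_to_vector_sequence(text, window=4):
    """Return one term-count vector per consecutive window of words.

    Assumption for illustration only: segments are fixed-size word windows
    and each vector is a bag of raw term counts.
    """
    words = text.lower().split()
    return [
        Counter(words[i:i + window])
        for i in range(0, len(words), window)
    ]


doc = "the cat sat on the mat the dog sat on the log"
seq = document_to_vector_sequence(doc, window=4)

# Usage of a single keyword across the sequence, in document order.
the_usage = [vector["the"] for vector in seq]
print(the_usage)
```

Reading one keyword's count across the sequence shows how its usage varies through the document, which a single whole-document vector would hide.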