Information clustering based on fuzzy multisets

  • Authors: Sadaaki Miyamoto
  • Affiliations: Institute of Engineering Mechanics and Systems, University of Tsukuba, Ibaraki 305-8573, Japan
  • Venue: Information Processing and Management: an International Journal - Modelling vagueness and subjectivity in information access
  • Year: 2003

Abstract

A fuzzy multiset model for information clustering is proposed, with application to information retrieval on the World Wide Web. Because a search engine retrieves multiple occurrences of the same subject, possibly with different degrees of relevance, fuzzy multisets provide an appropriate model of information retrieval on the WWW. Information clustering, which here means both term clustering and document clustering, is considered. Three methods are proposed: hard c-means, fuzzy c-means, and an agglomerative method using cluster centers. Two distances between fuzzy multisets are defined, together with algorithms for calculating cluster centers. Theoretical properties of the clustering algorithms are studied, and illustrative examples show how the algorithms work.
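
To make the idea of a fuzzy multiset concrete, the following is a minimal Python sketch and not the paper's own formulation. It assumes the common convention in which each element carries a list of membership grades, one per occurrence, sorted in decreasing order and padded with zeros for comparison. The abstract does not specify the paper's two distances, so the L1-style distance here is only an illustrative assumption, and the names `normalize`, `l1_distance`, `doc1`, and `doc2` are hypothetical.

```python
from itertools import zip_longest


def normalize(fms):
    """Sort each element's membership grades in decreasing order, dropping zeros.

    `fms` maps an element (e.g. a term) to a list of grades in (0, 1],
    one grade per occurrence of that element.
    """
    return {
        x: sorted((g for g in grades if g > 0), reverse=True)
        for x, grades in fms.items()
    }


def l1_distance(a, b):
    """Illustrative L1-style distance between two fuzzy multisets (an assumption).

    Grades are compared position by position after sorting; the shorter
    sequence is padded with zeros.
    """
    a, b = normalize(a), normalize(b)
    total = 0.0
    for x in set(a) | set(b):
        pairs = zip_longest(a.get(x, []), b.get(x, []), fillvalue=0.0)
        total += sum(abs(ga - gb) for ga, gb in pairs)
    return total


# Hypothetical data: terms occurring in two retrieved pages, where each
# occurrence has its own relevance grade.
doc1 = {"fuzzy": [0.9, 0.4], "cluster": [0.7]}
doc2 = {"fuzzy": [0.5], "web": [0.6, 0.2]}

print(l1_distance(doc1, doc2))
```

A distance of this kind could then drive a c-means or agglomerative procedure, but the specific distances and center computations would need to be taken from the paper itself.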