Text mining using Markov chains of variable length

  • Authors: Björn Hoffmeister; Thomas Zeugmann

  • Affiliations: RWTH Aachen, Lehrstuhl für Informatik VI, Aachen; Division of Computer Science, Hokkaido University, Sapporo, Japan

  • Venue: Proceedings of the 2005 international conference on Federation over the Web
  • Year: 2005

Abstract

When dealing with knowledge federation over text documents, one has to determine whether or not documents are related by context. A new approach to this problem is proposed, leading to the design of a new search engine for literature research and related problems. The idea is that the user already has some documents of interest. These documents are taken as input, and all documents known to a classical search engine are then ranked according to their relevance. To achieve this goal, we use Markov chains of variable length. The algorithms developed have been implemented and tested on the Reuters-21578 data set.
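
The abstract does not spell out the algorithms, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a word-level model that counts next-word frequencies for every context of up to `max_order` words, backs off to the longest context seen in training, and ranks candidate documents by their average log-probability under a model trained on the documents of interest. The class name `VariableOrderMarkovModel`, the whitespace tokenizer, the back-off rule, and the smoothing constant `alpha` are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


class VariableOrderMarkovModel:
    """Hypothetical variable-order model: next-word counts for contexts of length 0..max_order."""

    def __init__(self, max_order=2, alpha=0.1):
        self.max_order = max_order
        self.alpha = alpha  # additive smoothing constant (illustrative value)
        self.counts = defaultdict(Counter)  # context tuple -> Counter of next words
        self.vocab = set()

    def train(self, documents):
        for doc in documents:
            tokens = tokenize(doc)
            self.vocab.update(tokens)
            for i, word in enumerate(tokens):
                # Record the word under every context length available at position i.
                for k in range(min(i, self.max_order) + 1):
                    self.counts[tuple(tokens[i - k:i])][word] += 1

    def log_prob(self, word, history):
        """Smoothed log-probability of `word`, using the longest context seen in training."""
        context = ()
        for k in range(min(len(history), self.max_order), -1, -1):
            context = tuple(history[len(history) - k:])
            if context in self.counts:
                break
        seen = self.counts.get(context, Counter())
        total = sum(seen.values())
        size = len(self.vocab) + 1  # +1 reserves probability mass for unseen words
        return math.log((seen[word] + self.alpha) / (total + self.alpha * size))

    def score(self, document):
        """Average per-word log-probability; higher means closer to the training documents."""
        tokens = tokenize(document)
        if not tokens:
            return float("-inf")
        total = sum(
            self.log_prob(word, tokens[max(0, i - self.max_order):i])
            for i, word in enumerate(tokens)
        )
        return total / len(tokens)


def rank(documents_of_interest, candidates, max_order=2):
    """Rank candidates by their score under a model of the documents of interest."""
    model = VariableOrderMarkovModel(max_order=max_order)
    model.train(documents_of_interest)
    return sorted(candidates, key=model.score, reverse=True)


if __name__ == "__main__":
    seeds = [
        "markov chains model word sequences",
        "variable length markov chains rank documents",
    ]
    candidates = [
        "stock prices fell sharply today",
        "markov chains rank documents",
    ]
    for doc in rank(seeds, candidates):
        print(doc)
```

Averaging the log-probability over the document length keeps long candidates from being penalized merely for having more words. A real system would also need a principled way to choose the context length and to obtain the candidate set from the underlying search engine.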