Hardware support for language aware information mining

  • Authors:
  • Michael Freeman; Thimal Jayasooriya

  • Affiliations:
  • Department of Computer Science, University of York, UK; Department of Computer Science, University of York, UK

  • Venue:
  • KES'06 Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part III
  • Year:
  • 2006

Abstract

Information retrieval from text, or 'text mining', is the process of extracting interesting and non-trivial knowledge from unstructured text. With the ever-increasing amounts of information stored on the web or archived within computing systems, high-performance data processing architectures are required to process this data in real time. The aim of the work presented in this paper is the development of a hardware text mining IP-Core for use in FPGA-based systems. We describe the pre-processing engine we have developed for the PRESENCE II PCI card, which accelerates the identification of significant words within a document and logs their frequency and position. The performance of this system is then compared to an equivalent software implementation using the Lucene software package.
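
As a rough illustration of the pre-processing task the abstract describes (identifying significant words and logging their frequency and position), the following is a minimal software sketch in Python. It is not the authors' hardware design and does not use the Lucene API. The stop-word list, the tokenization rule, and the function names `index_significant_words` and `summarize` are all assumptions made for this example; the record does not state how the paper defines a "significant" word.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list: an assumption for this sketch only.
# The paper's actual criteria for "significant" words are not given here.
STOP_WORDS = frozenset({"a", "an", "and", "in", "is", "of", "on", "or", "the", "to"})


def index_significant_words(text, stop_words=STOP_WORDS):
    """Map each non-stop-word to the list of token positions where it occurs."""
    positions = defaultdict(list)
    # Tokenize on runs of lowercase letters and digits (an assumed rule).
    for position, match in enumerate(re.finditer(r"[a-z0-9]+", text.lower())):
        word = match.group()
        if word not in stop_words:
            positions[word].append(position)
    return dict(positions)


def summarize(positions):
    """Pair each word with its frequency and its position list."""
    return {word: (len(pos), pos) for word, pos in positions.items()}


if __name__ == "__main__":
    index = index_significant_words("The cat sat on the mat and the cat slept")
    for word, (count, where) in summarize(index).items():
        print(word, count, where)
```

Positions here are zero-based token offsets that count stop words as well, so the distance between two logged words reflects their distance in the original text. A hardware engine might instead log byte or character offsets; that choice is outside what this record specifies.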