Selective integration of background knowledge in TCBR systems

  • Authors:
  • Anil Patelia; Sutanu Chakraborti; Nirmalie Wiratunga

  • Affiliations:
  • Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India; Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India; School of Computing, The Robert Gordon University, Aberdeen, Scotland, UK

  • Venue:
  • ICCBR'11 Proceedings of the 19th international conference on Case-Based Reasoning Research and Development
  • Year:
  • 2011

Abstract

This paper explores how background knowledge from freely available web resources can be used for Textual Case-Based Reasoning (TCBR). The work reported here extends the existing Explicit Semantic Analysis (ESA) approach to representation, in which textual content is represented using concepts that correspond to Wikipedia articles. We present approaches for identifying Wikipedia pages that are likely to contribute to the effectiveness of text classification tasks. We also empirically study the effect of modelling semantic similarity between concepts (that is, between Wikipedia articles). We conclude that integrating background knowledge from resources like Wikipedia into TCBR tasks is promising, as it can improve system effectiveness even without elaborate manual knowledge engineering. Significant performance gains are obtained using a very small number of features that correspond closely to how humans describe the domain.
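
As a rough illustration of the Explicit Semantic Analysis style of representation mentioned in the abstract, the sketch below maps a text onto a vector with one dimension per concept article and compares texts in that concept space. It is a minimal toy under stated assumptions, not the authors' implementation: the three "articles" in `CONCEPTS` are invented stand-ins rather than real Wikipedia content, the TF-IDF weighting is one common choice, and all function names are hypothetical. The paper's own contributions (selecting useful pages and modelling similarity between concepts) are not reproduced here.

```python
import math
from collections import Counter

# Toy stand-ins for Wikipedia articles (invented text, for illustration only).
CONCEPTS = {
    "Computer hardware": "processor memory disk motherboard processor hardware",
    "Ice hockey": "hockey puck ice goal team player",
    "Baseball": "baseball pitcher bat inning team player",
}


def tokenize(text):
    """Lowercase whitespace tokenizer; a real system would do more."""
    return text.lower().split()


def build_inverted_index(concepts):
    """Map each term to {concept name: TF-IDF weight of the term in that article}."""
    n = len(concepts)
    term_counts = {name: Counter(tokenize(body)) for name, body in concepts.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    index = {}
    for name, counts in term_counts.items():
        for term, tf in counts.items():
            idf = math.log(n / doc_freq[term])
            if idf > 0:  # terms found in every article carry no signal
                index.setdefault(term, {})[name] = tf * idf
    return index


def esa_vector(text, index, concepts):
    """Represent a text as a weighted vector over concepts."""
    vec = {name: 0.0 for name in concepts}
    for term, tf in Counter(tokenize(text)).items():
        for name, weight in index.get(term, {}).items():
            vec[name] += tf * weight
    return vec


def cosine(u, v):
    """Cosine similarity between two concept vectors with identical keys."""
    dot = sum(u[k] * v[k] for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


INDEX = build_inverted_index(CONCEPTS)
doc_a = esa_vector("the pitcher threw to the bat", INDEX, CONCEPTS)
doc_b = esa_vector("the inning ended", INDEX, CONCEPTS)
doc_c = esa_vector("the puck crossed the ice", INDEX, CONCEPTS)

print(cosine(doc_a, doc_b))  # both texts activate only the "Baseball" concept
print(cosine(doc_a, doc_c))  # the texts activate disjoint concepts
```

In this toy, `doc_a` and `doc_b` share no content words, yet they come out as similar because both activate the same concept. Restricting `CONCEPTS` to a smaller, domain-relevant subset is the kind of knob that a selective approach would tune, though how the paper chooses that subset is not shown here.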