Paraphrase identification using machine learning techniques

  • Authors:
  • A. Chitra;C. S. Saravana Kumar

  • Affiliations:
  • Department of Computer Science, PSG College of Technology, Coimbatore, India;Department of Computer Science, PSG College of Technology, Coimbatore, India

  • Venue:
  • ICNVS'10 Proceedings of the 12th international conference on Networking, VLSI and signal processing
  • Year:
  • 2010

Abstract

Paraphrases are different ways of expressing the same content. Two sentences are said to be paraphrases if they are semantically equivalent. Paraphrase identification has numerous applications, such as information extraction and question answering. Traditional systems use threshold values to decide whether two sentences are paraphrases. This threshold determination process is independent of the training data and may lead to incorrect paraphrase reasoning. To avoid threshold settings, we propose to use machine learning techniques. The advantages of an ML approach are its ability to account for a large mass of information and the possibility of incorporating different information sources (morphological, syntactic, and semantic, among others) in a single run. With the objective of improving system performance and developing a machine learning approach for paraphrase identification, we examine the influence of combining lexical and semantic information, as well as techniques for classifier combination.
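
The idea of learning the decision from labelled sentence pairs and then combining several classifiers can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not name its features or classifiers, so the two lexical features (word overlap and length ratio), the decision stumps, the nearest-centroid classifier, the majority vote, and the toy sentence pairs below are all assumptions chosen for illustration. Semantic features are omitted.

```python
def tokenize(sentence):
    """Lowercase whitespace tokenization (an assumed, deliberately simple choice)."""
    return sentence.lower().split()


def lexical_features(a, b):
    """Return [Jaccard word overlap, length ratio] for a sentence pair."""
    ta, tb = tokenize(a), tokenize(b)
    sa, sb = set(ta), set(tb)
    union = sa | sb
    jaccard = len(sa & sb) / len(union) if union else 0.0
    longest = max(len(ta), len(tb))
    length_ratio = min(len(ta), len(tb)) / longest if longest else 0.0
    return [jaccard, length_ratio]


def train_stump(X, y, j):
    """Learn a cut-off on feature j from the training data (not set by hand)."""
    values = sorted({x[j] for x in X})
    # Candidate cut-offs are midpoints between consecutive observed values.
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])] or values

    def correct(t):
        return sum((x[j] >= t) == bool(label) for x, label in zip(X, y))

    best = max(candidates, key=correct)
    return lambda x: int(x[j] >= best)


def train_centroid(X, y):
    """Nearest-centroid classifier over the full feature vector."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    pos = mean([x for x, label in zip(X, y) if label == 1])
    neg = mean([x for x, label in zip(X, y) if label == 0])

    def dist(u, v):
        return sum((p - q) ** 2 for p, q in zip(u, v))

    return lambda x: int(dist(x, pos) <= dist(x, neg))


def train_ensemble(pairs, labels):
    """Train three simple classifiers and combine them by majority vote."""
    X = [lexical_features(a, b) for a, b in pairs]
    members = [
        train_stump(X, labels, 0),
        train_stump(X, labels, 1),
        train_centroid(X, labels),
    ]

    def predict(a, b):
        x = lexical_features(a, b)
        votes = sum(member(x) for member in members)
        return int(votes * 2 > len(members))

    return predict


# Toy training data, invented for this sketch (1 = paraphrase, 0 = not).
pairs = [
    ("the cat sat on the mat", "the cat sat on a mat"),
    ("he bought a new car", "he bought a new red car"),
    ("the cat sat on the mat", "stocks fell sharply today"),
    ("he bought a new car", "rain is expected tomorrow in the north"),
]
labels = [1, 1, 0, 0]

predict = train_ensemble(pairs, labels)
print(predict("a dog ran in the park", "a dog ran through the park"))
print(predict("a dog ran in the park", "interest rates rose again"))
```

In this sketch each cut-off is chosen from the labelled pairs, which is the contrast with a fixed threshold that the abstract draws. A real system would add semantic features to the vector and use stronger base classifiers.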