Paraphrase identification using machine learning techniques

Authors:
A. Chitra;C. S. Saravana Kumar
Affiliations:
Department of Computer Science, PSG College of Technology, Coimbatore, India;Department of Computer Science, PSG College of Technology, Coimbatore, India
Venue:
ICNVS'10 Proceedings of the 12th international conference on Networking, VLSI and signal processing
Year:
2010

Abstract

Paraphrases are different ways of expressing the same content. Two sentences are said to be paraphrases if they are semantically equivalent. Paraphrase identification has numerous applications, such as information extraction and question answering. Traditional systems use threshold values to decide whether two sentences are paraphrases. This threshold determination process is independent of the training data and may lead to incorrect paraphrase decisions. To avoid threshold settings, we propose to use machine learning techniques. The advantages of an ML approach are its ability to account for a large mass of information and the possibility of incorporating different information sources, such as morphological, syntactic, and semantic information, in a single execution. With the objectives of increasing system performance and developing a machine learning approach for paraphrase identification, we examine the influence of combining lexical and semantic information, as well as techniques for classifier combination.