Supervised TextRank

Authors:
Fermín Cruz;José A. Troyano;Fernando Enríquez
Affiliations:
Department of Languages and Computer Systems, University of Seville, Sevilla, Spain;Department of Languages and Computer Systems, University of Seville, Sevilla, Spain;Department of Languages and Computer Systems, University of Seville, Sevilla, Spain
Venue:
FinTAL'06 Proceedings of the 5th international conference on Advances in Natural Language Processing
Year:
2006

Abstract

In this paper we investigate how to adapt the TextRank method so that it works in a supervised way. TextRank is a graph-based method that applies the ideas behind PageRank, the ranking algorithm used by Google, to Natural Language Processing (NLP) tasks. This approach has given very good results in many NLP tasks, such as text summarization, keyword extraction and word sense disambiguation. In all these tasks TextRank operates in an unsupervised way, without using any training corpus. Our main contribution is a method that allows TextRank to be applied to a graph that includes information generated from a tagged training corpus. We have tested our method on the Part-of-Speech (POS) tagging task, comparing its results with those obtained by tools specialized in this task. The performance of our system is close to that of these tools, and it improves on the results of two of them when the corpus tagset is large and the tagging task is therefore more complicated.
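
As an illustration of the ranking idea that TextRank borrows from PageRank, the following is a minimal sketch of a weighted PageRank iteration over a small graph. It is not the supervised method described in the abstract: the function name `pagerank`, the damping value of 0.85, the iteration count and the toy graph are all assumptions chosen for illustration, and the construction of a graph from a tagged corpus is not reproduced here.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Score nodes with a weighted PageRank-style iteration.

    edges: dict mapping (source, target) -> positive weight.
    Assumes every node has at least one outgoing edge, so no score
    mass is lost to dangling nodes (an illustrative simplification).
    """
    nodes = sorted({node for edge in edges for node in edge})

    # Total outgoing weight per node, used to normalize each vote.
    out_weight = defaultdict(float)
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight

    # Start from a uniform distribution over the nodes.
    scores = {node: 1.0 / len(nodes) for node in nodes}

    for _ in range(iterations):
        # Every node receives the same baseline share ...
        new_scores = {node: (1.0 - damping) / len(nodes) for node in nodes}
        # ... plus a weighted share of the score of each node linking to it.
        for (src, dst), weight in edges.items():
            new_scores[dst] += damping * scores[src] * weight / out_weight[src]
        scores = new_scores

    return scores


# Hypothetical toy graph: undirected links written as pairs of directed edges.
toy_edges = {
    ("a", "b"): 1.0, ("b", "a"): 1.0,
    ("b", "c"): 1.0, ("c", "b"): 1.0,
    ("a", "c"): 1.0, ("c", "a"): 1.0,
    ("a", "d"): 1.0, ("d", "a"): 1.0,
}
scores = pagerank(toy_edges)
best = max(scores, key=scores.get)
```

In this toy graph the node with the most connections, `a`, receives the highest score. In a TextRank-style application the nodes would be text units such as words, sentences or candidate tags, and the edge weights would encode some relation between them.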