Identifying commented passages of documents using implicit hyperlinks

Authors:
Jean-Yves Delort
Affiliations:
University of Montpellier 2, Montpellier, France
Venue:
Proceedings of the seventeenth conference on Hypertext and hypermedia
Year:
2006

Citing 16
Cited 6

Toward an ecology of hypertext annotation

Proceedings of the ninth ACM conference on Hypertext and hypermedia: links, objects, time and space---structure in hypermedia systems
Machine learning of generic and user-focused summarization

AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Summarizing text documents: sentence selection and evaluation metrics

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Automatically summarising Web sites: is there a way around it?

Proceedings of the ninth international conference on Information and knowledge management
Linking in context

Proceedings of the 12th ACM conference on Hypertext and Hypermedia
Introduction to Modern Information Retrieval

Introduction to Modern Information Retrieval
Mining the peanut gallery: opinion extraction and semantic classification of product reviews

WWW '03 Proceedings of the 12th international conference on World Wide Web
Enhanced web document summarization using hyperlinks

Proceedings of the fourteenth ACM conference on Hypertext and hypermedia
A network-based approach to text handling for the on-line scientific community

A network-based approach to text handling for the on-line scientific community
An automatic extraction of key paragraphs based on context dependency

ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Mining anchor text for query refinement

Proceedings of the 13th international conference on World Wide Web
Abstract generation based on rhetorical structure extraction

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
A decision-based approach to rhetorical parsing

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics
Web-page summarization using clickthrough data

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Determining the semantic orientation of terms through gloss classification

Proceedings of the 14th ACM international conference on Information and knowledge management
Thumbs up?: sentiment classification using machine learning techniques

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10

Comments-oriented blog summarization by sentence extraction

Proceedings of the sixteenth ACM conference on information and knowledge management
Comments-oriented document summarization: understanding documents with readers' feedback

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Measuring the descriptiveness of web comments

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Incremental Personalised Summarisation with Novelty Detection

FQAS '09 Proceedings of the 8th International Conference on Flexible Query Answering Systems
Lexicon-based Comments-oriented News Sentiment Analyzer system

Expert Systems with Applications: An International Journal
Information Retrieval in the Commentsphere

ACM Transactions on Intelligent Systems and Technology (TIST)

Abstract

This paper addresses the problem of automatically selecting passages of blog posts using readers' comments. The problem is difficult because (i) the textual content of blogs is often noisy, (ii) comments do not always target passages of the posts, and (iii) comments are not equally useful for identifying important passages. We have developed a system for selecting commented passages that takes blog posts and their comments as input and delivers, for each post, the sentences of the post that are the most commented and/or the most discussed. Our approach combines three steps to identify commented passages of a post. The first step reduces the complexity of processing the contents of posts and comments using heuristics adapted to the language of the blog. The second step finds useful comments and assigns them a degree of relevance using a model automatically built and validated by an expert. The third step identifies important passages using the relevant comments. We conducted two experiments to evaluate the usefulness and the effectiveness of our approach. The first study shows that in only 50% of the posts does the most commented sentence elicited by our approach correspond to the post extract generated using generic summarization. In the second study, human participants confirmed that, in practice, selected passages are frequently commented passages.
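
The three steps described in the abstract can be illustrated with a small sketch. This is not the system from the paper: the abstract does not specify the heuristics, the relevance model, or the scoring function, so everything below is an assumption made for illustration. Here `tokenize` stands in for the language-specific cleanup, the `relevance` weights stand in for the output of the comment relevance model, and bag-of-words cosine similarity is a swapped-in way to link comments to sentences. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Step 1 stand-in: lowercase and keep alphabetic tokens only.

    A real system would use heuristics adapted to the blog's language.
    """
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_commented_sentences(post, comments, relevance=None):
    """Return (score, sentence) pairs, most commented sentence first.

    `relevance` is an assumed per-comment weight (step 2 stand-in);
    it defaults to 1.0 for every comment.
    """
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", post) if s.strip()]
    if relevance is None:
        relevance = [1.0] * len(comments)
    comment_bags = [Counter(tokenize(c)) for c in comments]

    scored = []
    for sentence in sentences:
        bag = Counter(tokenize(sentence))
        # Step 3 stand-in: relevance-weighted similarity summed over comments.
        score = sum(w * cosine(bag, cb) for w, cb in zip(relevance, comment_bags))
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

For example, calling `rank_commented_sentences` on a post with the sentences "The new camera is excellent.", "Battery life is disappointing.", and "Shipping took two weeks.", together with comments that mostly discuss the battery, ranks the battery sentence first. The sketch does no stop-word removal, so common words such as "the" and "is" still contribute to the score.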