Tightly coupling visual and linguistic features for enriching audio-based web browsing experience

Authors:
Muhammad Asiful Islam;Faisal Ahmed;Yevgen Borodin;I. V. Ramakrishnan
Affiliations:
Stony Brook University, Stony Brook, NY, USA;Stony Brook University, Stony Brook, NY, USA;Stony Brook University, Stony Brook, NY, USA;Stony Brook University, Stony Brook, NY, USA
Venue:
Proceedings of the 20th ACM international conference on Information and knowledge management
Year:
2011

Abstract

People who are blind use screen readers to browse web pages. Because screen readers read content out serially, a naive readout tends to mix relevant and irrelevant content, which disrupts the coherence of the material and confuses the listener. To address this problem, web pages can be partitioned into coherent segments, each of which is narrated separately. Existing segmentation methods rely on visual and structural cues without taking semantics into account, and consequently create segments that contain irrelevant material. In this paper, we describe a new technique for creating coherent segments by tightly coupling the visual, structural, and linguistic features present in the content. A notable aspect of the technique is that it produces segments with little irrelevant content. Preliminary experiments indicate that the technique is effective in creating highly coherent segments, and the experiences of an early adopter who is blind suggest that it enriches the overall browsing experience.
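
The abstract does not specify how the features are combined, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a page has already been reduced to an ordered list of text blocks with vertical pixel positions, and it greedily groups consecutive blocks using a weighted sum of a visual score (vertical proximity) and a linguistic score (bag-of-words cosine similarity). The `Block` type, the `segment` function, and the `max_gap`, `weight`, and `threshold` values are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical rendered text block: vertical extent in pixels plus its text."""
    top: float
    bottom: float
    text: str


def bag_of_words(text):
    """Lowercased word counts; a deliberately simple stand-in for linguistic features."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment(blocks, max_gap=40.0, weight=0.5, threshold=0.6):
    """Greedily group consecutive blocks whose combined score reaches the threshold.

    With weight=0.5 and threshold=0.6 (assumed values), neither visual
    proximity nor word overlap alone is enough to merge two blocks.
    """
    segments = []
    for block in blocks:
        if segments:
            current = segments[-1]
            gap = max(0.0, block.top - current[-1].bottom)
            visual = max(0.0, 1.0 - gap / max_gap)
            current_text = " ".join(b.text for b in current)
            linguistic = cosine(bag_of_words(current_text), bag_of_words(block.text))
            if weight * visual + (1.0 - weight) * linguistic >= threshold:
                current.append(block)
                continue
        segments.append([block])
    return segments


page = [
    Block(0, 20, "Storm closes schools across the county"),
    Block(24, 60, "Schools across the county will stay closed as the storm continues"),
    Block(64, 80, "Buy cheap flights today"),
    Block(200, 220, "Contact us about advertising"),
]
print([len(s) for s in segment(page)])  # prints [2, 1, 1]
```

In this toy page the advertisement sits directly below the news story, so a purely visual grouping would attach it to the story. Because it shares no vocabulary with the preceding text, the combined score stays below the threshold and it becomes its own segment.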