Web pages often contain clutter (such as ads, unnecessary animations, and extraneous links) around the body of an article, which distracts users from the actual content. This can be especially inconvenient for blind and visually impaired users. The W3C's Web Accessibility Initiative (WAI) has defined a set of guidelines to make web pages more compatible with tools built specifically for persons with disabilities. While this initiative has put forth an excellent set of principles, many websites unfortunately remain inaccessible as well as cluttered. To address the clutter problem, we have developed a framework that employs a host of heuristics, in the form of tunable filters, for content extraction. Our hypothesis is that automatically filtering out selected elements from web pages will leave the base content that users are interested in and, as a side effect, render those pages more accessible. Although our heuristics are intuition-based rather than derived from the W3C accessibility guidelines, we expected them to have little impact on web pages that fully comply with those guidelines. We were wrong: some (technically) accessible web pages still include significant clutter. This paper discusses our content extraction framework and its application to accessible web pages.
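To make the idea of tunable filters concrete, the following is a minimal sketch and not the framework described in this paper. It assumes well-nested markup and uses only Python's standard `html.parser`; the names `FilterSettings`, `ContentExtractor`, and `extract_content`, as well as the default set of dropped tags, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from html.parser import HTMLParser

# Void elements never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


@dataclass
class FilterSettings:
    """Hypothetical tunable settings; each field switches one heuristic."""
    drop_tags: frozenset = frozenset({"script", "style", "iframe", "object"})
    drop_links: bool = False  # when True, anchor text is removed as well


class ContentExtractor(HTMLParser):
    """Collects text that lies outside every filtered element."""

    def __init__(self, settings):
        super().__init__(convert_charrefs=True)
        self.settings = settings
        self.skip_depth = 0  # > 0 while inside a filtered element
        self.chunks = []

    def _is_dropped(self, tag):
        return tag in self.settings.drop_tags or (
            self.settings.drop_links and tag == "a")

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Once inside a filtered element, count every nested element too.
        if self.skip_depth or self._is_dropped(tag):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def extract_content(html, settings=None):
    """Return the text that survives the configured filters."""
    parser = ContentExtractor(settings or FilterSettings())
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = ('<div><script>var x = 1;</script><p>Article body.</p>'
        '<iframe src="ad"></iframe><a href="/x">More links</a></div>')
default_text = extract_content(page)
strict_text = extract_content(page, FilterSettings(drop_links=True))
```

With the default settings the link text is kept; enabling `drop_links` removes it as well, which illustrates how a single setting can trade recall of content against removal of clutter. A real extractor would also need to handle unbalanced markup, which this sketch does not.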