Detection of layout-purpose TABLE tags based on machine learning

Authors:
Hidehiko Okada; Taiki Miura
Affiliations:
Kyoto Sangyo University, Kyoto, Japan; Kyoto Sangyo University, Kyoto, Japan
Venue:
UAHCI'07 Proceedings of the 4th international conference on Universal access in human-computer interaction: applications and services
Year:
2007

Abstract

To make webpages more accessible to people with disabilities, <table> tags should not be used to lay out document content. Evaluating the accessibility of a webpage therefore involves checking whether it includes layout-purpose <table> tags. Precise automated detection of layout-purpose <table> tags in HTML sources remains a research challenge, because it requires more than simply checking whether specific tags or tag attributes appear in the source. We propose a detection method based on machine learning. The method derives a <table> tag classifier that deduces whether a given <table> tag is layout-purpose or table-purpose. We have developed a system that derives classification rules with ID3: it builds a decision tree from a set of learning data (<table> tags whose purposes are known) and uses the tree to classify <table> tags in the webpages under evaluation. Classification accuracy was evaluated by cross-validation on 200 test samples collected from the Web. The evaluation revealed that 1) the tags can be roughly classified using the value of the border attribute, the number of rows, the number of tags that appear ahead of the <table> tag, and the nesting of <table> tags (i.e., these attributes are more likely to appear in the upper layers of the decision trees), and 2) the accuracy is about 90% on the 200 test samples.
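
The abstract names ID3 as the rule-derivation algorithm but gives no implementation details, so the following is only a minimal sketch of how ID3 could build a decision tree over <table> tag features. The feature names (`border`, `rows`, `nested`), their discretization, the class labels, and the six training rows are all hypothetical values made up for illustration; they are not the authors' feature set or data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(row[feature] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def id3(rows, labels, features):
    """Return a class label (leaf) or a dict node splitting on the best feature."""
    if len(set(labels)) == 1:
        return labels[0]
    majority = Counter(labels).most_common(1)[0][0]
    if not features:
        return majority
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    node = {"feature": best, "default": majority, "children": {}}
    remaining = [f for f in features if f != best]
    for value in set(row[best] for row in rows):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["children"][value] = id3([r for r, _ in pairs],
                                      [l for _, l in pairs],
                                      remaining)
    return node


def classify(tree, row):
    """Walk the tree; fall back to the node's majority class on unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree


# Hypothetical, hand-made learning data: NOT taken from the paper.
TRAIN = [
    ({"border": True, "rows": "many", "nested": False}, "table"),
    ({"border": True, "rows": "few", "nested": False}, "table"),
    ({"border": True, "rows": "many", "nested": True}, "table"),
    ({"border": False, "rows": "few", "nested": True}, "layout"),
    ({"border": False, "rows": "many", "nested": True}, "layout"),
    ({"border": False, "rows": "few", "nested": False}, "layout"),
]

rows = [features for features, _ in TRAIN]
labels = [label for _, label in TRAIN]
tree = id3(rows, labels, ["border", "rows", "nested"])
prediction = classify(tree, {"border": False, "rows": "many", "nested": False})
```

In this toy data the `border` feature separates the two classes on its own, so ID3 places it at the root. That only loosely mirrors the abstract's remark that some attributes tend to appear in the upper layers of the trees; real data would need more features and deeper trees.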