Pixel-based methods are emerging as a promising way to develop new interaction techniques on top of existing user interfaces. However, to maintain platform independence, these methods have intentionally ignored other available low-level information about GUI widgets, such as accessibility metadata. In this paper, we present a hybrid framework, PAX, which associates the visual representation of user interfaces (i.e., the pixels) with their internal hierarchical metadata (i.e., the content, role, and value). We identify challenges to building such a framework. We also develop and evaluate two new algorithms: one for detecting text at arbitrary places on the screen, and one for segmenting a text image into individual word blobs. Finally, we validate our framework in implementations of three applications. We enhance an existing pixel-based system, Sikuli Script, while preserving the readability of its script code. Further, we create two novel applications, Screen Search and Screen Copy, to demonstrate how PAX can be applied to the development of desktop-level interactive systems.
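The abstract does not describe how the word segmentation algorithm works, so the following is only a minimal sketch of one common baseline for the same task, not the PAX method. It assumes a binarized image of a single text line (1 for ink, 0 for background) and splits it into word blobs wherever a run of empty columns is at least `min_gap` pixels wide. The function name `segment_words` and the `min_gap` parameter are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def segment_words(line: List[List[int]], min_gap: int = 3) -> List[Tuple[int, int]]:
    """Split a binary text-line image into word blobs.

    `line` is a list of pixel rows (1 = ink, 0 = background).
    Returns (start, end) column ranges with `end` exclusive.
    Illustrative baseline only; `min_gap` is an assumed threshold.
    """
    if not line:
        return []

    width = len(line[0])
    # A column counts as ink if any pixel in that column is foreground.
    ink = [any(row[x] for row in line) for x in range(width)]

    words: List[Tuple[int, int]] = []
    start = None      # first column of the word being built
    last_ink = None   # most recent ink column seen

    for x, has_ink in enumerate(ink):
        if not has_ink:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            # The blank run since the last ink column is wide enough
            # to be treated as a space between words.
            words.append((start, last_ink + 1))
            start = x
        last_ink = x

    if start is not None:
        words.append((start, last_ink + 1))
    return words
```

For a one-row image built from the pattern `##.#....###..#` (where `#` is ink), calling `segment_words(image, min_gap=3)` yields two blobs, columns 0 to 4 and 8 to 14: the one- and two-column gaps are treated as spacing between characters, and only the four-column gap separates words. A fixed threshold is the simplest possible choice; screen-rendered text at small sizes would likely need a threshold derived from the font height or from the distribution of gap widths.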