Sharing, finding and reusing end-user code for reformatting and validating data

  • Authors: Christopher Scaffidi
  • Affiliations: School of Electrical Engineering and Computer Science, Oregon State University, 1148 Kelley Engineering Center, Corvallis, OR 97331-4501, USA
  • Venue: Journal of Visual Languages and Computing
  • Year: 2010

Abstract

To help users automatically reformat and validate spreadsheets and other datasets, prior work introduced a user-extensible data model called "topes" and a supporting visual programming language. However, no support has existed to date for users to exchange and reuse topes. This functional gap results in wasteful duplication of work, as users implement topes that other people have already created. This paper presents a design for a new repository system that supports sharing and finding topes for reuse. The repository tightly integrates traditional keyword-based search with two additional search methods whose usefulness in repositories of end-user code has gone unexplored to date. The first method is "search-by-match", where a user specifies examples of data and the repository retrieves topes that can reformat and validate that data. The second method is collaborative filtering, which has played a vital role in repositories of non-code artifacts. The repository's search functionality was tested empirically on a prototype implementation by simulating queries generated from real user spreadsheets. This experiment reveals that search-by-match and collaborative filtering greatly improve the accuracy of search over the traditional keyword-based approach, reaching a recall as high as 95%. These results show that search-by-match and collaborative filtering are useful approaches for helping users to publish, find, and reuse visual programs similar to topes.
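
The search-by-match idea described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: regular expressions stand in for topes (which the abstract describes as a richer, user-extensible data model), and the names `Tope`, `REPOSITORY`, `search_by_match`, and `min_fraction` are hypothetical, introduced only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Tope:
    """Hypothetical stand-in for a tope: a named validator for one data category."""
    name: str
    pattern: re.Pattern  # assumption: a regex approximates the tope's validation rule

    def accepts(self, value: str) -> bool:
        # A value is accepted only if the whole (trimmed) string matches.
        return self.pattern.fullmatch(value.strip()) is not None


# Toy repository contents, invented for this example.
REPOSITORY = [
    Tope("us_phone", re.compile(r"\(?\d{3}\)?[ -]?\d{3}-\d{4}")),
    Tope("iso_date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    Tope("us_zip", re.compile(r"\d{5}(-\d{4})?")),
]


def search_by_match(examples, repository=REPOSITORY, min_fraction=1.0):
    """Return names of topes accepting at least `min_fraction` of the examples.

    Results are ordered by the fraction of examples accepted (highest first),
    with ties broken alphabetically by name.
    """
    if not examples:
        return []
    ranked = []
    for tope in repository:
        hits = sum(tope.accepts(example) for example in examples)
        fraction = hits / len(examples)
        if fraction >= min_fraction:
            ranked.append((fraction, tope.name))
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in ranked]
```

For instance, `search_by_match(["(541) 555-0100", "541-555-0199"])` queries the toy repository with two example phone numbers and retrieves only the validator that accepts both. Lowering `min_fraction` tolerates a few invalid cells among the examples, which real spreadsheet columns may contain.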