We describe the Threat Stream Generator, a method and a toolset for creating realistic, synthetic test data for information analytics applications. Finding or creating useful test data sets is difficult for a team focused on building solutions to information analysis problems. First, real data that would be suitable for testing analytic applications may not be available or may be classified. In the latter case, tool builders will not have the clearances needed to use, or even see, the data. Second, analysts' time is scarce, and it is difficult to obtain from them the characteristics of real data needed to build a test data set. Finally, generating good test data is itself challenging: commercial data generators focus on testing large databases, not on testing information analytics tools. Our distinctive contribution is that we embed known ground truth in a test data set, so that tool developers and others can determine how effective their software is and how much progress they are making in supporting information analysts. Our automated methods also significantly decrease data set development time. We review our approach to scenario development, threat insertion strategies, data set development, and data set evaluation. We also discuss our recent successes in using our data in open analytic competitions.
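The idea of embedding known ground truth can be illustrated with a minimal Python sketch. This is not the Threat Stream Generator's actual implementation, which the abstract does not describe in detail. The function names (`build_dataset`, `score`), the record fields, and the placeholder texts are all hypothetical. The sketch mixes a few "threat" records into synthetic background records, remembers which record IDs carry the threat, and then scores a tool's output against that known answer.

```python
import random


def build_dataset(n_background, threat_texts, seed=0):
    """Mix hypothetical threat records into synthetic background records.

    Returns (records, ground_truth_ids). Field names are illustrative only.
    """
    rng = random.Random(seed)
    # Background "noise": placeholder records with no threat content.
    items = [(f"routine report {i}", False) for i in range(n_background)]
    # Threat records carry the embedded ground truth.
    items += [(text, True) for text in threat_texts]
    # Shuffle so threat records are scattered through the data set.
    rng.shuffle(items)

    records, ground_truth_ids = [], set()
    for n, (text, is_threat) in enumerate(items):
        # IDs are assigned after shuffling so they reveal nothing.
        doc_id = f"doc-{n}"
        records.append({"id": doc_id, "text": text})
        if is_threat:
            ground_truth_ids.add(doc_id)
    return records, ground_truth_ids


def score(flagged_ids, ground_truth_ids):
    """Return (precision, recall) for the record IDs a tool flagged."""
    flagged = set(flagged_ids)
    hits = len(flagged & ground_truth_ids)
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / len(ground_truth_ids) if ground_truth_ids else 0.0
    return precision, recall


if __name__ == "__main__":
    records, truth = build_dataset(
        1000, ["suspicious purchase A", "unusual travel B"]
    )
    # A stand-in "analytic tool": flag any record mentioning "suspicious".
    flagged = [r["id"] for r in records if "suspicious" in r["text"]]
    print(score(flagged, truth))  # prints (1.0, 0.5)
```

Because the generator keeps the set of threat record IDs, a tool builder can measure precision and recall without needing access to real or classified data. A realistic generator would also have to make the background data and the inserted threat plausible, which this sketch does not attempt.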