Invited paper: Dynamic visualization of statistical learning in the context of high-dimensional textual data

  • Authors:
  • Michael Greenacre; Trevor Hastie

  • Affiliations:
  • Universitat Pompeu Fabra, Barcelona 08005, Spain; Stanford University, Stanford, CA 94305-4065, USA

  • Venue:
  • Web Semantics: Science, Services and Agents on the World Wide Web
  • Year:
  • 2010

Abstract

Our ability to record ever larger and more complex sets of data is accompanied by a decline in our capacity to interpret and understand these data in the fullest sense. Multivariate analysis partially assists us in our quest by reducing the dimensionality in optimal ways, but our view is stuck in two dimensions because of the planar nature of the graphical medium, be it the printed page or the computer screen. We are developing protocols and tools to add motion to scientific graphics so that high-dimensional data can be visualized dynamically. Using the freely available R language and modern methods of statistical learning and data mining, we construct animation sequences that take the viewer on a dynamic journey through the data. The idea is illustrated using a large data set comprising all the abstracts of the journal Vaccine in the years 2003-2006, described by their word frequencies and citation counts.
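
The idea of an animation sequence that moves between planar views can be illustrated with a minimal sketch. It is written in Python rather than the R used by the authors, and it is not their method: it simply assumes each document has already been reduced to three coordinates by some dimension-reduction step, and it rotates the view about the first axis so that the vertical screen axis moves smoothly from dimension 2 to dimension 3. The function name `rotation_frames`, its parameters, and the toy coordinates are all hypothetical.

```python
import math


def rotation_frames(points, n_frames=5):
    """Return a list of frames; each frame is a list of (x, y) screen coordinates.

    points: list of (d1, d2, d3) coordinates, assumed to come from an
    earlier dimension-reduction step (hypothetical input).
    The view rotates about the first axis, so the vertical screen axis
    moves from dimension 2 (first frame) to dimension 3 (last frame).
    """
    if n_frames < 2:
        raise ValueError("need at least two frames")
    frames = []
    for k in range(n_frames):
        # Rotation angle grows linearly from 0 to 90 degrees.
        theta = (math.pi / 2) * k / (n_frames - 1)
        c, s = math.cos(theta), math.sin(theta)
        # Horizontal axis stays on dimension 1; vertical axis blends dims 2 and 3.
        frames.append([(d1, c * d2 + s * d3) for d1, d2, d3 in points])
    return frames


# Hypothetical toy coordinates for three documents (not data from the paper).
docs = [(1.0, 0.5, -0.2), (-0.3, 0.8, 0.4), (0.2, -0.6, 0.9)]
frames = rotation_frames(docs, n_frames=5)
```

Each frame could then be drawn as a scatterplot and the frames shown in sequence. Because consecutive frames differ by a small rotation, the viewer can follow individual points as the third dimension comes into view.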