Curious machines: active learning with structured instances

  • Authors:
  • Mark Craven; Burr Settles

  • Affiliations:
  • The University of Wisconsin - Madison; The University of Wisconsin - Madison

  • Venue:
  • PhD thesis, The University of Wisconsin - Madison
  • Year:
  • 2008

Abstract

Supervised machine learning is a branch of artificial intelligence concerned with automatically inducing predictive models from labeled data. Such learning approaches are useful for many interesting real-world applications, and they particularly shine for tasks involving the automatic organization, extraction, and retrieval of information from large collections of data (e.g., text, images, and other digital media). In traditional supervised learning, one uses "labeled" training data to induce a model. However, labeled instances for real-world applications are often difficult, expensive, or time-consuming to obtain. Consider a complex task such as extracting key person and organization names from text documents. While gathering large numbers of unlabeled documents for such tasks is often relatively easy (e.g., from the World Wide Web), labeling these texts usually requires experienced human annotators with specific domain knowledge and training. There are implicit costs associated with obtaining these labels from domain experts, such as limited time and financial resources. This is especially true for applications that involve learning from instances with complex structures, which can require labels at varying levels of granularity.

Active learning addresses this inherent bottleneck by allowing the learner to selectively choose which parts of the available data are labeled for training. The goal is to maximize the accuracy of the learner through such "queries," while minimizing the work required of human annotators.

In this thesis, I explore several important questions regarding active learning for these and similar tasks involving structured instances. What query strategies are available for these learning algorithms, and how do they compare? How might a learner pose queries at different levels of granularity, as with multiple-instance learning? Are there relationships between certain properties of a query and its difficulty for the annotator? If so, can these relationships be learned and exploited during active learning? The answers to these questions illustrate the utility and promise of active learning algorithms in complex real-world learning systems.
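
As a concrete illustration of the query-selection loop the abstract describes, here is a minimal sketch of pool-based active learning with uncertainty sampling, one common query strategy. This is not the method developed in the thesis; the dataset, model, number of rounds, and simulated oracle are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Small labeled seed set plus a large pool of unlabeled instances.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
labeled = list(range(10))            # indices whose labels we already have
unlabeled = list(range(10, len(X)))  # indices whose labels would cost annotator effort

model = LogisticRegression(max_iter=1000)

for _ in range(20):  # 20 simulated annotation rounds
    model.fit(X[labeled], y[labeled])

    # Uncertainty sampling: query the instance the current model is
    # least confident about (smallest maximum class probability).
    probs = model.predict_proba(X[unlabeled])
    least_confident = int(np.argmax(1.0 - probs.max(axis=1)))
    query = unlabeled[least_confident]

    # A human annotator would supply this label in a real system;
    # here the held-back ground truth y[query] acts as the oracle.
    labeled.append(query)
    unlabeled.remove(query)

print(f"Queried {len(labeled) - 10} labels; model trained on {len(labeled)} instances.")
```

The thesis studies richer settings, structured instances such as text sequences where queries may be posed at different levels of granularity, but the select-query-retrain loop above is the common skeleton that those query strategies build on.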