A Framework for Effective Annotation of Information from Closed Captions Using Ontologies

  • Authors:
  • Latifur Khan; Dennis McLeod; Eduard Hovy

  • Affiliations:
  • Department of Computer Science, University of Texas at Dallas, Richardson 75083-0688; Department of Computer Science, University of Southern California, Los Angeles 90088; Information Sciences Institute, University of Southern California, Marina del Rey 90292

  • Venue:
  • Journal of Intelligent Information Systems
  • Year:
  • 2005

Abstract

To improve the accuracy of an audio information retrieval system in terms of precision and recall, we have created a domain-specific ontology (a collection of key concepts and their interrelationships) as well as a novel pruning algorithm. Given the shortcomings of keyword-based techniques, we have opted for a concept-based technique that uses this ontology. Achieving both high precision and high recall is the key problem in the retrieval of audio information. In traditional approaches, high recall is typically achieved at the expense of precision, and vice versa. Through the use of a domain-specific ontology, appropriate concepts can be identified during metadata generation (description of audio) or query generation, thus improving precision. When irrelevant concepts are associated with queries or documents, precision is lost. Conversely, if relevant concepts are discarded, recall is lost. In conjunction with the domain-specific ontology, we therefore propose a novel, automatic pruning algorithm that prunes as many irrelevant concepts as possible during the description and identification of documents and during query generation. To improve recall, we propose a controlled and correct query expansion mechanism that guarantees precision will not be lost. We have constructed a demonstration prototype and have shown, experimentally and analytically, that our model achieves significantly higher precision and recall than keyword search.
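
The trade-off described in the abstract can be made concrete with a toy example. The sketch below is illustrative only and is not the authors' algorithm: the ontology, the clip annotations, and the helper names (`expand`, `retrieve`, `precision_recall`) are all hypothetical, and expansion is assumed to mean "add the more specific sub-concepts of the query concept". It compares an exact concept match with an ontology-expanded query on the same small collection.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology: concept -> more specific sub-concepts.
ONTOLOGY: Dict[str, List[str]] = {
    "sport": ["basketball", "football"],
    "basketball": ["nba"],
    "football": [],
    "nba": [],
}

# Hypothetical audio clips, each annotated with concepts (the metadata).
DOCS: Dict[str, Set[str]] = {
    "clip1": {"nba"},
    "clip2": {"basketball"},
    "clip3": {"football"},
    "clip4": {"politics"},
}


def expand(concept: str, ontology: Dict[str, List[str]]) -> Set[str]:
    """Return the concept together with all of its descendants."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(ontology.get(current, []))
    return seen


def retrieve(query_concepts: Set[str]) -> Set[str]:
    """Return the clips whose annotations share a concept with the query."""
    return {doc for doc, concepts in DOCS.items() if concepts & query_concepts}


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Assumed ground truth: both basketball-related clips are relevant.
relevant = {"clip1", "clip2"}

exact = precision_recall(retrieve({"basketball"}), relevant)
expanded = precision_recall(retrieve(expand("basketball", ONTOLOGY)), relevant)
print("exact match:   ", exact)
print("with expansion:", expanded)
```

In this toy setting, expanding only to sub-concepts retrieves the clip annotated with the narrower concept without pulling in unrelated clips, which is the intuition behind a controlled expansion; expanding upward to "sport" would also retrieve the football clip and lower precision.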