Unsupervised extraction of text segments from heterogeneous document collections

  • Authors: Hong Cui
  • Affiliations: University of Arizona, Tucson, AZ
  • Venue: Proceedings of the 73rd ASIS&T Annual Meeting on Navigating Streams in an Information Ecosystem - Volume 47
  • Year: 2010

Abstract

This paper describes a simple, unsupervised bootstrapping procedure that identifies morphological description segments in heterogeneous biodiversity document collections. Although the procedure is used in our project to preprocess biodiversity literature for semantic annotation of morphological descriptions, it can also be used to crawl the Web for morphological descriptions to supply a niche biodiversity search engine.