Automatic Analysis of Large Text Corpora - A Contribution to Structuring WEB Communities

  • Authors:
  • Gerhard Heyer; Uwe Quasthoff; Christian Wolff

  • Venue:
  • IICS '02 Proceedings of the Second International Workshop on Innovative Internet Computing Systems
  • Year:
  • 2002

Abstract

This paper describes a corpus-linguistic analysis of large text corpora based on collocations, with the aim of extracting semantic relations from unstructured text. We regard this approach as a viable method for generating and structuring information about WEB communities. Starting from a short description of our corpora and our language analysis tools, we discuss in depth the automatic generation of collocation sets. We then give examples of the different types of relations that may be found in collocation sets for arbitrary terms. We conclude with a brief discussion of applying our approach to the analysis of a sample community.