Overview and semantic issues of text mining

  • Authors: Anna Stavrianou; Periklis Andritsos; Nicolas Nicoloyannis
  • Affiliations: Université Lumière Lyon2, France; University of Trento, Italy; Université Lumière Lyon2, France
  • Venue: ACM SIGMOD Record
  • Year: 2007

Abstract

Text mining refers to the discovery of previously unknown knowledge in text collections. In recent years, the field has received great attention due to the abundance of textual data. Researchers in this area must cope with issues that arise from the particularities of natural language. This survey discusses such semantic issues along with the approaches and methodologies proposed in the existing literature. It covers syntactic matters and tokenization concerns, and it focuses on the text representation techniques, categorisation tasks and similarity measures that have been suggested.
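
To make three of the topics named in the abstract concrete (tokenization, text representation and similarity measures), here is a minimal Python sketch. It is not taken from the survey. It assumes a simple regular-expression tokenizer, a raw term-frequency bag-of-words representation and cosine similarity as one common choice of measure. The function names `tokenize`, `bag_of_words` and `cosine_similarity` are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens.

    Assumption: digits and punctuation are simply discarded; real
    tokenizers must handle many more natural-language particularities.
    """
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Represent a text as a mapping from token to raw term frequency."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only tokens present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    doc1 = bag_of_words("Text mining discovers knowledge in text collections")
    doc2 = bag_of_words("Data mining discovers patterns in data collections")
    print(cosine_similarity(doc1, doc2))
```

A bag-of-words representation ignores word order and meaning, so two texts that use different words for the same concept score as dissimilar. This is one example of the kind of semantic issue the survey addresses.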