Basic techniques in text mining using open-source tools

  • Authors:
  • Jun Iio

  • Affiliations:
  • Chuo University, Higashinakano, Hachioji-shi, Tokyo, Japan

  • Venue:
  • Proceedings of the 9th International Symposium on Open Collaboration
  • Year:
  • 2013

Abstract

Many text mining tools are available, both commercial and non-commercial. However, elementary text-based analysis can be done with basic Unix commands, shell scripts, and small programs written in scripting languages, without resorting to such extensive software. This paper introduces basic techniques for text mining that combine a set of standard commands, small pieces of code, and generic tools provided as open-source software. The target of the analysis is a set of sixty-seven articles written by one author for a relay column since 1998. Several text-based analyses reveal how the author's interests shifted over a period of about fifteen years. In addition, at the end of this paper, the results of the text-based analysis are compared with those of a non-text-based analysis, and the efficiency of non-parametric analysis is discussed.
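As an illustration of the kind of elementary analysis the abstract refers to, the following is a minimal sketch of word-frequency counting in a small script. It is not the author's code: the function name `word_frequencies`, the sample sentence, and the assumption of whitespace-delimited alphabetic text are all hypothetical, and the paper's actual corpus may require different tokenization.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=5):
    """Return the top_n most frequent lowercase words as (word, count) pairs."""
    # Lowercase the text and keep runs of ASCII letters only; this is an
    # assumed, simplistic tokenizer chosen for illustration.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(top_n)


# Hypothetical sample text, not taken from the analysed articles.
sample = "Open source tools help text mining. Text mining with open tools is simple."
print(word_frequencies(sample, top_n=3))
```

The same count could plausibly be produced by a short pipeline of standard Unix commands; a script is used here only to keep the sketch self-contained.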