Feature words that classify problem sentence in scientific article

  • Authors:
  • Toshihiko Sakai;Sachio Hirokawa

  • Affiliations:
  • Kyushu University, Fukuoka, Japan;Kyushu University, Fukuoka, Japan

  • Venue:
  • Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Literature review requires understanding the contents from several view points, such as the problem and the method that the articles describe. Search from these viewpoints will improve the efficiency of survey, if particular segments of articles were extracted, indexed and can be used as auxiliary query. This paper focuses on sentences that describe the problem in an abstract and the feature sets that classify such problem sentences. Classification performance are evaluated by 10-fold cross-validation for six candidate sets of feature words. It turned out that the set of all words gains the best performance if 90% of the data are used as training data. However, the set of a small number of words with positive scores outperforms other feature sets, if the training data is only 10%. In such a realistic situation, the feature words are effective in improving classification performance.