Text mining agent for net auction

  • Authors:
  • Yukitaka Kusumura;Yoshinori Hijikata;Shogo Nishida

  • Affiliations:
  • Graduate School of Engineering Science Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka, Japan;Graduate School of Engineering Science Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka, Japan;Graduate School of Engineering Science Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka, Japan

  • Venue:
  • Proceedings of the 2004 ACM symposium on Applied computing
  • Year:
  • 2004

Quantified Score

Hi-index 0.01

Abstract

Net auctions have become widely used with the recent growth of the Internet. However, they list so many items that bidders struggle to select the most suitable one. We aim to support bidders in net auctions by automatically generating a table that contains the features of several items for comparison. We construct a system called NTM-Agent (Net auction Text Mining Agent). The system collects the Web pages of items, extracts the items' features from those pages, and then generates a table containing the extracted features. This research focuses on two problems in the process. The first problem is that when the system collects items automatically, the results include items that differ from the user's target. The second problem is that the descriptions in net auctions are not uniform: they appear in different formats such as sentences, itemized lists, and tables, and the subjects of some sentences are omitted. It is therefore difficult to extract information from the descriptions with conventional information extraction methods. This research proposes methods to solve both problems. For the first problem, NTM-Agent filters the items using correlation rules about the keywords in the titles and the item descriptions. These rules are created semi-automatically with a support tool. For the second problem, NTM-Agent extracts the information by distinguishing the formats. It also learns feature values from plain examples for future extraction.
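
The abstract does not give the form of the filtering rules or the extraction procedure, so the following is only a rough sketch of the two ideas, not the NTM-Agent implementation. It assumes a hypothetical rule shape (a title keyword paired with a description keyword that marks an off-target item) and a crude format guess based on line prefixes and separators. All names here (`Item`, `KeywordRule`, `filter_items`, `detect_format`, `extract_features`) are illustrative assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    description: str


@dataclass(frozen=True)
class KeywordRule:
    # Assumed rule form (not from the paper): when `title_keyword` occurs in
    # the title and `description_keyword` occurs in the description, the item
    # is treated as off-target.
    title_keyword: str
    description_keyword: str


def filter_items(items, rules):
    """Keep only the items that no rule marks as off-target."""
    def off_target(item):
        title = item.title.lower()
        desc = item.description.lower()
        return any(
            r.title_keyword in title and r.description_keyword in desc
            for r in rules
        )
    return [item for item in items if not off_target(item)]


def detect_format(description):
    """Guess the description format: 'table', 'list', or 'sentences'."""
    lines = [ln.strip() for ln in description.splitlines() if ln.strip()]
    if any("|" in ln for ln in lines):
        return "table"
    if any(re.match(r"[-*•]", ln) for ln in lines):
        return "list"
    return "sentences"


def extract_features(description):
    """Pull key/value pairs out of list or table lines.

    Sentence handling (for example, restoring omitted subjects) is left out
    of this sketch, so plain sentences yield an empty dict.
    """
    fmt = detect_format(description)
    if fmt == "sentences":
        return {}
    sep = "|" if fmt == "table" else ":"
    features = {}
    for ln in description.splitlines():
        ln = ln.strip().lstrip("-*• ").strip()
        if sep in ln:
            key, value = ln.split(sep, 1)
            features[key.strip().lower()] = value.strip()
    return features
```

Under these assumptions, a comparison table could be built by calling `extract_features` on each item that survives `filter_items` and aligning the resulting dictionaries by key.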