Relationship extraction methods based on co-occurrence in web pages and files

  • Authors:
  • Qiang Song; Yousuke Watanabe; Haruo Yokota

  • Affiliations:
  • Tokyo Institute of Technology, Ookayama, Meguro-ku, Tokyo, Japan (all three authors)

  • Venue:
  • Proceedings of the 13th International Conference on Information Integration and Web-based Applications and Services
  • Year:
  • 2011

Quantified Score

Hi-index 0.01

Abstract

Information on the Web grows richer every day. Web access is now useful in many aspects of daily life, particularly when writing documents and programs, and it has become common to edit files while referring to information on the Web. During the file-editing process, we usually visit so many Web pages that we cannot remember all of the relevant ones. Later, if we want to revisit the same Web pages to modify some part of a file, it can be very hard to track down the pages originally referred to. In this paper, we propose methods for finding relationships between files and Web pages based on the co-occurrence of data in Web-access logs and file-access logs. These relationships are useful for revisiting Web pages related to target files. To analyze co-occurrence in these two types of access logs, we consider two approaches for merging the logs, which involve a trade-off between accuracy and execution time. We call them the Pre-Merge and Post-Merge methods, and we have evaluated both using actual access logs.
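
To make the idea of co-occurrence between the two logs concrete, here is a minimal sketch in Python. It is not the Pre-Merge or Post-Merge method, which the abstract does not specify. It assumes that a file access and a Web access "co-occur" when their timestamps fall within a fixed time window. The log records, the 60-second window, and the function names `cooccurrence_counts` and `related_pages` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical log records: (timestamp in seconds, identifier).
# Real file-access and Web-access logs would need parsing first.
file_log = [(0, "report.tex"), (40, "report.tex"), (500, "main.c")]
web_log = [
    (10, "https://example.org/latex-tables"),
    (35, "https://example.org/latex-tables"),
    (490, "https://example.org/c-pointers"),
    (900, "https://example.org/news"),
]


def cooccurrence_counts(file_log, web_log, window=60):
    """Count (file, url) pairs whose accesses lie within `window` seconds.

    Assumption: a fixed time window defines co-occurrence; the paper
    may define it differently.
    """
    counts = Counter()
    for f_time, f_name in file_log:
        for w_time, url in web_log:
            if abs(f_time - w_time) <= window:
                counts[(f_name, url)] += 1
    return counts


def related_pages(counts, file_name):
    """Rank URLs for one file by descending co-occurrence count."""
    ranked = [(url, n) for (f, url), n in counts.items() if f == file_name]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


counts = cooccurrence_counts(file_log, web_log)
print(related_pages(counts, "report.tex"))
```

With this sample data, `report.tex` is related only to the LaTeX page, and the news page visited much later is related to no file. The nested loop compares every file access with every Web access, which is simple but quadratic in the log sizes. How and when the two logs are merged is exactly where a trade-off between accuracy and execution time, such as the one the abstract mentions, can arise.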