Relationship extraction methods based on co-occurrence in web pages and files

Authors:
Qiang Song; Yousuke Watanabe; Haruo Yokota
Affiliations:
Tokyo Institute of Technology, Ookayama, Meguro-ku, Tokyo, Japan; Tokyo Institute of Technology, Ookayama, Meguro-ku, Tokyo, Japan; Tokyo Institute of Technology, Ookayama, Meguro-ku, Tokyo, Japan
Venue:
Proceedings of the 13th International Conference on Information Integration and Web-based Applications and Services
Year:
2011


Abstract

Information on the Web grows richer every day, and Web access is now useful in many aspects of daily life, particularly for writing documents and programs. It has become common to edit files while referring to information on the Web. During file editing, we often visit so many Web pages that we cannot remember all of the relevant ones. Later, when we want to revisit those pages to modify some part of a file, it can be very hard to track down the pages originally consulted. In this paper, we propose methods for finding relationships between files and Web pages based on the co-occurrence of data in Web-access logs and file-access logs. These relationships are useful for revisiting the Web pages related to a target file. Analyzing co-occurrence across these two types of access logs requires merging them, and the two approaches to merging involve a trade-off between accuracy and execution time. We call them the Pre-Merge and Post-Merge methods, and we have evaluated both using actual access logs.
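
As a rough illustration of the general idea of co-occurrence between the two kinds of logs, the following minimal sketch counts how often a file access and a Web-page access fall close together in time. It is not the Pre-Merge or Post-Merge method, whose details the abstract does not give. The log format (timestamp in seconds plus an identifier), the fixed time window, the function names, and the sample entries are all assumptions made only for this example.

```python
from collections import Counter


def cooccurrence_counts(web_log, file_log, window_seconds=300):
    """Count (file, url) pairs whose accesses lie within the same time window.

    web_log and file_log are lists of (timestamp_seconds, identifier) tuples.
    The window size is an assumed parameter, not a value from the paper.
    """
    counts = Counter()
    for file_time, path in file_log:
        for web_time, url in web_log:
            if abs(file_time - web_time) <= window_seconds:
                counts[(path, url)] += 1
    return counts


def related_pages(counts, path, top_k=3):
    """Return up to top_k (url, count) pairs for one file, most frequent first."""
    ranked = sorted(
        ((url, n) for (p, url), n in counts.items() if p == path),
        key=lambda item: (-item[1], item[0]),
    )
    return ranked[:top_k]


# Hypothetical log entries, invented for illustration only.
web_log = [
    (0, "https://example.com/a"),
    (100, "https://example.com/b"),
    (1000, "https://example.com/a"),
]
file_log = [
    (50, "report.tex"),
    (1010, "report.tex"),
    (5000, "notes.txt"),
]

counts = cooccurrence_counts(web_log, file_log)
print(related_pages(counts, "report.tex"))
```

In this sketch a page that is repeatedly open while a file is being edited ranks higher for that file, which is the kind of relationship that would help a user return to the pages consulted earlier. The nested loop compares every file access with every Web access, so it is only suitable for small logs.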