On-line preferential nearest neighbor browsing in large attributed graphs

  • Authors:
  • Jiefeng Cheng;Jeffrey Xu Yu;Reynold C. K. Cheng

  • Affiliations:
  • University of Hong Kong, China;The Chinese University of Hong Kong, China;University of Hong Kong, China

  • Venue:
  • DASFAA'10 Proceedings of the 15th international conference on Database systems for advanced applications
  • Year:
  • 2010

Abstract

Given a large directed graph whose nodes are associated with attributes and whose edges are weighted, we study a new problem called preferential nearest neighbor (NN) browsing. In such browsing, a user provides one or more source nodes and some keywords to retrieve the nearest neighbors of those source nodes that contain the given keywords. For example, a tourist who plans to visit several places (source nodes) may want to search for hotels with some preferred features (e.g., Internet access and swimming pools). It is highly desirable to recommend a list of nearby hotels with those preferred features, ordered by road network distance to the places (source nodes) the tourist wants to visit. The existing approach based on graph traversal at query time requires long query processing time, and the approach based on maintaining pre-computed all-pairs shortest distances requires huge storage space on disk. In this paper, we propose new approaches to support on-line preferential NN browsing. The data graphs we deal with are weighted directed graphs whose nodes are associated with attributes, and the distances between nodes to be found are the exact distances in the graph. We focus on two-step approaches. In the first step, we identify a number of reference nodes (also called centers) that lie along some shortest paths between a source node and a preferential NN node that contains the user-given keywords. In the second step, we find the preferential NN nodes within a certain distance of the source nodes via the relevant reference nodes, using an index that supports both textual (attribute) and distance information. Our approach tightly integrates NN search with the preference search, and is confirmed to be efficient and effective in finding preferential NN nodes.
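
To make the problem statement concrete, the following is a minimal sketch of the baseline the abstract contrasts against (graph traversal at query time), not the two-step reference-node approach the paper proposes. It runs Dijkstra's algorithm from the source nodes and reports the first k nodes whose attributes contain all the keywords. The function name `preferential_nn`, the adjacency-list layout, and the choice to rank a node by its distance to the closest source are all assumptions made for illustration; the abstract does not specify them.

```python
import heapq


def preferential_nn(graph, attrs, sources, keywords, k):
    """Baseline sketch: return up to k (node, distance) pairs, nearest first.

    graph    -- dict: node -> list of (neighbor, edge_weight), directed
    attrs    -- dict: node -> set of attribute keywords
    sources  -- iterable of source nodes
    keywords -- iterable of keywords a result node must all contain
    Assumption: a node's distance is its distance from the closest source.
    """
    wanted = set(keywords)
    dist = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in dist]
    heapq.heapify(heap)
    settled = set()
    results = []

    while heap and len(results) < k:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry
        settled.add(u)
        # Nodes are settled in non-decreasing distance order, so matches
        # are appended in exact nearest-first order.
        if wanted <= attrs.get(u, set()):
            results.append((u, d))
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return results


# Hypothetical toy road network, following the hotel example.
graph = {
    "museum": [("hotel_a", 2.0), ("hotel_b", 1.0)],
    "hotel_b": [("hotel_c", 1.5)],
}
attrs = {
    "hotel_a": {"internet", "pool"},
    "hotel_b": {"internet"},
    "hotel_c": {"internet", "pool"},
}
print(preferential_nn(graph, attrs, ["museum"], ["internet", "pool"], 2))
```

On a large graph this traversal may visit many non-matching nodes before finding k matches, which is the query-time cost the abstract refers to.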