On-line preferential nearest neighbor browsing in large attributed graphs

Authors:
Jiefeng Cheng; Jeffrey Xu Yu; Reynold C. K. Cheng
Affiliations:
University of Hong Kong, China; The Chinese University of Hong Kong, China; University of Hong Kong, China
Venue:
DASFAA'10 Proceedings of the 15th international conference on Database systems for advanced applications
Year:
2010

Abstract

Given a large weighted directed graph whose nodes are associated with attributes, we study a new problem called preferential nearest neighbor (NN) browsing. In such browsing, a user provides one or more source nodes and some keywords, and retrieves the nearest neighbors of those source nodes that contain the given keywords. For example, a tourist who plans to visit several places (source nodes) may want to search for hotels with some preferred features (e.g., Internet access and a swimming pool). It is highly desirable to recommend a list of nearby hotels with those features, ordered by road network distance to the places the tourist wants to visit. The existing approach based on graph traversal at query time requires long query processing time, while the approach that maintains pre-computed all-pairs shortest distances requires huge storage space on disk. In this paper, we propose new approaches to support on-line preferential NN browsing. The data graphs we deal with are weighted directed graphs whose nodes are associated with attributes, and the distances to be found between nodes are the exact distances in the graph. We focus on two-step approaches. In the first step, we identify a number of reference nodes (also called centers) that lie along some shortest paths between a source node and a preferential NN node that contains the user-given keywords. In the second step, we find the preferential NN nodes within a certain distance of the source nodes via the relevant reference nodes, using an index that supports both textual (attribute) search and distance search. Our approach tightly integrates NN search with preference search, and is confirmed to be efficient and effective in finding any preferential NN nodes.
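
To make the problem statement concrete, the following is a minimal sketch of the traversal-based baseline that the abstract contrasts against, not the two-step index approach the paper proposes. It runs a multi-source Dijkstra search and reports, in nondecreasing distance order, the nodes whose attributes contain every keyword. The function name `preferential_nn`, the adjacency-list layout, and the choice to rank a node by its distance to the closest source node are all assumptions made for illustration; the abstract does not specify how distances to several sources are combined.

```python
import heapq
from typing import Dict, Iterable, Iterator, List, Set, Tuple

# Assumed layout: node -> list of (successor, non-negative edge weight).
Graph = Dict[str, List[Tuple[str, float]]]


def preferential_nn(
    graph: Graph,
    attrs: Dict[str, Set[str]],
    sources: Iterable[str],
    keywords: Set[str],
) -> Iterator[Tuple[str, float]]:
    """Yield (node, distance) pairs in nondecreasing distance order.

    A node qualifies when its attribute set contains every keyword.
    Distance is measured from the closest source (an assumption).
    """
    dist = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in dist]
    heapq.heapify(heap)
    settled: Set[str] = set()

    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry, a shorter path was already settled
        settled.add(u)

        # Report the node if it carries all requested keywords.
        if keywords <= attrs.get(u, set()):
            yield u, d

        # Relax outgoing edges (the graph is directed).
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))


if __name__ == "__main__":
    # Hypothetical toy data: two places to visit and three hotels.
    graph = {
        "museum": [("h1", 2.0), ("h2", 5.0)],
        "park": [("h2", 1.0), ("h3", 4.0)],
        "h1": [("h3", 1.0)],
    }
    attrs = {
        "h1": {"internet"},
        "h2": {"internet", "pool"},
        "h3": {"internet", "pool"},
    }
    for node, d in preferential_nn(graph, attrs, ["museum", "park"], {"internet", "pool"}):
        print(node, d)
```

Because the generator yields results lazily, a caller can stop after the first few neighbors, which matches the browsing setting. The cost of this baseline still grows with the portion of the graph explored before enough matching nodes are found, which is the query-time overhead the abstract attributes to traversal-based processing.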