Context-sensitive document ranking

Authors:
Li-Jun Chang;Jeffrey Xu Yu;Lu Qin
Affiliations:
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, China;Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, China;Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, China
Venue:
Journal of Computer Science and Technology
Year:
2010

Abstract

Ranking is a central research issue in IR-style keyword search over a set of documents. In this paper, we study a new keyword search problem, called context-sensitive document ranking, which is to rank documents with the help of a context that provides additional information about the application domain in which the documents are searched and ranked. The work is motivated by the fact that additional information associated with the documents can help users find more relevant documents when they are unable to find what they need from the documents alone. In this paper, a context is a multi-attribute graph, which can represent any information maintained in a relational database: multi-attribute nodes represent tuples, and edges represent primary-key and foreign-key references among nodes. Context-sensitive ranking raises several research issues: how to score documents, how to evaluate the additional information obtained from the context that may contribute to the document ranking, and how to rank the documents by combining the scores/costs from the documents and the context. More importantly, the relationships between the documents and the information stored in a relational database may be uncertain, because they come from different data sources and the relationships are determined systematically using similarity matching, which introduces uncertainty. We concentrate on these research issues and provide a solution for ranking documents in a context where there is uncertainty between the documents and the context. We confirm the effectiveness of our approaches through extensive experimental studies using real datasets, and present our findings in this paper.
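To make the setting concrete, the following is a minimal illustrative sketch and not the scoring model of the paper, which the abstract does not specify. It assumes a toy keyword-overlap score, a context score computed over a tuple and its direct neighbors, and a linear combination weighted by a hypothetical parameter `alpha`. All names (`ContextGraph`, `keyword_score`, `context_score`, `rank`) and the sample data are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContextGraph:
    """Multi-attribute graph: nodes stand for tuples, edges for key references."""
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: dict = field(default_factory=dict)  # node id -> set of neighbor ids

    def add_tuple(self, node_id, **attrs):
        self.nodes[node_id] = attrs
        self.edges.setdefault(node_id, set())

    def add_reference(self, a, b):
        # Primary-key/foreign-key reference, treated as undirected (assumption).
        self.edges[a].add(b)
        self.edges[b].add(a)


def keyword_score(text, keywords):
    """Toy relevance: fraction of query keywords that occur in the text."""
    words = set(text.lower().split())
    return sum(k.lower() in words for k in keywords) / len(keywords)


def context_score(graph, node_id, keywords):
    """Score a tuple together with its direct neighbors against the keywords."""
    ids = {node_id} | graph.edges[node_id]
    text = " ".join(str(v) for i in ids for v in graph.nodes[i].values())
    return keyword_score(text, keywords)


def rank(docs, links, graph, keywords, alpha=0.5):
    """Rank documents by document score plus probability-weighted context score.

    links maps a document id to a list of (node id, match probability) pairs;
    the probabilities for one document are assumed to sum to at most 1.
    """
    scored = []
    for doc_id, text in docs.items():
        ctx = sum(p * context_score(graph, n, keywords)
                  for n, p in links.get(doc_id, []))
        total = alpha * keyword_score(text, keywords) + (1 - alpha) * ctx
        scored.append((doc_id, total))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: two tuples joined by a key reference, two documents.
graph = ContextGraph()
graph.add_tuple("a1", name="Jeffrey Yu")
graph.add_tuple("p1", title="keyword search graphs")
graph.add_reference("a1", "p1")

docs = {"d1": "notes on keyword indexing", "d2": "talk by Jeffrey Yu"}
links = {"d2": [("a1", 0.8)]}  # d2 matches tuple a1 with probability 0.8

ranking = rank(docs, links, graph, ["keyword", "search"])
print(ranking)
```

In this toy run, document `d2` contains neither query keyword, yet it ranks above `d1` because its uncertain link to tuple `a1` reaches the paper tuple `p1`, whose title matches both keywords. This is the kind of effect the context is meant to provide, under the simplifying assumptions stated above.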