Many enterprise websites provide search engines to facilitate customer access to their underlying documents or data. Through the web interface of such a search engine, a customer specifies one or a few keywords of interest, and the search engine returns a list of documents or tuples matching those keywords, sorted by an often-proprietary scoring function. It was traditionally believed that, because of its highly restrictive interface (i.e., keyword search only, no SQL-style queries), such a search engine serves its purpose of answering individual keyword-search queries without disclosing big-picture aggregates over the data; as we show in this paper, disclosure of such aggregates may raise significant privacy concerns for the enterprise. Recent work on sampling and aggregate estimation over a search engine's corpus through its keyword-search interface, however, contradicts this traditional belief. In this paper, we consider the novel problem of suppressing sensitive aggregates for enterprise search engines while maintaining the quality of answers provided to individual keyword-search queries. We demonstrate the effectiveness and efficiency of our techniques through theoretical analysis and extensive experimental studies.