Privacy-enhanced string matching with wordwise positional sampling

  • Authors:
  • Sung-Hwan Kim; Dae-Geon Kwon; Hwan-Gue Cho

  • Affiliations:
  • Pusan National University, Busan, South Korea; Pusan National University, Busan, South Korea; Pusan National University, Busan, South Korea

  • Venue:
  • Proceedings of the 8th International Conference on Ubiquitous Information Management and Communication
  • Year:
  • 2014

Abstract

Data anonymization is an important task for protecting privacy in data mining and processing. With data being produced daily through web services and social networks, text anonymization has become an essential technique. In this paper, we present an anonymization method for privacy-enhanced string matching in natural language texts. Given a document composed of words and separators, our method samples the characters at particular positions in each word according to a given seed. String indexing and matching are performed on this positionally sampled text; the method therefore protects the original text from exposure while retaining the matching statistics of pattern strings. In addition, we define measures of seed performance in terms of data utility and privacy protection, and investigate which seeds perform better under these measures.
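
The idea of wordwise positional sampling can be illustrated with a minimal Python sketch. The abstract does not specify the seed format or the matching procedure, so everything here is an assumption made for illustration: the seed is taken to be a set of 0-based character positions, words are taken to be maximal runs of word characters, and the helper names `sample_word`, `sample_text`, and `count_matches` are hypothetical rather than the authors' own.

```python
import re


def sample_word(word, seed):
    """Keep only the characters of `word` at the 0-based positions in `seed`.

    Assumption: positions beyond the end of the word are simply skipped.
    """
    return "".join(word[i] for i in sorted(seed) if i < len(word))


def sample_text(text, seed):
    """Sample every word in `text`; separators are left unchanged.

    Assumption: a "word" is a maximal run of word characters (regex \\w+).
    """
    return re.sub(r"\w+", lambda m: sample_word(m.group(0), seed), text)


def count_matches(text, pattern):
    """Count occurrences of `pattern` in `text`, allowing overlaps."""
    if not pattern:
        return 0
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


# Hypothetical example: keep the 1st and 3rd character of each word.
seed = {0, 2}
document = "the cat sat on the mat"

sampled_document = sample_text(document, seed)  # "te ct st o te mt"
sampled_pattern = sample_text("the", seed)      # "te"

# Matching runs on the sampled text only; the original is never consulted.
matches = count_matches(sampled_document, sampled_pattern)  # 2
```

In this toy case the count on the sampled text equals the count of "the" in the original document. In general, distinct words can collapse to the same sample (for example, "the" and "tie" both become "te" under this seed), so sampled counts may overestimate the true ones. That tension between hiding the text and preserving matching statistics is presumably what the seed performance measures mentioned in the abstract are meant to quantify.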