Modeling DNS activities based on probabilistic latent semantic analysis

  • Authors:
  • Xuebiao Yuchi;Xiaodong Lee;Jian Jin;Baoping Yan

  • Affiliations:
  • China Internet Network Information Center, Computer Network Information Center, Chinese Academy of Sciences, Beijing, China and Graduate University of Chinese Academy of Sciences, Beijing, China;China Internet Network Information Center, Computer Network Information Center, Chinese Academy of Sciences, Beijing, China;China Internet Network Information Center, Computer Network Information Center, Chinese Academy of Sciences, Beijing, China;China Internet Network Information Center, Computer Network Information Center, Chinese Academy of Sciences, Beijing, China

  • Venue:
  • ADMA'10 Proceedings of the 6th International Conference on Advanced Data Mining and Applications - Volume Part II
  • Year:
  • 2010

Abstract

Traditional Web usage mining techniques aim to discover usage patterns from Web data at the page level, while little work has addressed higher levels of abstraction. In this paper, we propose a novel approach to characterizing Internet users' preferences and interests at the domain name level. By summarizing Internet users' domain name access behaviors as co-occurrences of users and the domain names they target, we introduce an aspect model that classifies users and domain names into groups according to these co-occurrences. Each group is then characterized by extracting the properties of its characteristic users and domain names. Experimental results on real-world data sets show that our approach is effective: it identifies several meaningful groups. Our approach could therefore be used to detect unusual behaviors on the Internet at the domain name level, which reduces the work of searching the joint space of users and domain names.
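
To make the aspect-model idea concrete, the following is a minimal sketch of standard probabilistic latent semantic analysis fitted with EM to user/domain co-occurrence counts, where P(u, d) = sum over z of P(z) P(u|z) P(d|z). It is not taken from the paper: the abstract does not give the authors' exact formulation, initialization, or data, so the function names (fit_plsa, top_items), the number of groups, the iteration count, and the toy counts over placeholder "example" domains are all assumptions made for illustration.

```python
import random


def fit_plsa(counts, n_groups, n_iter=50, seed=0):
    """Fit a standard PLSA aspect model with EM (illustrative sketch).

    counts maps (user, domain) -> number of observed queries.
    Returns P(z), P(u|z) and P(d|z) for each latent group z.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _ in counts})
    domains = sorted({d for _, d in counts})

    def random_dist(keys):
        # Random positive weights, normalized to sum to 1.
        weights = {k: rng.random() + 1e-3 for k in keys}
        total = sum(weights.values())
        return {k: w / total for k, w in weights.items()}

    p_z = [1.0 / n_groups] * n_groups
    p_u_z = [random_dist(users) for _ in range(n_groups)]
    p_d_z = [random_dist(domains) for _ in range(n_groups)]

    for _ in range(n_iter):
        new_z = [0.0] * n_groups
        new_u = [dict.fromkeys(users, 0.0) for _ in range(n_groups)]
        new_d = [dict.fromkeys(domains, 0.0) for _ in range(n_groups)]

        for (u, d), n in counts.items():
            # E-step: posterior P(z | u, d) is proportional to
            # P(z) * P(u|z) * P(d|z).
            joint = [p_z[z] * p_u_z[z][u] * p_d_z[z][d]
                     for z in range(n_groups)]
            norm = sum(joint) or 1.0
            for z in range(n_groups):
                r = n * joint[z] / norm  # expected count assigned to z
                new_z[z] += r
                new_u[z][u] += r
                new_d[z][d] += r

        # M-step: renormalize the expected counts.
        total = sum(new_z)
        for z in range(n_groups):
            mass = new_z[z] or 1.0
            p_u_z[z] = {u: v / mass for u, v in new_u[z].items()}
            p_d_z[z] = {d: v / mass for d, v in new_d[z].items()}
            p_z[z] = new_z[z] / total

    return p_z, p_u_z, p_d_z


def top_items(dist, k=2):
    """Return the k keys with the highest probability in dist."""
    return sorted(dist, key=dist.get, reverse=True)[:k]


# Hypothetical toy data: query counts per (user, domain) pair.
counts = {
    ("u1", "news.example"): 5, ("u1", "sports.example"): 4,
    ("u2", "news.example"): 3, ("u2", "sports.example"): 6,
    ("u3", "mail.example"): 7, ("u3", "docs.example"): 2,
    ("u4", "mail.example"): 4, ("u4", "docs.example"): 5,
}

p_z, p_u_z, p_d_z = fit_plsa(counts, n_groups=2)

for z in range(len(p_z)):
    print(z, round(p_z[z], 3), top_items(p_u_z[z]), top_items(p_d_z[z]))
```

In this sketch, the users and domain names with the highest P(u|z) and P(d|z) play the role of a group's "characteristic" members. A user/domain pair whose modeled probability is very low relative to its observed count could then be a candidate for closer inspection, although the thresholds and scoring that would make this a usable detector are not specified in the abstract.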