Query length impact on misuse detection in information retrieval systems

  • Authors: Ling Ma; Nazli Goharian
  • Affiliations: Illinois Institute of Technology; Illinois Institute of Technology
  • Venue: Proceedings of the 2005 ACM Symposium on Applied Computing
  • Year: 2005

Abstract

Misuse is the abuse of privileges by an authorized user and is the second most common form of computer crime after viruses. We previously proposed a misuse detection approach for information retrieval systems that relies on relevance feedback. The central idea is to build a user profile containing both query and feedback terms from prior queries. The algorithm matches new activities against existing profiles and assigns each activity a likelihood of misuse. Only an initial evaluation was provided. We now expand and evaluate our system using both short and long queries, noting the effect of query length on detection accuracy. The results indicate an overall precision of 83.9% when short queries are used, and 82.2% for long queries. The rate of undetected misuse is less than 2% for short queries and less than 6% for long queries. Configurations with higher precision lower the false alarm rate but increase the rate of undetected misuse for both short and long queries. Given this tradeoff, system behavior can be tuned to the constraints of a particular application to minimize either false alarms or undetected misuse.
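
The profile-matching idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring function, so everything here is an assumption made for illustration: the function names (`build_profile`, `misuse_likelihood`, `is_flagged`), the whitespace tokenization, the score (the fraction of query terms not found in the profile), and the `threshold` parameter. It is not the authors' algorithm.

```python
from collections import Counter


def build_profile(past_queries, feedback_terms):
    """Hypothetical profile: counts of terms from prior queries plus feedback terms."""
    profile = Counter()
    for query in past_queries:
        profile.update(query.lower().split())
    profile.update(term.lower() for term in feedback_terms)
    return profile


def misuse_likelihood(profile, query):
    """Assumed score in [0, 1]: fraction of query terms absent from the profile."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    unseen = sum(1 for term in terms if term not in profile)
    return unseen / len(terms)


def is_flagged(profile, query, threshold=0.5):
    """Flag the query as possible misuse when its score reaches the threshold."""
    return misuse_likelihood(profile, query) >= threshold


# Example data is invented purely for illustration.
profile = build_profile(
    ["heart disease treatment", "cardiac surgery risks"],
    ["cardiology", "stent"],
)
on_profile = misuse_likelihood(profile, "cardiac stent treatment")
off_profile = misuse_likelihood(profile, "nuclear weapon design")
mixed = misuse_likelihood(profile, "heart weapon")
```

In this sketch, raising `threshold` flags fewer queries, which reduces false alarms but lets more off-profile activity pass undetected; lowering it does the opposite. That mirrors the tradeoff the abstract describes, though the actual system's tuning mechanism may differ.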