Improving classification based off-topic search detection via category relationships

  • Authors:
  • Alana Platt;Saket S. R. Mengle;Nazli Goharian

  • Affiliations:
  • Illinois Institute of Technology, Chicago, Illinois;Illinois Institute of Technology, Chicago, Illinois;Illinois Institute of Technology, Chicago, Illinois

  • Venue:
  • Proceedings of the 2009 ACM symposium on Applied Computing
  • Year:
  • 2009

Abstract

The illegitimate access of documents by insiders (also known as off-topic search) is an increasingly prevalent and largely ignored problem. We propose an approach that uses text classification to detect off-topic search. Our empirical results indicate that detection effectiveness improves when only a subset of the documents retrieved for a given user query is considered. We also show that detection effectiveness improves when ontological information about the document categories is used. In most cases, utilizing sibling relationship information and relationships derived from misclassification information yields statistically significant improvements over the baseline.
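
The abstract does not give the scoring procedure, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes each retrieved document already carries a category predicted by some text classifier, that a user has a set of permitted categories, and that category relationships (for example siblings) are supplied as a simple mapping. The function name `off_topic_score`, its parameters, and the example categories are all hypothetical.

```python
from typing import Dict, List, Set


def off_topic_score(
    retrieved_categories: List[str],
    allowed_categories: Set[str],
    related: Dict[str, Set[str]],
    top_k: int,
) -> float:
    """Return the fraction of the top_k retrieved documents whose predicted
    category is neither allowed nor related to an allowed category.

    All names here are hypothetical; the abstract does not specify a formula.
    """
    # Consider only a subset (the first top_k) of the retrieved documents.
    subset = retrieved_categories[:top_k]
    if not subset:
        return 0.0

    # Expand the allowed set with related categories (e.g. siblings).
    expanded = set(allowed_categories)
    for category in allowed_categories:
        expanded |= related.get(category, set())

    off_topic = sum(1 for category in subset if category not in expanded)
    return off_topic / len(subset)


# Hypothetical example: predicted categories of documents retrieved for one query.
retrieved = ["cardiology", "pulmonology", "finance", "finance", "cardiology"]
allowed = {"cardiology"}
siblings = {"cardiology": {"pulmonology"}}

score_with_relations = off_topic_score(retrieved, allowed, siblings, top_k=4)
score_without_relations = off_topic_score(retrieved, allowed, {}, top_k=4)
```

In this toy input, treating a sibling category as on-topic lowers the score, which is one plausible way relationship information could reduce false alarms; how the score is thresholded and how relationships are derived from misclassifications are left open here.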