Using clustering techniques to detect usage patterns in a Web-based information system

Venue: Journal of the American Society for Information Science and Technology
Year: 2001


Abstract

Different users of a Web-based information system will have different goals and different ways of performing their work. This article explores the possibility that we can automatically detect usage patterns without demographic information about the individuals. First, a set of 47 variables was defined that can be used to characterize a user session. The values of these variables were computed for approximately 257,000 sessions. Second, principal component analysis was employed to reduce the dimensions of the original data set. Third, a two-stage, hybrid clustering method was proposed to categorize sessions into groups. Finally, an external criteria-based test of cluster validity was performed to verify the validity of the resulting usage groups (clusters). The proposed methodology was demonstrated and tested for validity using two independent samples of user sessions drawn from the transaction logs of the University of California's MELVYL on-line library catalog system (www.melvyl.ucop.edu). The results indicate that there were six distinct categories of use in the MELVYL system: knowledgeable and sophisticated use, unsophisticated use, highly interactive use with good search performance, known-item searching, help-intensive searching, and relatively unsuccessful use. Their characteristics were interpreted and compared qualitatively. The analysis shows that each group had distinct patterns of use of the system, which justifies the methodology employed in this study.
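The pipeline described in the abstract (session variables, dimension reduction, two-stage clustering) can be illustrated with a minimal sketch. The abstract does not say which algorithms make up the two stages, so the code below assumes one common hybrid: k-means to form many small "micro-clusters", followed by agglomerative merging of their centroids. The function names, the synthetic data, and all parameter values are hypothetical and are not taken from the article.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Standardize the variables and project onto the top principal components."""
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0)
    std[std == 0] = 1.0  # guard against constant variables
    Z = Xc / std
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def _nearest(X, centroids):
    """Index of the closest centroid for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = _nearest(X, centroids)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return _nearest(X, centroids), centroids

def merge_centroids(centroids, sizes, n_final):
    """Stage 2 (assumed): repeatedly merge the two closest centroids.

    Returns an array mapping each micro-cluster index to a final group id.
    """
    groups = {i: [i] for i in range(len(centroids))}
    cents = {i: np.asarray(centroids[i], dtype=float) for i in groups}
    # The tiny offset avoids division by zero for empty micro-clusters.
    wts = {i: float(sizes[i]) + 1e-9 for i in groups}
    while len(groups) > n_final:
        keys = list(groups)
        pairs = [(np.linalg.norm(cents[a] - cents[b]), a, b)
                 for i, a in enumerate(keys) for b in keys[i + 1:]]
        _, a, b = min(pairs)
        total = wts[a] + wts[b]
        # The merged centroid is the size-weighted mean of the two centroids.
        cents[a] = (wts[a] * cents[a] + wts[b] * cents[b]) / total
        wts[a] = total
        groups[a].extend(groups.pop(b))
        del cents[b], wts[b]
    mapping = np.empty(len(centroids), dtype=int)
    for gid, members in enumerate(groups.values()):
        mapping[members] = gid
    return mapping

def two_stage_cluster(X, n_micro, n_final, seed=0):
    """Stage 1: k-means micro-clusters. Stage 2: merge them into n_final groups."""
    micro_labels, centroids = kmeans(X, n_micro, seed=seed)
    sizes = np.bincount(micro_labels, minlength=n_micro)
    return merge_centroids(centroids, sizes, n_final)[micro_labels]

# Synthetic stand-in for per-session variables (the study used 47 of them).
rng = np.random.default_rng(42)
centers = rng.normal(scale=8.0, size=(3, 6))
sessions = np.vstack([c + rng.normal(size=(100, 6)) for c in centers])
reduced = pca_reduce(sessions, n_components=2)
labels = two_stage_cluster(reduced, n_micro=12, n_final=3)
print(np.bincount(labels))  # number of sessions in each final group
```

A two-stage design of this kind keeps the expensive pairwise merging confined to a handful of centroids rather than hundreds of thousands of sessions, which is one plausible reason to prefer a hybrid method on a data set of the size reported here.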