A random indexing approach for web user clustering and web prefetching

Authors:
Miao Wan;Arne Jönsson;Cong Wang;Lixiang Li;Yixian Yang
Affiliations:
Information Security Center, State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China;Department of Computer and Information Science, Linköping University, Linköping, Sweden;Information Security Center, State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China;Information Security Center, State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China;Information Security Center, State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
Venue:
PAKDD'11 Proceedings of the 15th international conference on New Frontiers in Applied Data Mining
Year:
2011

Abstract

In this paper we present a novel technique for capturing Web users' behaviour based on their interest-oriented actions. Our approach uses Random Indexing, a vector space model, to identify latent factors or hidden relationships in Web users' navigational behaviour. Random Indexing is an incremental vector space technique, which allows for continuous Web usage mining. User requests are modelled with Random Indexing in order to cluster individual users' navigational patterns and to create common user profiles. Clustering Web users' access patterns may capture common user interests and, in turn, yield user profiles for advanced Web applications such as Web caching and prefetching. We evaluate the Web user clustering approach in experiments on a real Web log file, with promising results. We also apply our data to a prefetching task and compare the outcome with previous approaches. The results show that Random Indexing provides more accurate prefetching.
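
The incremental modelling of user requests described in the abstract can be illustrated with a minimal sketch of the general Random Indexing idea. This is not the authors' implementation: the abstract gives no parameters, so the vector dimension, the number of non-zero entries, the names RandomIndexer, index_vector and cosine, and the toy request log are all assumptions made for illustration. Each page receives a sparse random index vector of +1 and -1 entries, and a user's context vector is the running sum of the index vectors of the pages that user requested.

```python
import math
import random

DIM = 300      # assumed vector dimension (not given in the abstract)
NONZERO = 10   # assumed number of non-zero entries per index vector


def index_vector(rng, dim=DIM, nonzero=NONZERO):
    """Return a sparse ternary vector: half of the non-zeros +1, half -1."""
    vec = [0] * dim
    positions = rng.sample(range(dim), nonzero)
    for i, pos in enumerate(positions):
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


class RandomIndexer:
    """Hypothetical helper that builds user context vectors incrementally."""

    def __init__(self, dim=DIM, nonzero=NONZERO, seed=0):
        self.dim = dim
        self.nonzero = nonzero
        self.rng = random.Random(seed)
        self.page_index = {}    # page -> fixed random index vector
        self.user_context = {}  # user -> accumulated context vector

    def add_request(self, user, page):
        """Fold one (user, page) request into the model; no retraining needed."""
        if page not in self.page_index:
            self.page_index[page] = index_vector(self.rng, self.dim, self.nonzero)
        context = self.user_context.setdefault(user, [0] * self.dim)
        for i, value in enumerate(self.page_index[page]):
            context[i] += value


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy request log (made up for illustration, not data from the paper).
log = [
    ("alice", "/news"), ("alice", "/sports"), ("alice", "/weather"),
    ("bob", "/weather"), ("bob", "/news"), ("bob", "/sports"),
    ("carol", "/shop"), ("carol", "/cart"), ("carol", "/checkout"),
]

indexer = RandomIndexer()
for user, page in log:
    indexer.add_request(user, page)

sim_alice_bob = cosine(indexer.user_context["alice"], indexer.user_context["bob"])
sim_alice_carol = cosine(indexer.user_context["alice"], indexer.user_context["carol"])
print("alice vs bob:", sim_alice_bob)
print("alice vs carol:", sim_alice_carol)
```

In this sketch, users who requested the same pages end up with the same context vector, whereas users with disjoint histories end up with nearly orthogonal ones. A clustering algorithm of one's choice could then group users by these similarities. Because add_request only adds to existing vectors, new log entries can be folded in without rebuilding the model, which is the property the abstract refers to as incremental.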