Incremental personalized web page mining utilizing self-organizing HCMAC neural network

  • Authors:
  • Chih-Ming Chen

  • Affiliations:
  • Graduate Institute of Learning Technology, National Hualien Teachers College, 123 Hua-His Rd., Hualien, Taiwan

  • Venue:
  • Web Intelligence and Agent Systems
  • Year:
  • 2004

Abstract

In recent years, the volume of information has grown rapidly, especially on the World Wide Web. Search engines tend to return a large number of documents, and these documents are not tailored to a user's actual needs and interests. Offering personalized services that present only the information a user is interested in has therefore become increasingly important. Web mining techniques have proven to be very useful tools for finding information of interest on the Web. However, previous studies have indicated that the main challenges in Web mining are handling high-dimensional data, achieving incremental learning (or incremental mining), and providing scalable, parallel, and distributed mining algorithms. This study presents a novel self-organizing HCMAC (Hierarchical Cerebellar Model Arithmetic Computer) neural network composed of two-dimensional Weighted Grey CMACs (WGCMACs). The network can handle higher-dimensional classification problems and self-organizes its memory structure according to the distribution of the training patterns. Moreover, a learning algorithm that can learn incrementally from newly added data without forgetting prior knowledge is proposed to train the self-organizing HCMAC neural network. The network is applied to incrementally learn user profiles from user feedback in order to identify personalized Web pages. A benchmark dataset of Web page ratings that contains user profiles on four topics is used to demonstrate the effectiveness of the proposed method. Experimental results show that the self-organizing HCMAC neural network learns incrementally well and overcomes the enormous memory requirement of the conventional CMAC when applied to higher-dimensional classification problems. Furthermore, the experiments confirm that the self-organizing HCMAC neural network is better at predicting which Web pages interest a user than other well-known classifiers.
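
The memory claim in the abstract can be illustrated with a rough cell count. The sketch below is not taken from the paper. It assumes a conventional CMAC modeled as one full lookup table with the same number of quantization levels per input (ignoring hashing and overlapping-tile generalization), and a hierarchy modeled as a binary tree of two-input tables in which each node emits a single value. The function names and the choice of 10 levels are hypothetical.

```python
def full_cmac_cells(num_inputs: int, levels: int) -> int:
    """Cells in one lookup table that quantizes every input into `levels` bins.

    Assumption: no hashing, one cell per combination of quantized inputs.
    """
    return levels ** num_inputs


def tree_of_2d_cells(num_inputs: int, levels: int) -> int:
    """Cells in a binary tree of two-input tables of size levels x levels.

    Assumption: each two-input node emits one value, so reducing
    num_inputs values to a single output takes num_inputs - 1 nodes.
    """
    if num_inputs < 2:
        raise ValueError("need at least two inputs")
    return (num_inputs - 1) * levels ** 2


if __name__ == "__main__":
    # Compare the two cell counts as the input dimension grows.
    for n in (2, 4, 8, 16):
        print(n, full_cmac_cells(n, 10), tree_of_2d_cells(n, 10))
```

Under these assumptions the single table grows exponentially with the number of inputs, while the tree of two-input tables grows linearly. The actual self-organizing HCMAC allocates memory according to the distribution of the training patterns, so its real footprint may differ from this estimate.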