A conceptual clustering approach for user profiling in personal information agents

Authors:
Daniela Godoy; Analía Amandi
Affiliations:
ISISTAN Research Institute, Univ. Nacional del Centro de la Prov. de Bs. As., Campus Universitario, Paraje Arroyo Seco, CP 7000, Tandil, Bs. As., Argentina. Also at CONICET, Argentina. E-mail: {dgod ...; ISISTAN Research Institute, Univ. Nacional del Centro de la Prov. de Bs. As., Campus Universitario, Paraje Arroyo Seco, CP 7000, Tandil, Bs. As., Argentina. Also at CONICET, Argentina. E-mail: {dgod ...
Venue:
AI Communications
Year:
2006

Abstract

Information agents have emerged in the last decade as a means of helping users cope with the increasing volume of information available on the Web. In order to provide personalized assistance, these agents rely on knowledge about users contained in user profiles, i.e., models of users' preferences and interests gathered by observing user behavior. User profiles have to summarize categories corresponding not only to diverse user information interests but also to different levels of abstraction, so that agents can decide on the relevance of new pieces of information. To this end, discovering interest categories through document clustering offers the advantage that no a priori knowledge of user interests is needed, so the process of acquiring profiles is completely unsupervised. However, most document clustering algorithms are not applicable to the problem of incrementally acquiring and modeling interests, either because of the kind of solutions they provide, which do not resemble user interests, or because of the way they build such solutions, which is generally not incremental. In this paper we describe and evaluate a document clustering algorithm, named WebDCC (Web Document Conceptual Clustering), designed to support the learning of user interests by personal information agents. The WebDCC algorithm carries out incremental, unsupervised concept learning over Web documents with the goal of building and maintaining user profiles that are both accurate and comprehensible. We present an empirical evaluation of this algorithm for user profiling and its advantages over other clustering algorithms.
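To make the idea of incremental, unsupervised profile acquisition more concrete, the following is a minimal sketch of a generic single-pass clustering scheme. It is not WebDCC: the abstract does not specify the algorithm's internals, and this sketch builds a flat set of categories rather than a hierarchy with several levels of abstraction. The class name `IncrementalProfile`, the `threshold` parameter, and the use of raw term counts with cosine similarity are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class IncrementalProfile:
    """Hypothetical flat profile: one summed term vector per interest category."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed similarity cut-off, not from the paper
        self.categories = []        # list of Counter objects (summed term counts)

    def add(self, text):
        """Assign a document to the closest category, or open a new one."""
        vector = Counter(tokenize(text))
        best_index, best_score = None, 0.0
        for index, centroid in enumerate(self.categories):
            score = cosine(vector, centroid)
            if score > best_score:
                best_index, best_score = index, score
        if best_index is not None and best_score >= self.threshold:
            # Fold the document into the existing category without reclustering.
            self.categories[best_index].update(vector)
            return best_index
        self.categories.append(vector)
        return len(self.categories) - 1

    def describe(self, index, k=3):
        """Return the k most frequent terms of a category as a readable label."""
        return [term for term, _ in self.categories[index].most_common(k)]
```

Each document is processed once as it arrives, so no category labels or prior knowledge of the user's interests are required, and the most frequent terms of a category serve as a rough human-readable description of it. Like other single-pass schemes, the result depends on the order in which documents are observed.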