Three-Tier Clustering: An Online Citation Clustering System

Authors:
Haifeng Jiang; Wenwu Lou; Wei Wang
Venue:
WAIM '01 Proceedings of the Second International Conference on Advances in Web-Age Information Management
Year:
2001

Abstract

In this paper, we present a three-tier clustering method for data objects that are described by a number of feature dimensions. In this approach, the similarity between objects is first computed along each feature dimension. The inter-object similarity is then computed from these per-dimension similarities using a Bayesian multi-causal model. Finally, objects are clustered based on the computed similarity. An online citation entry clustering system was built using the approach. It accepts user queries in the form of author names and sends them to citation/bibliography search engines. The returned entries are clustered on feature dimensions such as authors, title, and place of publication. After clustering, entries by different authors who share a similar name fall into different clusters, which are presented to the user. Preliminary experimental results indicate the effectiveness of the proposed clustering approach. The architecture of the three-tier clustering framework, the feature representation of a citation entry, a belief network model for inter-object similarity computation, and a special cluster evaluation technique are discussed in detail.
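
The three tiers described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the abstract does not say which Bayesian multi-causal model is used, so a noisy-OR combination stands in for it; the per-dimension measure (Jaccard overlap of tokens), the `WEIGHTS` values, the 0.5 threshold, and the connected-components grouping are all hypothetical choices.

```python
from itertools import combinations

# Hypothetical causal strengths: how strongly a full match on one
# dimension alone suggests that two entries belong together.
WEIGHTS = {"authors": 0.6, "title": 0.5, "venue": 0.4}


def jaccard(a, b):
    """Tier 1 helper: token-set overlap in [0, 1] (0.0 if both are empty)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def feature_similarities(x, y):
    """Tier 1: similarity of two entries along each feature dimension."""
    return {dim: jaccard(x[dim], y[dim]) for dim in WEIGHTS}


def noisy_or(sims):
    """Tier 2: combine per-dimension similarities with a noisy-OR model.

    Assumed stand-in for the paper's Bayesian multi-causal model.
    """
    p_no_cause_fires = 1.0
    for dim, s in sims.items():
        p_no_cause_fires *= 1.0 - WEIGHTS[dim] * s
    return 1.0 - p_no_cause_fires


def cluster(entries, threshold=0.5):
    """Tier 3: link entries whose combined similarity reaches the
    threshold and return the connected components (union-find)."""
    parent = list(range(len(entries)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(entries)), 2):
        sims = feature_similarities(entries[i], entries[j])
        if noisy_or(sims) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(entries)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up citation entries returned for the query "wei wang".
ENTRIES = [
    {"authors": "wei wang haifeng jiang",
     "title": "online citation clustering system", "venue": "waim"},
    {"authors": "wei wang wenwu lou",
     "title": "citation clustering with belief networks", "venue": "waim"},
    {"authors": "wei wang",
     "title": "protein folding simulation", "venue": "biophysics journal"},
]

print(cluster(ENTRIES))  # [[0, 1], [2]]
```

In this toy input the first two entries share a venue and part of their titles, so their combined similarity (about 0.59) clears the threshold, while the third entry matches only on the queried name (0.30) and forms its own cluster. A real system would need dimension-specific similarity measures and weights estimated from data.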