Exploiting extremely rare features in text categorization
ECML'06 Proceedings of the 17th European conference on Machine Learning
A key goal of Scientific Web Intelligence is to produce visualizations of academic Web spaces that reveal subject structures and identify relationships between different fields. We introduce an exploratory technique, Vocabulary Spectral Analysis (VSA), designed to build the intuition needed to design effective clustering procedures. We apply VSA to New Zealand university Web sites. The results suggest that subject-based clustering of academic Web sites is possible but will require extensive data filtering to yield effective clusters.