This paper proposes a knowware-based supervised machine learning technique for domain-specific regression and classification of Web documents. The technique is simple: it relies only on word counting, without natural language understanding or complicated statistical techniques. Starting from a domain sub-division tree with a training set of documents assigned to its nodes, the algorithm produces a labeled classification tree with a characteristic vector for each node. This tree is then used to classify any number of documents in that particular domain. A tool for developing Web portals is also provided; it builds a Web station that displays the final tree-like library of documents.
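The pipeline described above (a sub-division tree, training documents attached to nodes, one characteristic vector per node, then classification by descending the tree) can be illustrated with a minimal Python sketch. The abstract does not specify how characteristic vectors are built or compared, so everything here is an assumption made for illustration: the names `Node`, `word_counts`, `cosine`, and `classify` are hypothetical, each vector is taken to be the summed word counts of a node's own and its descendants' training documents, and cosine similarity is swapped in as the comparison measure.

```python
from collections import Counter
import math
import re


def word_counts(text):
    """Lower-case the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Node:
    """One node of a hypothetical domain sub-division tree."""

    def __init__(self, label, training_docs=(), children=()):
        self.label = label
        self.children = list(children)
        # Characteristic vector (assumption): word counts of this node's
        # training documents plus those of all its descendants.
        self.vector = Counter()
        for doc in training_docs:
            self.vector += word_counts(doc)
        for child in self.children:
            self.vector += child.vector


def classify(root, text):
    """Descend from the root, following the most similar child at each level."""
    counts = word_counts(text)
    node = root
    path = [root.label]
    while node.children:
        node = max(node.children, key=lambda c: cosine(counts, c.vector))
        path.append(node.label)
    return path


# Toy training data, invented purely for illustration.
tree = Node("sports", children=[
    Node("football", ["goal striker penalty league", "striker scores goal"]),
    Node("tennis", ["serve racket set match", "racket serve ace"]),
])
print(classify(tree, "The striker missed a penalty"))
```

The returned path lists the labels visited from the root down to the chosen leaf, which is one plausible way to place a document in a tree-like library. A real system would likely also need stop-word handling and a rule for stopping at an internal node when no child matches well; neither is covered by this sketch.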