Recognizing domain-specific websites is a key problem in discovering topic-specific web resources. Websites on the same topic tend to be similar in both content structure and textual content. Building on the vector space model, a hybrid vector space model of website topic is proposed. The model represents a website's link structure with text features rather than with tree or graph representations. Its vector elements integrate textual information about website content with structural characteristics extracted from the relevant web pages. The topic of a website is then identified with a centroid-based classification algorithm. Experiments on recognizing manufacturing-topic websites were carried out to evaluate the method. The results indicate that the model is well suited to describing the features of topic-specific websites and that it is applicable to website classification on the Web.
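The pipeline described above (a hybrid vector combining content and structure text features, followed by centroid-based classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `c:`/`s:` term prefixes, the `structure_weight` parameter, raw term counts as weights, cosine similarity as the similarity measure, and the toy training data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def hybrid_vector(content_terms, structure_terms, structure_weight=1.0):
    """Build one sparse vector from content terms and structure terms.

    Assumption: structure is encoded as text terms (e.g. words from link
    anchors or navigation labels), prefixed so they do not collide with
    content terms. Weights are raw counts, not the paper's weighting scheme.
    """
    vec = Counter()
    for term in content_terms:
        vec["c:" + term] += 1.0
    for term in structure_terms:
        vec["s:" + term] += structure_weight
    return dict(vec)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def train_centroids(labeled_vectors):
    """Average the vectors of each class to obtain one centroid per class."""
    sums = defaultdict(Counter)
    counts = Counter()
    for label, vec in labeled_vectors:
        counts[label] += 1
        for term, weight in vec.items():
            sums[label][term] += weight
    return {
        label: {term: w / counts[label] for term, w in total.items()}
        for label, total in sums.items()
    }


def classify(vec, centroids):
    """Assign the class whose centroid is most similar to the vector."""
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical toy training set: (label, hybrid vector) pairs.
training = [
    ("manufacturing", hybrid_vector(["cnc", "machining", "tooling"],
                                    ["products", "equipment"])),
    ("manufacturing", hybrid_vector(["casting", "machining", "assembly"],
                                    ["products", "services"])),
    ("other", hybrid_vector(["recipes", "cooking", "dessert"],
                            ["blog", "archive"])),
    ("other", hybrid_vector(["travel", "hotel", "flights"],
                            ["blog", "destinations"])),
]
centroids = train_centroids(training)

# An unseen website described by its content and structure terms.
site = hybrid_vector(["machining", "welding", "tooling"],
                     ["products", "contact"])
predicted = classify(site, centroids)
print(predicted)
```

In this sketch, raising `structure_weight` makes the structure terms count for more relative to the content terms, which is one simple way to balance the two feature sources in a single vector.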