The analysis and processing of environmental information is considered to be of utmost importance for humanity. This article addresses the problem of discovering web resources that provide environmental measurements. To solve this domain-specific search problem, we combine state-of-the-art search techniques with advanced text processing and supervised machine learning. Specifically, we generate domain-specific queries from empirical information and apply machine-learning-driven query expansion to enrich the initial queries with domain-specific terms. Multiple variations of these queries are submitted to a general-purpose web search engine to achieve high recall, and a post-processing module based on supervised machine learning improves the precision of the final results. In this work, we focus on the discovery of weather forecast websites, and we evaluate our technique by discovering weather nodes for southern Finland.
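As an illustration only, the query-variation step could be sketched as below. The abstract does not specify how queries are composed, so the function `build_queries`, the seed terms, the expansion terms, and the place names are all hypothetical placeholders rather than the authors' actual vocabulary or method. The sketch covers only the generation of query variants; the search-engine submission and the supervised post-processing filter are not shown.

```python
from itertools import product


def build_queries(base_terms, expansion_terms, locations):
    """Return one query string per (base term, expansion term, location) combination.

    Hypothetical helper: the combination scheme is an assumption,
    not the procedure described in the article.
    """
    queries = []
    for base, extra, place in product(base_terms, expansion_terms, locations):
        queries.append(f"{base} {extra} {place}")
    return queries


# Illustrative inputs only (assumed, not taken from the article).
base_terms = ["weather forecast"]
expansion_terms = ["temperature", "humidity"]
locations = ["Helsinki", "Espoo"]

queries = build_queries(base_terms, expansion_terms, locations)
for query in queries:
    print(query)
```

Submitting every variant trades extra search requests for broader coverage, which matches the abstract's stated aim of high recall before a classifier restores precision.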