The paper presents a general methodology for implementing a flexible Focused Crawler for investigation, monitoring, and Open Source Intelligence (OSINT). The resulting tool is specifically designed to fit the operational requirements of law-enforcement agencies and intelligence analysts. The architecture of the semantic Focused Crawler offers static flexibility in the definition of the desired concepts, the metrics used, and the crawling strategy; in addition, the method can learn, and adapt to, the analyst's expectations at runtime. The user can give the crawler binary (yes/no) feedback on the current performance of the surfing process, and the crawling engine progressively refines the expected targets accordingly. The implementation is based on an existing text-mining environment, integrated with semantic networks and ontologies. Experimental results confirm the effectiveness of the adaptive mechanism.
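The abstract does not spell out how the yes/no feedback refines the targets, so the following is only an illustrative sketch under assumptions of ours, not the paper's method. It represents the desired concepts as a bag of weighted terms, scores pages by cosine similarity, and applies a Rocchio-style update when the analyst answers yes or no. The function names (`tokenize`, `cosine`, `apply_feedback`), the `rate` parameter, and the sample pages are all hypothetical; the actual system relies on semantic networks and ontologies rather than raw term counts.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(weights, text):
    """Cosine similarity between the concept weights and a page's term counts."""
    counts = Counter(tokenize(text))
    dot = sum(w * counts[t] for t, w in weights.items())
    norm_w = math.sqrt(sum(w * w for w in weights.values()))
    norm_c = math.sqrt(sum(c * c for c in counts.values()))
    return dot / (norm_w * norm_c) if norm_w and norm_c else 0.0


def apply_feedback(weights, text, relevant, rate=0.5):
    """Rocchio-style update (an assumption, not the paper's rule).

    A "yes" moves the concept weights toward the page's normalized term
    frequencies; a "no" moves them away. Weights are clipped at zero and
    zero-weight terms are dropped.
    """
    counts = Counter(tokenize(text))
    total = sum(counts.values()) or 1
    sign = 1.0 if relevant else -1.0
    updated = dict(weights)
    for term, c in counts.items():
        updated[term] = max(0.0, updated.get(term, 0.0) + sign * rate * c / total)
    return {t: w for t, w in updated.items() if w > 0.0}


# Hypothetical session: one seed concept, two pages, two analyst answers.
concepts = {"weapon": 1.0}
page_a = "weapon trade forum weapon"
page_b = "cooking recipes forum"

concepts = apply_feedback(concepts, page_a, relevant=True)   # analyst: yes
concepts = apply_feedback(concepts, page_b, relevant=False)  # analyst: no

# Candidate pages can then be ranked by their score against the refined concepts.
ranked = sorted([page_a, page_b], key=lambda p: cosine(concepts, p), reverse=True)
```

In this sketch a crawler would call `cosine` to prioritize its frontier and `apply_feedback` each time the analyst judges a visited page, so the ranking drifts toward what the analyst accepts.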