Because terrorist organizations increasingly use the web for propaganda, disinformation, and other purposes, the ability to automatically detect terrorist-related content in multiple languages can be extremely useful. In this paper we describe a new classification-based approach to multilingual detection of terrorist documents. The approach builds on a recently developed graph-based model for representing web documents, combined with the popular C4.5 decision-tree classification algorithm. We evaluate it on a collection of 648 Arabic-language web documents. The results demonstrate that documents downloaded from several known terrorist sites can be reliably discriminated from Arabic news reports using a simple decision tree.
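To make the combination of a graph representation and a decision-tree split criterion more concrete, the following is a minimal sketch, not the paper's implementation. It assumes a simplified graph model in which nodes are terms and a directed edge links each pair of adjacent terms. It then scores each edge as a binary feature with the gain ratio, the split criterion used by C4.5, and selects only the root test rather than growing and pruning a full tree. The function names, the labels `target` and `news`, and the English toy documents are all hypothetical and chosen for readability.

```python
import math
from collections import Counter


def doc_to_graph(text):
    """Simplified graph model (an assumption): the set of directed
    edges (term_a, term_b) for every pair of adjacent terms."""
    terms = text.lower().split()
    return {(a, b) for a, b in zip(terms, terms[1:])}


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(graphs, labels, edge):
    """Gain ratio of the binary test 'the document graph contains edge'."""
    present = [y for g, y in zip(graphs, labels) if edge in g]
    absent = [y for g, y in zip(graphs, labels) if edge not in g]
    if not present or not absent:
        return 0.0  # the test does not split the data at all
    total = len(labels)
    weights = [len(present) / total, len(absent) / total]
    remainder = weights[0] * entropy(present) + weights[1] * entropy(absent)
    split_info = -sum(w * math.log2(w) for w in weights)
    return (entropy(labels) - remainder) / split_info


def best_edge(graphs, labels):
    """Edge with the highest gain ratio; ties go to the first in sorted order."""
    candidates = sorted(set().union(*graphs))
    return max(candidates, key=lambda e: gain_ratio(graphs, labels, e))


# Hypothetical toy corpus: (text, label) pairs.
corpus = [
    ("join the armed struggle today", "target"),
    ("support the armed struggle now", "target"),
    ("markets rose sharply today", "news"),
    ("the minister visited markets today", "news"),
]
graphs = [doc_to_graph(text) for text, _ in corpus]
labels = [label for _, label in corpus]

root_test = best_edge(graphs, labels)
print("root test edge:", root_test)
print("gain ratio:", gain_ratio(graphs, labels, root_test))
```

A full C4.5 learner would apply this selection recursively to each branch and then prune the resulting tree. Scoring edges rather than isolated terms is what lets the classifier use word order, which a bag-of-words vector would discard.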