Text analysis for detecting terrorism-related articles on the web

Authors:
Dongjin Choi;Byeongkyu Ko;Heesun Kim;Pankoo Kim
Venue:
Journal of Network and Computer Applications
Year:
2014



Abstract

Classifying web documents is considered one of the most important tasks in uncovering terrorism-related documents. The Internet provides a great deal of valuable information to users, and the amount of web content keeps growing, which makes it very difficult to identify potentially dangerous documents. Simply extracting keywords from documents is not enough to classify their content. Many techniques have been studied for building automated document classification systems, but most are statistical or knowledge-based approaches. These methods, however, do not yield satisfactory results because of the complexity of natural language. To overcome this deficiency, we propose a method that uses word similarity based on the WordNet hierarchy and n-gram frequency data. The method was tested on sampled New York Times articles by querying four distinct words from four different areas. Experimental results show that the proposed method effectively extracts context words from the text and identifies terrorism-related documents.
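
The abstract does not give the actual formulas, so the following is only a minimal sketch of how hierarchy-based word similarity might be combined with n-gram frequency. It assumes a Wu-Palmer style similarity over a hypernym tree and plain bigram counts; the tiny hierarchy, the function names, and the scoring rule are all hypothetical illustrations, not the authors' method and not real WordNet data.

```python
from collections import Counter

# Hypothetical toy hypernym hierarchy (child -> parent); NOT real WordNet data.
HYPERNYM = {
    "bomb": "weapon",
    "gun": "weapon",
    "weapon": "artifact",
    "car": "vehicle",
    "vehicle": "artifact",
    "artifact": "entity",
}
VOCAB = set(HYPERNYM) | set(HYPERNYM.values())


def path_to_root(word):
    """Return the chain of nodes from `word` up to the root, inclusive."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def wu_palmer(a, b):
    """Wu-Palmer style similarity: 2*depth(lcs) / (depth(a) + depth(b)).

    Depth counts nodes from the root (root depth = 1). Words outside the
    toy hierarchy get similarity 0.0 (an assumption of this sketch).
    """
    if a not in VOCAB or b not in VOCAB:
        return 0.0
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # First node on a's path that also lies on b's path is the deepest
    # common ancestor; the shared root guarantees one exists.
    lcs = next(node for node in path_a if node in ancestors_b)
    return 2 * len(path_to_root(lcs)) / (len(path_a) + len(path_b))


def context_score(tokens, query):
    """Sum, over bigrams containing `query`, of count * similarity to the neighbour."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    score = 0.0
    for (left, right), count in bigrams.items():
        if left == query:
            score += count * wu_palmer(query, right)
        elif right == query:
            score += count * wu_palmer(query, left)
    return score


tokens = ["bomb", "gun", "bomb", "car", "bomb", "gun"]
print(wu_palmer("bomb", "gun"))
print(wu_palmer("bomb", "car"))
print(context_score(tokens, "bomb"))
```

In this sketch, neighbours that sit close to the query word in the hierarchy and co-occur with it often contribute most to the score, which is one plausible reading of "context words" in the abstract.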