Arabic is a widely spoken language, but few text mining tools have been developed to process Arabic text. This paper examines the crime domain in unstructured Arabic text using text mining techniques. The development and application of a Crime Profiling System (CPS) is presented. The system extracts meaningful information, in this case the type of crime, location and nationality, from Arabic-language crime news reports. The system has two distinguishing attributes: first, information extraction that relies on local grammar, and second, dictionaries that can be generated automatically. It is shown that the CPS improves the quality of the data through reduction, in which only meaningful information is retained. Moreover, the Self Organising Map (SOM) approach is adopted to cluster the crime reports by crime type. The clustering is improved because only refined data, containing the meaningful keywords extracted by the information extraction process, are fed into it; that is, the data are cleansed by removing noise. The proposed system is validated through experiments using a corpus collated from different sources that was not used during system development. Precision, recall and F-measure are used to evaluate the performance of the proposed information extraction approach, and comparisons are conducted with other systems. To evaluate the clustering performance, three parameters are used: data size, loading time and quantization error.
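The evaluation measures named above can be illustrated with a minimal sketch. It is not the CPS implementation. The function names are hypothetical, the F-measure is assumed to be the balanced F1 (the abstract does not state a weighting), and the quantization error is assumed to follow the common SOM definition, the mean Euclidean distance from each input vector to its best-matching unit.

```python
import math


def precision_recall_f1(extracted, gold):
    """Score extracted items against a gold standard (hypothetical helper).

    Both arguments are iterables of hashable items, for example
    (field, value) pairs such as ("crime_type", "theft").
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    # Balanced F-measure (F1): harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def quantization_error(vectors, codebook):
    """Mean distance from each input vector to its nearest SOM unit.

    `codebook` is the list of SOM weight vectors; the nearest one to an
    input is its best-matching unit (BMU).
    """
    if not vectors:
        raise ValueError("at least one input vector is required")
    total = 0.0
    for vector in vectors:
        total += min(math.dist(vector, unit) for unit in codebook)
    return total / len(vectors)


if __name__ == "__main__":
    # Toy, made-up items purely for illustration.
    p, r, f = precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"])
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
    print(quantization_error([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0)]))
```

A lower quantization error means the map units lie closer to the documents they represent. That is one way to see the effect of feeding the SOM only the extracted keywords instead of the full text.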