The Social Spiders in the Clustering of Texts: Towards an Aspect of Visual Classification
International Journal of Artificial Life Research
International Journal of Information Retrieval Research
In this article we present a 2D cellular automaton (Class_AC) for a text mining problem: unsupervised classification (clustering). Before experimenting with the cellular automaton, we vectorized our data by indexing textual documents from the Reuters-21578 corpus using the N-gram approach. The cellular automaton proposed in this paper is a planar grid of cells, with a flat neighborhood arising from this structure. Three transition functions drive the evolution of the automaton, in which each cell has four possible states. The results obtained show that this virtual parallel computing machine (Class_AC) effectively groups similar documents to within a given threshold. Section 1 gives an introduction, Section 2 presents the representation of texts based on n-grams, Section 3 describes the cellular automaton for clustering, Section 4 presents the experiments and comparative results, and finally Section 5 gives a conclusion and perspectives.
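The N-gram indexing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the n-grams are character-based or word-based, what value of n is used, or how similarity is measured, so the character trigrams, the whitespace normalization, the cosine measure, and the function names `char_ngrams` and `cosine_similarity` below are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Return a frequency profile of character n-grams (assumed: n = 3)."""
    # Assumed normalization: lowercase and collapse runs of whitespace.
    normalized = " ".join(text.lower().split())
    # Slide a window of width n over the text; short texts give an empty profile.
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram profiles (assumed measure)."""
    # Only n-grams present in both profiles contribute to the dot product.
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under these assumptions, two documents would count as similar when `cosine_similarity(char_ngrams(d1), char_ngrams(d2))` exceeds a chosen threshold, which is one plausible reading of the threshold mentioned in the abstract.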