Self organization of a massive document collection

Authors:
T. Kohonen;S. Kaski;K. Lagus;J. Salojarvi;J. Honkela;V. Paatero;A. Saarela
Affiliations:
Neural Networks Res. Centre, Helsinki Univ. of Technol., Espoo
Venue:
IEEE Transactions on Neural Networks
Year:
2000

Citing 0
Cited 187

Integrating automatic genre analysis into digital libraries

Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries
Automatic text representation, classification and labeling in European law

Proceedings of the 8th international conference on Artificial intelligence and law
Overture

Self-Organizing neural networks
Parallel implementation of self-organizing maps

Self-Organizing neural networks
ICA and SOM in text document analysis

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
GS textplorer: adaptive framework for information retrieval

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Data discretization for novel relationship discovery in information retrieval

Journal of the American Society for Information Science and Technology
Text Retrieval Using Self-Organized Document Maps

Neural Processing Letters
Unsupervised learning in neural computation

Theoretical Computer Science - Natural computing
Design and evaluation of a multi-agent collaborative Web mining system

Decision Support Systems - Web retrieval and mining
Content-based organization and visualization of music archives

Proceedings of the tenth ACM international conference on Multimedia
Topic Identification in Dynamical Text by Complexity Pursuit

Neural Processing Letters
On the Emulation of Kohonen's Self-Organization via Single-Map Metropolis-Hastings Algorithms

ICCS '01 Proceedings of the International Conference on Computational Science-Part II
A Parallel Implementation of the Tree-Structured Self-Organizing Map

PARA '02 Proceedings of the 6th International Conference on Applied Parallel Computing Advanced Scientific Computing
Technology of Text Mining

MLDM '01 Proceedings of the Second International Workshop on Machine Learning and Data Mining in Pattern Recognition
A Robust Meaning Extraction Methodology Using Supervised Neural Networks

AI '02 Proceedings of the 15th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
Retrieving News Stories from a News Integration Archive

ICADL '02 Proceedings of the 5th International Conference on Asian Digital Libraries: Digital Libraries: People, Knowledge, and Technology
A SOM Variant Based on the Wilcoxon Test for Document Organization and Retrieval

ICANN '02 Proceedings of the International Conference on Artificial Neural Networks
An Efficiently Focusing Large Vocabulary Language Model

ICANN '02 Proceedings of the International Conference on Artificial Neural Networks
Self Organizing Map and Sammon Mapping for Asymmetric Proximities

ICANN '01 Proceedings of the International Conference on Artificial Neural Networks
Document Clustering Using the 1 + 1 Dimensional Self-Organising Map

IDEAL '02 Proceedings of the Third International Conference on Intelligent Data Engineering and Automated Learning
Automatically Analyzing and Organizing Music Archives

ECDL '01 Proceedings of the 5th European Conference on Research and Advanced Technology for Digital Libraries
Business, Culture, Politics, and Sports - How to Find Your Way through a Bulk of News? On Content-Based Hierarchical Structuring and Organization of Large Document Archives

DEXA '01 Proceedings of the 12th International Conference on Database and Expert Systems Applications
Self-Organising Maps for Hierarchical Tree View Document Clustering Using Contextual Information

IDEAL '02 Proceedings of the Third International Conference on Intelligent Data Engineering and Automated Learning
Truth in the Digital Library: From Ontological to Hermeneutical Systems

ECDL '01 Proceedings of the 5th European Conference on Research and Advanced Technology for Digital Libraries
Analysis and visualization of gene expression data using self-organizing maps

Neural Networks - New developments in self-organizing maps
Integrating contextual information to enhance SOM-based text document clustering

Neural Networks - New developments in self-organizing maps
Web page clustering using a self-organizing map of user navigation patterns

Decision Support Systems - Special issue: Web data mining
Introduction to the JASIST special topic section on web retrieval and mining: a machine learning perspective

Journal of the American Society for Information Science and Technology
A Hybrid Layout Algorithm for Sub-Quadratic Multidimensional Scaling

INFOVIS '02 Proceedings of the IEEE Symposium on Information Visualization (InfoVis'02)
Self-organizing maps for interactive search in document databases

Intelligent exploration of the web
Methods for exploratory cluster analysis

Intelligent exploration of the web
On the quality of ART1 text clustering

Neural Networks - 2003 Special issue: Advances in neural networks research — IJCNN'03
Fast multidimensional scaling through sampling, springs and interpolation

Information Visualization
Automated categorization in the international patent classification

ACM SIGIR Forum
Interactive Visualization and Navigation in Large Data Collections using the Hyperbolic Space

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
A Dynamic Adaptive Self-Organising Hybrid Model for Text Clustering

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
SOM: feature extraction from patient discharge summaries

Proceedings of the 2003 ACM symposium on Applied computing
Techniques for visualizing website usage patterns with an adaptive neural network

Computing information technology
Attribute space visualization of demographic change

GIS '03 Proceedings of the 11th ACM international symposium on Advances in geographic information systems
Clustering documents in a web directory

WIDM '03 Proceedings of the 5th ACM international workshop on Web information and data management
Visualizing changes in the structure of data for exploratory feature selection

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Web cache optimization with nonlinear model using object features

Computer Networks: The International Journal of Computer and Telecommunications Networking
Evaluating Keyword Selection Methods for WEBSOM Text Archives

IEEE Transactions on Knowledge and Data Engineering
Improving domain ontologies by mining semantics from text

APCCM '04 Proceedings of the first Asian-Pacific conference on Conceptual modelling - Volume 31
Hybrid Neural Document Clustering Using Guided Self-Organization and WordNet

IEEE Intelligent Systems
A visual workspace for constructing hybrid multidimensional scaling algorithms and coordinating multiple views

Information Visualization - Special issue on coordinated and multiple views in exploratory visualization
LSISOM – A Latent Semantic Indexing Approach to Self-Organizing Maps of Document Collections

Neural Processing Letters
Constructing an associative concept space for literature-based discovery

Journal of the American Society for Information Science and Technology
Properties-based retrieval and user decision states: user control and behavior modeling

Journal of the American Society for Information Science and Technology
Marginal median SOM for document organization and retrieval

Neural Networks
Self-organized load balancing in proxy servers: algorithms and performance

Journal of Intelligent Information Systems - Special issue on web intelligence
Mining massive document collections by the WEBSOM method

Information Sciences: an International Journal - Special issue: Soft computing data mining
Expanding self-organizing map for data visualization and cluster analysis

Information Sciences: an International Journal - Special issue: Soft computing data mining
A data mining approach to modeling relationships among categories in image collection

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Intelligent web traffic mining and analysis

Journal of Network and Computer Applications - Special issue on computational intelligence on the internet
Organizing and visualizing software repositories using the growing hierarchical self-organizing map

Proceedings of the 2005 ACM symposium on Applied computing
Selforganizing classification on the Reuters news corpus

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Using syntactic analysis to increase efficiency in visualizing text collections

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Using web structure and summarisation techniques for web content mining

Information Processing and Management: an International Journal
Speeding up the Self-Organizing Feature Map Using Dynamic Subset Selection

Neural Processing Letters
An Integrated Framework for Visualized and Exploratory Pattern Discovery in Mixed Data

IEEE Transactions on Knowledge and Data Engineering
Adaptive topological tree structure for document organisation and visualisation

Neural Networks - 2004 Special issue: New developments in self-organizing systems
Adaptive dialogue systems - interaction with interact

SIGDIAL '02 Proceedings of the 3rd SIGdial workshop on Discourse and dialogue - Volume 2
Topic identification in natural language dialogues using neural networks

SIGDIAL '02 Proceedings of the 3rd SIGdial workshop on Discourse and dialogue - Volume 2
Intelligent patent analysis through the use of a neural network: experiment of multi-viewpoint analysis with the MultiSOM model

PATENT '03 Proceedings of the ACL-2003 workshop on Patent corpus processing - Volume 20
Knowledge map creation and maintenance for virtual communities of practice

Information Processing and Management: an International Journal
Analysis of the query logs of a web site search engine

Journal of the American Society for Information Science and Technology
Digital Content Recommender on the Internet

IEEE Intelligent Systems
Granular self-organizing map (grSOM) for structure identification

Neural Networks
Large-scale data exploration with the hierarchically growing hyperbolic SOM

Neural Networks - 2006 Special issue: Advances in self-organizing maps--WSOM'05
Fast algorithm and implementation of dissimilarity self-organizing maps

Neural Networks - 2006 Special issue: Advances in self-organizing maps--WSOM'05
Effective organization and visualization of web search results

IMSA'06 Proceedings of the 24th IASTED international conference on Internet and multimedia systems and applications
Deterministic projection by growing cell structure networks for visualization of high-dimensionality datasets

Journal of Biomedical Informatics - Special section: JAMA commentaries
An anticipation model of potential customers' purchasing behavior based on clustering analysis and association rules analysis

Expert Systems with Applications: An International Journal
Adaptive self-organized maps based on bidirectional approximate reasoning and its applications to information filtering

Knowledge-Based Systems
Building a scientific knowledge web portal: the NanoPort experience

Decision Support Systems
User modeling for personalized Web search with self-organizing map: Research Articles

Journal of the American Society for Information Science and Technology
A hierarchical SOM-based intrusion detection system

Engineering Applications of Artificial Intelligence
Neural Network Based Document Clustering Using WordNet Ontologies

International Journal of Hybrid Intelligent Systems
A system for generating user's chronological interest space from web browsing history

International Journal of Knowledge-based and Intelligent Engineering Systems
Web searching in Chinese: A study of a search engine in Hong Kong

Journal of the American Society for Information Science and Technology
The learning vector quantization algorithm applied to automatic text classification tasks

Neural Networks
AASA: a Method of Automatically Acquiring Semantic Annotations

Journal of Information Science
A new approach to hierarchical clustering and structuring of data with Self-Organizing Maps

Intelligent Data Analysis
3D head model retrieval in kernel feature space using HSOM

Pattern Recognition
Toward a hybrid data mining model for customer retention

Knowledge-Based Systems
Multi-domain collaborative exploration mechanisms for query expansion in an agent-based filtering framework

Electronic Commerce Research and Applications
Taxonomy learning for semantic annotation of web services

ICCOMP'07 Proceedings of the 11th WSEAS International Conference on Computers
Making sense of it all: implementing an emerging KM technology within the organisation

International Journal of Information Technology and Management
Intelligent techniques for cigarette formula design

Mathematics and Computers in Simulation
Semantic mapping and K-means applied to hybrid SOM-based document organization system construction

Proceedings of the 2008 ACM symposium on Applied computing
Improving the performance of personal name disambiguation using web directories

Information Processing and Management: an International Journal
Mining images using clustering and data compressing techniques

International Journal of Information and Communication Technology
Inferring semantics from textual information in multimedia retrieval

Neurocomputing
A quickly trainable hybrid SOM-based document organization system

Neurocomputing
Text Visualization for Visual Text Analytics

Visual Data Mining
The Evaluation Measure of Text Clustering for the Variable Number of Clusters

ISNN '07 Proceedings of the 4th international symposium on Neural Networks: Part II--Advances in Neural Networks
Table Based Single Pass Algorithm for Clustering News Articles in NewsPage.com

ICCSA '08 Proceedings of the international conference on Computational Science and Its Applications, Part II
Entropy-based associative classification algorithm for mining manufacturing data

International Journal of Computer Integrated Manufacturing
Document classification system based on HMM word map

CSTST '08 Proceedings of the 5th international conference on Soft computing as transdisciplinary science and technology
Embedded Map Projection for Dimensionality Reduction-Based Similarity Search

SSPR & SPR '08 Proceedings of the 2008 Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition
A method for multilingual text mining and retrieval using growing hierarchical self-organizing maps

Journal of Information Science
Supervised Textual Document Classification Using Neuronal Group Learning

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
Web Page Clustering Using a Fuzzy Logic Based Representation and Self-Organizing Maps

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
A neural model for unsupervised taxonomy enrichment

Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
Modular network SOM

Neural Networks
A two-stage architecture for stock price forecasting by integrating self-organizing map and support vector regression

Expert Systems with Applications: An International Journal
Classifying Amharic webnews

Information Retrieval
Interpretation of MR images using self-organizing maps and knowledge-based expert systems

Digital Signal Processing
Multi-objective genetic local search algorithm using Kohonen's neural map

Computers and Industrial Engineering
Using scatterplots to understand and improve probabilistic models for text categorization and retrieval

International Journal of Approximate Reasoning
Classification Visualization across Mapping on a Sphere

Proceedings of the 2008 conference on New Trends in Multimedia and Network Information Systems
Optimal Combination of SOM Search in Best-Matching Units and Map Neighborhood

WSOM '09 Proceedings of the 7th International Workshop on Advances in Self-Organizing Maps
A local semi-supervised Sammon algorithm for textual data visualization

Journal of Intelligent Information Systems
Creating ambient music spaces in real and virtual worlds

Multimedia Tools and Applications
Learning a Self-organizing Map Model on a Riemannian Manifold

Proceedings of the 13th IMA International Conference on Mathematics of Surfaces XIII
Nonlinear Embedded Map Projection for Dimensionality Reduction

ICIAP '09 Proceedings of the 15th International Conference on Image Analysis and Processing
A Riemannian Self-Organizing Map

ICIAP '09 Proceedings of the 15th International Conference on Image Analysis and Processing
Classifying Amharic news text using self-organizing maps

Semitic '05 Proceedings of the ACL Workshop on Computational Approaches to Semitic Languages
Identification of Similar Documents Using Coherent Chunks

DAARC '09 Proceedings of the 7th Discourse Anaphora and Anaphor Resolution Colloquium on Anaphora Processing and Applications
Document clustering using unsupervised learning method: topology-preserving map

Proceedings of the International Conference and Workshop on Emerging Trends in Technology
An efficient MDS-based topographic mapping algorithm

Neurocomputing
Visualizing asymmetric proximities with SOM and MDS models

Neurocomputing
Tree view self-organisation of web content

Neurocomputing
A novel approach for distributed application scheduling based on prediction of communication events

Future Generation Computer Systems
New usage of SOM for genetic algorithms

GECCO'03 Proceedings of the 2003 international conference on Genetic and evolutionary computation: Part I
Improved web searching through neural network based index generation

ICCS'03 Proceedings of the 2003 international conference on Computational science: Part III
Data management by self-organizing maps

WCCI'08 Proceedings of the 2008 IEEE world conference on Computational intelligence: research frontiers
Data mining using an adaptive HONN model with hyperbolic tangent neurons

PKAW'10 Proceedings of the 11th international conference on Knowledge management and acquisition for smart systems and services
Hidden semantic concept discovery in region based image retrieval

CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition
Knowledge mapping for rapidly evolving domains: A design science approach

Decision Support Systems
Visualizing web search results using glyphs: Design and evaluation of a flower metaphor

ACM Transactions on Management Information Systems (TMIS)
Combining stationary wavelet transform and self-organizing maps for brain MR image segmentation

Engineering Applications of Artificial Intelligence
A new-fangled FES-k-Means clustering algorithm for disease discovery and visual analytics

EURASIP Journal on Bioinformatics and Systems Biology
A visual workspace for hybrid multidimensional scaling algorithms

INFOVIS'03 Proceedings of the Ninth annual IEEE conference on Information visualization
Hybrid-patent classification based on patent-network analysis

Journal of the American Society for Information Science and Technology
Research of fast SOM clustering for text information

Expert Systems with Applications: An International Journal
Cluster ensemble in adaptive tree structured clustering

International Journal of Knowledge Engineering and Soft Data Paradigms
Using self-organizing networks for intrusion detection

NN'05 Proceedings of the 6th WSEAS international conference on Neural networks
A clustering algorithm for Chinese text based on SOM neural network and density

ISNN'05 Proceedings of the Second international conference on Advances in neural networks - Volume Part II
Visualization of depending patterns in metabonomics

ICONIP'06 Proceedings of the 13th international conference on Neural information processing - Volume Part III
Tag clouds for displaying semantics: the case of filmscripts

Information Visualization
Data organization and visualization using self-sorting map

Proceedings of Graphics Interface 2011
On wires and cables: content analysis of wikileaks using self-organising maps

WSOM'11 Proceedings of the 8th international conference on Advances in self-organizing maps
Online labelling strategies for growing neural gas

IDEAL'11 Proceedings of the 12th international conference on Intelligent data engineering and automated learning
A computational intelligence scheme for the prediction of the daily peak load

Applied Soft Computing
An improved text categorization methodology based on second and third order probabilistic feature extraction and neural network classifiers

KES'06 Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I
Concept chain based text clustering

CIS'05 Proceedings of the 2005 international conference on Computational Intelligence and Security - Volume Part I
Semantic correlation network based text clustering

AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence
On the chinese document clustering based on dynamical term clustering

AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
Interpreting gene profiles from biomedical literature mining with self organizing maps

ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part III
Joint time-frequency and kernel principal component based SOM for machine maintenance

ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part III
Implementing a chinese character browser using a topography-preserving map

ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part III
Using the Taguchi method for effective market segmentation

Expert Systems with Applications: An International Journal
Improved ROCK for text clustering using asymmetric proximity

SOFSEM'06 Proceedings of the 32nd conference on Current Trends in Theory and Practice of Computer Science
Self-Organizing-Map-Based metamodeling for massive text data exploration

ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part I
Using self-organizing maps to support video navigation

ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
Visualization architecture based on SOM for two-class sequential data

KES'06 Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part II
A multi-layered summarization system for multi-media archives by understanding and structuring of chinese spoken documents

ISCSLP'06 Proceedings of the 5th international conference on Chinese Spoken Language Processing
Indexing and mining audiovisual data

AM'03 Proceedings of the Second international conference on Active Mining
Pattern mining across domain-specific text collections

MLDM'05 Proceedings of the 4th international conference on Machine Learning and Data Mining in Pattern Recognition
An intelligent information system for detecting web commerce transactions

AWIC'05 Proceedings of the Third international conference on Advances in Web Intelligence
Extending the SOM algorithm to visualize word relationships

IDA'05 Proceedings of the 6th international conference on Advances in Intelligent Data Analysis
An efficient MDS algorithm for the analysis of massive document collections

KES'05 Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part II
Video navigation based on self-organizing maps

CIVR'06 Proceedings of the 5th international conference on Image and Video Retrieval
Pattern discovery from time series using growing hierarchical self-organizing map

ICONIP'06 Proceedings of the 13th international conference on Neural Information Processing - Volume Part I
Fast growing self organizing map for text clustering

ICONIP'11 Proceedings of the 18th international conference on Neural Information Processing - Volume Part II
Discovering non-taxonomic relations from the web

IDEAL'06 Proceedings of the 7th international conference on Intelligent Data Engineering and Automated Learning
Categorization of large text collections: feature selection for training neural networks

IDEAL'06 Proceedings of the 7th international conference on Intelligent Data Engineering and Automated Learning
An overall-regional competitive self-organizing map neural network for the Euclidean traveling salesman problem

Neurocomputing
Post-retrieval search hit clustering to improve information retrieval effectiveness: Two digital forensics case studies

Decision Support Systems
A permeable expert search strategy approach to multimodal retrieval

Proceedings of the 4th Information Interaction in Context Symposium
A three-phase method for patent classification

Information Processing and Management: an International Journal
Unsupervised object discovery via self-organisation

Pattern Recognition Letters
Supporting information management in digital libraries with map-based interfaces

ECDL'07 Proceedings of the 11th European conference on Research and Advanced Technology for Digital Libraries
High-performance dynamic quantum clustering on graphics processors

Journal of Computational Physics
Essentials of the self-organizing map

Neural Networks
Accelerating text mining workloads in a MapReduce-based distributed GPU environment

Journal of Parallel and Distributed Computing
A novel self-adaptive clustering algorithm for dynamic data

ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part III
Theory-Informed Design and Evaluation of an Advanced Search and Knowledge Mapping System in Nanotechnology

Journal of Management Information Systems
Probability-based text clustering algorithm by alternately repeating two operations

Journal of Information Science
Knowledge discovery in inspection reports of marine structures

Expert Systems with Applications: An International Journal
Ambient intelligence for quality of life assessment

Journal of Ambient Intelligence and Smart Environments - Ambient and Smart Component Technologies for Human Centric Computing

Abstract

We describe the implementation of a system that organizes vast document collections according to textual similarities. The system is based on the self-organizing map (SOM) algorithm. Statistical representations of the documents' vocabularies serve as their feature vectors. The main goal of our work has been to scale up the SOM algorithm so that it can handle large amounts of high-dimensional data. In a practical experiment we mapped 6,840,568 patent abstracts onto a 1,002,240-node SOM. As feature vectors we used 500-dimensional vectors of stochastic figures obtained as random projections of weighted word histograms.
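
The feature-vector step mentioned in the abstract (a weighted word histogram reduced by random projection, then matched to the nearest SOM node) can be illustrated roughly as follows. This is a minimal sketch, not the authors' implementation: the Gaussian projection matrix, the function names (`random_projection_matrix`, `project`, `best_matching_unit`), the toy dimensions, and the use of squared Euclidean distance are all assumptions made for illustration. The abstract only states that 500-dimensional random projections of weighted word histograms were used.

```python
import math
import random


def random_projection_matrix(vocab_size, out_dim, seed=0):
    """Return one random unit-length column per vocabulary word.

    Assumption: Gaussian entries normalized to unit length. The abstract
    does not say how the random matrix was constructed.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(vocab_size):
        col = [rng.gauss(0.0, 1.0) for _ in range(out_dim)]
        norm = math.sqrt(sum(v * v for v in col))
        columns.append([v / norm for v in col])
    return columns


def project(histogram, columns):
    """Project a sparse weighted word histogram to a low-dimensional vector.

    histogram maps a word index to its weighted count; only nonzero
    entries are visited, so the cost scales with document length.
    """
    out_dim = len(columns[0])
    x = [0.0] * out_dim
    for word_index, weight in histogram.items():
        col = columns[word_index]
        for k in range(out_dim):
            x[k] += weight * col[k]
    return x


def best_matching_unit(x, codebook):
    """Return the index of the model vector closest to x.

    Assumption: squared Euclidean distance, the usual SOM matching rule.
    """
    def sq_dist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))

    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


if __name__ == "__main__":
    # Toy sizes; the paper reports 500-dimensional projected vectors.
    cols = random_projection_matrix(vocab_size=1000, out_dim=50, seed=42)
    doc = {3: 1.5, 17: 0.7, 512: 2.0}  # hypothetical weighted histogram
    x = project(doc, cols)
    # Hypothetical four-node "map" whose model vectors are projected documents.
    codebook = [project({i: 1.0}, cols) for i in (3, 17, 512, 900)]
    print(len(x), best_matching_unit(x, codebook))
```

Because the projection is linear, it can be accumulated word by word without ever forming the full high-dimensional histogram, which is one reason the approach suits very large collections.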