Neurolinguistic approach to natural language processing with applications to medical text analysis

Authors:
Włodzisław Duch;Paweł Matykiewicz;John Pestian
Affiliations:
Department of Informatics, Nicolaus Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland and School of Computer Engineering, Nanyang Technological University, 639798 Singapore, Singapore;School of Computer Engineering, Nanyang Technological University, 639798 Singapore, Singapore and Department of Biomedical Informatics, Children's Hospital Research Foundation, Cincinnati, OH, USA;Department of Biomedical Informatics, Children's Hospital Research Foundation, Cincinnati, OH, USA
Venue:
Neural Networks
Year:
2008

Abstract

Understanding written or spoken language presumably involves spreading neural activation in the brain. This process may be approximated by spreading activation in semantic networks, providing enhanced representations that involve concepts not found directly in the text. The approximation of this process is of great practical and theoretical interest. Although activations of the neural circuits involved in the representation of words change rapidly in time, snapshots of these activations spreading through associative networks may be captured in a vector model. Concepts of similar type activate larger clusters of neurons, priming areas in the left and right hemispheres. Analysis of recent brain imaging experiments shows the importance of right-hemisphere non-verbal clusterization. Medical ontologies enable the development of a large-scale practical algorithm to re-create pathways of spreading neural activations. First, concepts of a specific semantic type are identified in the text, and then all related concepts of the same type are added to the text, providing expanded representations. To avoid rapid growth of the extended feature space, after each step only the most useful features, those that increase document clusterization, are retained. Short hospital discharge summaries are used to illustrate how this process works on real, very noisy data. Expanded texts show significantly improved clustering and may be classified with much higher accuracy. Although better approximations to the spreading of neural activations may be devised, the practical approach presented in this paper helps to discover pathways used by the brain to process specific concepts, and may be used in large-scale applications.
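
The expansion-and-selection loop described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the toy `ONTOLOGY` dictionary stands in for a medical ontology, the documents and labels are invented, and the `separation` score (mean same-label cosine similarity minus mean cross-label cosine similarity) is a simple stand-in for whatever clustering criterion the paper actually uses.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy ontology: concept -> (semantic type, related concepts).
ONTOLOGY = {
    "asthma": ("disease", ["bronchitis", "wheezing"]),
    "bronchitis": ("disease", ["asthma"]),
    "pneumonia": ("disease", ["bronchitis", "sepsis"]),
    "seizure": ("disease", ["epilepsy", "sepsis"]),
    "epilepsy": ("disease", ["seizure"]),
    "sepsis": ("disease", []),
    "wheezing": ("symptom", []),
}


def expand(tokens, sem_type):
    """One spreading step: add related concepts of the same semantic type."""
    added = set()
    for tok in tokens:
        entry = ONTOLOGY.get(tok)
        if entry and entry[0] == sem_type:
            for rel in entry[1]:
                if ONTOLOGY.get(rel, (None, []))[0] == sem_type:
                    added.add(rel)
    return set(tokens) | added


def cosine(a, b):
    """Cosine similarity between two binary feature sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def _mean(values):
    return sum(values) / len(values) if values else 0.0


def separation(docs, labels):
    """Assumed quality score: same-label minus cross-label mean similarity."""
    same, diff = [], []
    for i, j in combinations(range(len(docs)), 2):
        bucket = same if labels[i] == labels[j] else diff
        bucket.append(cosine(docs[i], docs[j]))
    return _mean(same) - _mean(diff)


def expand_and_select(docs, labels, sem_type):
    """Expand each document, then greedily keep features that raise the score."""
    base = [set(d) for d in docs]
    expanded = [expand(d, sem_type) for d in docs]
    # Candidate features are those the expansion added to at least one document.
    candidates = sorted(set().union(*(e - b for e, b in zip(expanded, base))))
    current, kept = base, []
    for feat in candidates:
        trial = [c | {feat} if feat in e else c for c, e in zip(current, expanded)]
        if separation(trial, labels) > separation(current, labels):
            current, kept = trial, kept + [feat]
    return current, kept


# Invented example data: four short "documents" with two topic labels.
docs = [["asthma", "cough"], ["pneumonia", "fever"],
        ["seizure", "fever"], ["epilepsy"]]
labels = ["resp", "resp", "neuro", "neuro"]
final_docs, kept = expand_and_select(docs, labels, "disease")
print(kept)
```

In this toy run the feature shared by documents with different labels ("sepsis") is rejected because it lowers the score, while "wheezing" is never proposed because its semantic type differs from the one being expanded. A real application would repeat the step for several semantic types and use a much larger ontology.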