This paper presents a new document representation built from multiple vectorized features, including term frequency and term-connection frequency. Each document is represented by both an undirected graph and a directed graph. Terms and vectorized graph connectionists are then extracted from these graphs using several feature extraction methods. This hybrid feature representation reflects the underlying semantics more accurately than the term histograms in current use, which struggle to capture them, and it facilitates the matching of complex graphs. At the application level, we develop a document retrieval system based on the self-organizing map (SOM) to speed up the retrieval process. We perform extensive experimental verification, and the results suggest that the proposed method is computationally efficient and accurate for document retrieval.
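To make the two feature types concrete, the following is a minimal sketch rather than the paper's actual extraction method. It assumes that a "connection" is an edge between two consecutive terms in the text, that the directed graph keeps word order while the undirected graph ignores it, and that tokenization is a simple lowercase whitespace split. The function names (`tokenize`, `term_frequency`, `connection_frequency`, `to_vector`) are hypothetical and chosen for illustration only.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the paper's preprocessing is not specified here)."""
    return [w for w in text.lower().split() if w.isalpha()]


def term_frequency(tokens):
    """Count how often each term occurs in the document."""
    return Counter(tokens)


def connection_frequency(tokens, directed=True):
    """Count edges between consecutive terms.

    Assumption: an edge links each term to the term that follows it.
    Directed edges keep the order (a, b); undirected edges sort the pair
    so that (a, b) and (b, a) are counted together.
    """
    edges = Counter()
    for a, b in zip(tokens, tokens[1:]):
        if a == b:
            continue  # skip self-loops
        key = (a, b) if directed else tuple(sorted((a, b)))
        edges[key] += 1
    return edges


def to_vector(counts, vocabulary):
    """Project a count table onto a fixed, ordered vocabulary of terms or edges."""
    return [counts.get(key, 0) for key in vocabulary]


doc = "graph features improve retrieval and retrieval features improve ranking"
tokens = tokenize(doc)

tf = term_frequency(tokens)
directed_edges = connection_frequency(tokens, directed=True)
undirected_edges = connection_frequency(tokens, directed=False)

# Concatenate term and edge counts into one hybrid feature vector.
term_vocab = sorted(tf)
edge_vocab = sorted(directed_edges)
hybrid = to_vector(tf, term_vocab) + to_vector(directed_edges, edge_vocab)
```

In this sketch the directed graph distinguishes "retrieval and" from "and retrieval", while the undirected graph merges them into a single edge with a count of two. Vectors of this kind could then be fed to a SOM or any other vector-based retrieval index.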