We propose a new document vector representation specifically designed for the document clustering task. Instead of the traditional term-based vectors, a document is represented as an n-dimensional vector, where n is the number of documents in the cluster. The value at each dimension of the vector is closely related to the generation probability based on the language model of the corresponding document. Inspired by recent graph-based NLP methods, we reinforce the generation probabilities by iterating random walks on the underlying graph representation. Experiments with k-means and hierarchical clustering algorithms show significant improvements over the alternative tf·idf vector representation.
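The representation described above can be illustrated with a minimal sketch. The abstract does not specify the smoothing method, the normalization, or the number of walk steps, so everything concrete here is an assumption made for illustration: linear-interpolation smoothing with a hypothetical weight `lam`, per-token geometric-mean normalization of the generation probability, and a fixed-length walk computed as a matrix power. The function names `generation_matrix` and `random_walk` are hypothetical, and the code assumes every document is non-empty.

```python
import math
from collections import Counter


def generation_matrix(docs, lam=0.5):
    """Build an n x n row-stochastic matrix for n non-empty documents.

    Entry [i][j] is the length-normalized probability that the smoothed
    unigram language model of document j generates document i, with each
    row rescaled to sum to 1. `lam` is an assumed interpolation weight.
    """
    tokenized = [d.lower().split() for d in docs]
    counts = [Counter(tokens) for tokens in tokenized]
    corpus = Counter()
    for c in counts:
        corpus.update(c)
    corpus_len = sum(corpus.values())

    matrix = []
    for tokens in tokenized:
        row = []
        for model in counts:
            model_len = sum(model.values())
            log_p = 0.0
            for w in tokens:
                p_doc = model[w] / model_len    # 0 if w is absent from the model
                p_bg = corpus[w] / corpus_len   # background (corpus) estimate, > 0
                log_p += math.log(lam * p_doc + (1 - lam) * p_bg)
            # Geometric mean per token, so long documents are not penalized.
            row.append(math.exp(log_p / len(tokens)))
        total = sum(row)
        matrix.append([v / total for v in row])
    return matrix


def random_walk(matrix, steps):
    """Return the `steps`-th power of a row-stochastic matrix.

    Row i is the distribution over documents reached after a `steps`-step
    random walk starting from document i.
    """
    n = len(matrix)
    result = [row[:] for row in matrix]
    for _ in range(steps - 1):
        result = [
            [sum(result[i][k] * matrix[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return result


if __name__ == "__main__":
    docs = [
        "apple banana apple",
        "banana apple fruit",
        "quantum walk graph",
    ]
    vectors = random_walk(generation_matrix(docs), steps=2)
    for vec in vectors:
        print([round(v, 3) for v in vec])
```

Each row of the result is an n-dimensional document vector that could be passed to k-means or a hierarchical clustering algorithm in place of a tf·idf vector. The walk length and `lam` are tunable in this sketch and are not values taken from the paper.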