We propose a new hierarchical generative model for textual data, in which words may be generated by topic-specific distributions at any level of the hierarchy. This model is naturally well suited to clustering documents in preset or automatically generated hierarchies, as well as to categorising new documents in an existing hierarchy. Training algorithms are derived for both cases and illustrated on real data by clustering news stories and categorising newsgroup messages. Finally, the generative model may be used to derive a Fisher kernel expressing similarity between documents.
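
To make the idea of words being generated at any level of a hierarchy concrete, the following is a minimal toy sketch, not the model or the training algorithms of the paper. It assumes a two-level tree, hand-picked word distributions, and a fixed weight for choosing a node on the path from a document's class up to the root; all names (`PARENT`, `WORD_DIST`, `path_to_root`, `sample_document`) and all numbers are hypothetical.

```python
import random

# Hypothetical toy hierarchy: each node maps to its parent (None for the root).
PARENT = {"root": None, "sports": "root", "politics": "root"}

# Hypothetical per-node word distributions P(word | node), chosen by hand.
WORD_DIST = {
    "root": {"the": 0.6, "said": 0.4},
    "sports": {"match": 0.7, "goal": 0.3},
    "politics": {"vote": 0.5, "senate": 0.5},
}


def path_to_root(node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def sample_document(leaf, n_words, level_weights, rng):
    """Draw n_words for a document assigned to class `leaf`.

    Each word first picks a node on the leaf-to-root path, using
    level_weights as an assumed P(node | leaf), and is then drawn
    from that node's word distribution.
    """
    path = path_to_root(leaf)
    words = []
    for _ in range(n_words):
        node = rng.choices(path, weights=[level_weights[n] for n in path])[0]
        dist = WORD_DIST[node]
        word = rng.choices(list(dist), weights=list(dist.values()))[0]
        words.append(word)
    return words


rng = random.Random(0)  # fixed seed so repeated runs give the same sample
doc = sample_document("sports", 10, {"sports": 0.5, "root": 0.5}, rng)
print(doc)
```

In this sketch a document in the "sports" class mixes general words from the root with class-specific words from its own node, which is the intuition behind letting any level of the hierarchy generate words. Estimating the distributions from data, as the abstract describes, is outside the scope of this example.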