In large collections of documents that are divided into predefined classes, the differences and similarities between those classes are of special interest. This paper presents an approach that automatically extracts terms from such document collections. The extracted terms describe which topics discriminate a single class from the others (discriminating terms) and which topics discriminate a subset of the classes from the remaining ones (overlap terms). The importance for real-world applications and the effectiveness of our approach are demonstrated by two practical examples. In the first application, our predefined classes correspond to different scientific conferences. By extracting terms from collections of papers published at these conferences, we automatically determine the topical differences and similarities of the conferences. In the second application, we extract terms from a collection of product reviews that show which features reviewers commented on. We obtain these terms by discriminating the product review class against a suitable counter-balance class. Finally, our method is evaluated by comparing it to alternative approaches.
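The abstract does not state how terms are scored, so the following is only an illustrative sketch of the distinction between discriminating and overlap terms, not the paper's method. It swaps in a simple document-frequency ratio test; the function names, the `ratio` threshold, the `floor` constant, and the toy documents are all assumptions made for this example.

```python
from collections import Counter


def doc_frequency(docs):
    """Fraction of documents (assumed non-empty list) containing each term."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc.lower().split()))
    return {term: n / len(docs) for term, n in counts.items()}


def split_terms(class_docs, ratio=2.0, floor=0.01):
    """Assign each term to the classes in which it is markedly more frequent.

    Classes are ranked by the term's document frequency. The term belongs to
    the smallest top group whose weakest member is at least `ratio` times as
    frequent as the strongest remaining class. `floor` (hypothetical) stands
    in for a frequency of zero in the remaining classes.
    Returns (discriminating, overlap): one class vs. a proper subset of classes.
    """
    freqs = {c: doc_frequency(docs) for c, docs in class_docs.items()}
    vocab = set().union(*freqs.values())
    discriminating, overlap = {}, {}
    for term in sorted(vocab):
        ranked = sorted(class_docs,
                        key=lambda c: freqs[c].get(term, 0.0),
                        reverse=True)
        for k in range(1, len(ranked)):
            weakest_in_group = freqs[ranked[k - 1]].get(term, 0.0)
            strongest_rest = max(freqs[ranked[k]].get(term, 0.0), floor)
            if weakest_in_group >= ratio * strongest_rest:
                if k == 1:
                    discriminating.setdefault(ranked[0], []).append(term)
                else:
                    overlap.setdefault(frozenset(ranked[:k]), []).append(term)
                break  # terms with no clear gap are treated as common
    return discriminating, overlap


# Toy stand-in for "papers grouped by conference" (invented data).
example = {
    "vis": ["graph layout rendering", "graph rendering pipeline"],
    "kdd": ["graph mining clustering", "clustering mining graph data"],
    "ir": ["retrieval ranking query", "retrieval query data"],
}
disc, over = split_terms(example)
print(disc)
print(over)
```

In this toy setup, a term such as "rendering" occurs only in one class and is reported as discriminating, while "graph" is shared by two of the three classes and is reported as an overlap term for that pair. A real system would need proper tokenization, stop-word handling, and a statistically grounded score instead of a fixed ratio.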