Many NLP applications require classifying texts by their semantic distance, that is, by how similar or different the texts are. For example, comparing a new document with documents of known topics can help identify the topic of the new document. Typically, a distributional distance is used to capture the implicit semantic distance between two pieces of text. However, such approaches do not take into account the semantic relations between words. In this article, we introduce an alternative method of measuring the semantic distance between texts that integrates distributional information and ontological knowledge within a network flow formalism. We first represent each text as a collection of frequency-weighted concepts within an ontology. We then use a network flow method that provides an efficient way of explicitly measuring the frequency-weighted ontological distance between the concepts across two texts. We evaluate our method on three NLP tasks and find that it performs well on two of them. We develop a new measure of semantic coherence that enables us to account for the performance difference across the three data sets, shedding light on the properties of a data set that lends itself well to our method.
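The idea of comparing two frequency-weighted concept profiles through a network flow can be illustrated with a small sketch. It is not the formulation used in the article, whose details the abstract does not give. It assumes a toy ontology with unit-length edges, integer concept counts with equal totals in both texts, and a plain successive-shortest-path solver. All names (`min_cost_flow`, `text_distance`, `ONTOLOGY`, `TEXT_A`, and so on) are hypothetical.

```python
import math


def min_cost_flow(num_nodes, edges, supply):
    """Minimum total cost to route all supply to all demand.

    edges:  list of (u, v, capacity, cost) directed edges.
    supply: per-node integers; positive = surplus, negative = deficit.
    Solved by successive shortest paths with Bellman-Ford (toy scale only).
    """
    source, sink = num_nodes, num_nodes + 1
    graph = [[] for _ in range(num_nodes + 2)]

    def add_edge(u, v, cap, cost):
        # Each entry is [target, residual capacity, cost, index of reverse edge].
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for u, v, cap, cost in edges:
        add_edge(u, v, cap, cost)
    remaining = 0
    for node, amount in enumerate(supply):
        if amount > 0:
            add_edge(source, node, amount, 0)
            remaining += amount
        elif amount < 0:
            add_edge(node, sink, -amount, 0)

    total_cost = 0
    while remaining > 0:
        # Cheapest source-to-sink path in the residual graph.
        dist = [math.inf] * len(graph)
        prev = [None] * len(graph)
        dist[source] = 0
        for _ in range(len(graph)):
            updated = False
            for u in range(len(graph)):
                if dist[u] == math.inf:
                    continue
                for i, (v, cap, cost, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, i)
                        updated = True
            if not updated:
                break
        if dist[sink] == math.inf:
            raise ValueError("demand cannot be met (disconnected ontology?)")

        # Find the bottleneck along the path, then push that much flow.
        push, v = remaining, sink
        while v != source:
            u, i = prev[v]
            push = min(push, graph[u][i][1])
            v = u
        v = sink
        while v != source:
            u, i = prev[v]
            graph[u][i][1] -= push
            graph[v][graph[u][i][3]][1] += push
            v = u
        total_cost += push * dist[sink]
        remaining -= push
    return total_cost


def text_distance(ontology_edges, profile_a, profile_b):
    """Cost of moving profile_a's concept mass onto profile_b along ontology edges."""
    if sum(profile_a.values()) != sum(profile_b.values()):
        raise ValueError("this sketch assumes equal total counts")
    concepts = sorted({c for edge in ontology_edges for c in edge})
    index = {c: i for i, c in enumerate(concepts)}
    total = sum(profile_a.values())
    # Undirected ontology link = two directed edges of unit cost (an assumption).
    edges = []
    for a, b in ontology_edges:
        edges.append((index[a], index[b], total, 1))
        edges.append((index[b], index[a], total, 1))
    supply = [profile_a.get(c, 0) - profile_b.get(c, 0) for c in concepts]
    return min_cost_flow(len(concepts), edges, supply)


# Hypothetical toy ontology and concept counts.
ONTOLOGY = [("entity", "animal"), ("entity", "artifact"),
            ("animal", "dog"), ("animal", "cat"),
            ("artifact", "car"), ("artifact", "truck")]
TEXT_A = {"dog": 2, "car": 1}
TEXT_B = {"cat": 2, "truck": 1}
TEXT_C = {"car": 2, "truck": 1}

print(text_distance(ONTOLOGY, TEXT_A, TEXT_B))  # 6
print(text_distance(ONTOLOGY, TEXT_A, TEXT_C))  # 8
```

In this toy setting, text A is closer to B (6) than to C (8) even though A and B share no concept, because "dog" and "cat" are near each other in the ontology. A purely distributional comparison of the raw counts would not capture that relation.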