In many applications of natural language processing (NLP) it is necessary to determine the likelihood of a given word combination. For example, a speech recognizer may need to determine which of the two word combinations “eat a peach” and “eat a beach” is more likely. Statistical NLP methods determine the likelihood of a word combination from its frequency in a training corpus. However, the nature of language is such that many word combinations are infrequent and do not occur in any given corpus. In this work we propose a method for estimating the probability of such previously unseen word combinations using available information on “most similar” words.

We describe probabilistic word association models based on distributional word similarity, and apply them to two tasks, language modeling and pseudo-word disambiguation. In the language modeling task, a similarity-based model is used to improve probability estimates for unseen bigrams in a back-off language model. The similarity-based method yields a 20% perplexity improvement in the prediction of unseen bigrams and statistically significant reductions in speech-recognition error.

We also compare four similarity-based estimation methods against back-off and maximum-likelihood estimation methods on a pseudo-word sense disambiguation task in which we controlled for both unigram and bigram frequency to avoid giving too much weight to easy-to-disambiguate high-frequency configurations. The similarity-based methods perform up to 40% better on this particular task.
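The general idea of estimating an unseen word combination from “most similar” words can be illustrated with a minimal Python sketch. It is not the model evaluated in the paper: the abstract does not say which similarity measure or weighting scheme is used, so the Jensen-Shannon divergence, the exponential weighting, the parameters `k` and `beta`, the function names, and the toy verb-object pairs are all assumptions made for illustration. The sketch also omits the back-off component.

```python
from collections import Counter, defaultdict
import math


def conditional_distributions(pairs):
    """Maximum-likelihood estimates of P(w2 | w1) from (w1, w2) pairs."""
    counts = defaultdict(Counter)
    for w1, w2 in pairs:
        counts[w1][w2] += 1
    return {
        w1: {w2: c / sum(ctr.values()) for w2, c in ctr.items()}
        for w1, ctr in counts.items()
    }


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) of two sparse distributions."""
    total = 0.0
    for w in set(p) | set(q):
        pw, qw = p.get(w, 0.0), q.get(w, 0.0)
        m = 0.5 * (pw + qw)
        if pw > 0:
            total += 0.5 * pw * math.log(pw / m)
        if qw > 0:
            total += 0.5 * qw * math.log(qw / m)
    return total


def similarity_estimate(w1, w2, cond, k=2, beta=5.0):
    """Weighted average of P(w2 | w1') over the k words w1' closest to w1.

    The weight exp(-beta * divergence) is an assumed choice, not taken
    from the source.
    """
    neighbours = sorted(
        (js_divergence(cond[w1], cond[other]), other)
        for other in cond
        if other != w1
    )[:k]
    weights = {other: math.exp(-beta * d) for d, other in neighbours}
    norm = sum(weights.values())
    return sum(wt * cond[other].get(w2, 0.0) for other, wt in weights.items()) / norm


# Toy (verb, object) data, invented for this example: "eat" never occurs
# with "peach" or "beach", so the maximum-likelihood estimate is zero for both.
pairs = [
    ("eat", "apple"), ("eat", "bread"),
    ("devour", "apple"), ("devour", "bread"), ("devour", "peach"),
    ("visit", "beach"), ("visit", "city"),
]
cond = conditional_distributions(pairs)
p_peach = similarity_estimate("eat", "peach", cond)
p_beach = similarity_estimate("eat", "beach", cond)
print(p_peach > p_beach)
```

On this toy data the similarity-based estimate prefers “peach” over “beach” after “eat”, because “devour” shares more objects with “eat” than “visit” does and therefore receives the larger weight.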