Distributional clustering of English words

Authors:
Fernando Pereira;Naftali Tishby;Lillian Lee
Affiliations:
AT&T Bell Laboratories, Murray Hill, NJ;Hebrew University, Jerusalem, Israel;Cornell University, Ithaca, NY
Venue:
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Year:
1993

References: 5
Cited by: 302

Elements of information theory
A stochastic parts program and noun phrase parser for unrestricted text

ANLC '88 Proceedings of the second conference on Applied natural language processing
Contextual word similarity and estimation from sparse data

ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Noun classification from predicate-argument structures

ACL '90 Proceedings of the 28th annual meeting on Association for Computational Linguistics
Stochastic lexicalized tree-adjoining grammars

COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2

Training and scaling preference functions for disambiguation

Computational Linguistics
Automatic thesaurus construction using Bayesian networks

CIKM '95 Proceedings of the fourth international conference on Information and knowledge management
On-line learning of binary and n-ary relations over multi-dimensional clusters

COLT '95 Proceedings of the eighth annual conference on Computational learning theory
Improving statistical language model performance with automatically generated word hierarchies

Computational Linguistics
Distributional clustering of words for text classification

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Applications of linear algebra in information retrieval and hypertext analysis

PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Similarity-Based Models of Word Cooccurrence Probabilities

Machine Learning - Special issue on natural language learning
Towards automated synthesis of data mining programs

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Probabilistic latent semantic indexing

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Document clustering using word clusters via the information bottleneck method

SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Corpus-based learning of semantic relations by the ILP system, Asium

Learning language in logic
On feature distributional clustering for text categorization

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Inferring the environment in a text-to-scene conversion system

Proceedings of the 1st international conference on Knowledge capture
On the quantification of e-business capacity

Proceedings of the 3rd ACM conference on Electronic Commerce
DIRT - discovery of inference rules from text

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Mining the web for answers to natural language questions

Proceedings of the tenth international conference on Information and knowledge management
Unsupervised document classification using sequential information maximization

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Document clustering with cluster refinement and model selection capabilities

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
A New Nonparametric Pairwise Clustering Algorithm Based on Iterative Estimation of Distance Profiles

Machine Learning - Special issue: Unsupervised learning
Exploiting Hierarchy in Text Categorization

Information Retrieval
The Balancing Act, Judith L. Klavans and Philip Resnik

Journal of Logic, Language and Information
Unsupervised Learning by Probabilistic Latent Semantic Analysis

Machine Learning
Clustering based on conditional distributions in an auxiliary space

Neural Computation
Near-synonymy and lexical choice

Computational Linguistics
Class-based probability estimation using a semantic hierarchy

Computational Linguistics
Automatic labeling of semantic roles

Computational Linguistics
The disambiguation of nominalizations

Computational Linguistics
A Hierarchical Model for Clustering and Categorising Documents

Proceedings of the 24th BCS-IRSG European Colloquium on IR Research: Advances in Information Retrieval
Parametric Distributional Clustering for Image Segmentation

ECCV '02 Proceedings of the 7th European Conference on Computer Vision-Part III
Knowledge Acquisition of Predicate Argument Structures from Technical Texts Using Machine Learning: The System ASIUM

EKAW '99 Proceedings of the 11th European Workshop on Knowledge Acquisition, Modeling and Management
SVETLAN' or How to Classify Words Using Their Context

EKAW '00 Proceedings of the 12th European Workshop on Knowledge Acquisition, Modeling and Management
Selection Restrictions Acquisition from Corpora

EPIA '01 Proceedings of the 10th Portuguese Conference on Artificial Intelligence on Progress in Artificial Intelligence, Knowledge Extraction, Multi-agent Systems, Logic Programming and Constraint Solving
Assessment of Selection Restrictions Acquisition

SBIA '02 Proceedings of the 16th Brazilian Symposium on Artificial Intelligence: Advances in Artificial Intelligence
Clustering Gene Expression Data by Mutual Information with Gene Function

ICANN '01 Proceedings of the International Conference on Artificial Neural Networks
Clustering by Similarity in an Auxiliary Space

IDEAL '00 Proceedings of the Second International Conference on Intelligent Data Engineering and Automated Learning, Data Mining, Financial Engineering, and Intelligent Agents
Extraction of Word Senses from Human Factors in Knowledge Discovery

DS '02 Proceedings of the 5th International Conference on Discovery Science
Enhanced word clustering for hierarchical text classification

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining the peanut gallery: opinion extraction and semantic classification of product reviews

WWW '03 Proceedings of the 12th international conference on World Wide Web
Integrating contextual information to enhance SOM-based text document clustering

Neural Networks - New developments in self-organizing maps
Identifying semantic relations in text

Exploring artificial intelligence in the new millennium
Building a web thesaurus from web link structure

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Coupled clustering: a method for detecting structural correspondence

The Journal of Machine Learning Research
A neural probabilistic language model

The Journal of Machine Learning Research
An introduction to variable and feature selection

The Journal of Machine Learning Research
Distributional word clusters vs. words for text categorization

The Journal of Machine Learning Research
A divisive information theoretic feature clustering algorithm for text classification

The Journal of Machine Learning Research
Semantic Log Analysis Based on a User Query Behavior Model

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Latent semantic models for collaborative filtering

ACM Transactions on Information Systems (TOIS)
Task adaptation in stochastic language model for Chinese homophone disambiguation

ACM Transactions on Asian Language Information Processing (TALIP)
Using the web to obtain frequencies for unseen bigrams

Computational Linguistics - Special issue on web as corpus
Learning ontologies from natural language texts

International Journal of Human-Computer Studies
Learning methods to combine linguistic indicators: improving aspectual classification and revealing linguistic insights

Computational Linguistics
Introduction to the special issue on word sense disambiguation: the state of the art

Computational Linguistics - Special issue on word sense disambiguation
Similarity-based word sense disambiguation

Computational Linguistics - Special issue on word sense disambiguation
Automatic word sense discrimination

Computational Linguistics - Special issue on word sense disambiguation
Generalizing case frames using a thesaurus and the MDL principle

Computational Linguistics
Word clustering and disambiguation based on co-occurrence data

Natural Language Engineering
Discovery of inference rules for question-answering

Natural Language Engineering
Verb sense disambiguation based on dual distributional similarity

Natural Language Engineering
Topic-based mixture language modelling

Natural Language Engineering
Finding a domain-appropriate sense inventory for semantically tagging a corpus

Natural Language Engineering
Unsupervised discovery of scenario-level patterns for Information Extraction

ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Improving language models by clustering training sentences

ANLC '94 Proceedings of the fourth conference on Applied natural language processing
Automatic selection of class labels from a thesaurus for an effective semantic tagging of corpora

ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Collocation map for overcoming data sparseness

EACL '95 Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics
Grouping words using statistical context

EACL '95 Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics
Automatic verb classification using distributions of grammatical features

EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
Document classification using a finite mixture model

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Similarity-based methods for word sense disambiguation

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Predicting the semantic orientation of adjectives

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Cross-language headline generation for Hindi

ACM Transactions on Asian Language Information Processing (TALIP)
Word clustering and disambiguation based on co-occurrence data

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
Automatic retrieval and clustering of similar words

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
Contextual word similarity and estimation from sparse data

ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Towards the automatic identification of adjectival scales: clustering adjectives according to meaning

ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Statistical sense disambiguation with relatively small corpora using dictionary definitions

ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Similarity-based estimation of word cooccurrence probabilities

ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
A hierarchical monothetic document clustering algorithm for summarization and browsing search results

Proceedings of the 13th international conference on World Wide Web
Principle of Learning Metrics for Exploratory Data Analysis

Journal of VLSI Signal Processing Systems
Learning word clusters from data types

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
Generalizing automatically generated selectional patterns

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
Co-occurrence vectors from corpora vs. distance vectors from dictionaries

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
A "not-so-shallow" parser for collocational analysis

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
Clustering words with the MDL principle

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Experiments in automated lexicon building for text searching

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Redefining similarity in a thesaurus by using corpora

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
Two supervised learning approaches for name disambiguation in author citations

Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries
A probabilistic framework for semi-supervised clustering

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
The state of the art in ontology learning: a framework for comparison

The Knowledge Engineering Review
A practical web-based approach to generating topic hierarchy for text segments

Proceedings of the thirteenth ACM international conference on Information and knowledge management
Distributional term representations: an experimental comparison

Proceedings of the thirteenth ACM international conference on Information and knowledge management
Distributional similarity models: clustering vs. nearest neighbors

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Inducing a semantically annotated lexicon via EM-based clustering

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Corpus-based linguistic indicators for aspectual classification

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Automatic construction of a hypernym-labeled noun hierarchy from text

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Ordering among premodifiers

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Word classification and hierarchy using co-occurrence word information

Information Processing and Management: an International Journal
Learning Hidden Variable Networks: The Information Bottleneck Approach

The Journal of Machine Learning Research
Name disambiguation in author citations using a K-way spectral clustering method

Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Rule-based word clustering for document metadata extraction

Proceedings of the 2005 ACM symposium on Applied computing
Clustering adjectives for class acquisition

EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2
Evaluating and combining approaches to selectional preference acquisition

EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
Essential Latent Knowledge for Protein-Protein Interactions: Analysis by an Unsupervised Learning Approach

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Taxonomy learning: factoring the structure of a taxonomy into a semantic classification decision

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Probabilistic models of verb-argument structure

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Extracting paraphrases from a parallel corpus

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Evaluating smoothing algorithms against plausibility judgements

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Building semantic perceptron net for topic spotting

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Exploring asymmetric clustering for statistical language modeling

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Scaling context space

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Class-based probability estimation using a semantic hierarchy

NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Learning to paraphrase: an unsupervised approach using multiple-sequence alignment

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Word sense acquisition from bilingual comparable corpora

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
The order of prenominal adjectives in natural language generation

ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Taxonomy generation for text segments: A practical web-based approach

ACM Transactions on Information Systems (TOIS)
Multi-way distributional clustering via pairwise interactions

ICML '05 Proceedings of the 22nd international conference on Machine learning
Co-occurrence Retrieval: A Flexible Framework for Lexical Distributional Similarity

Computational Linguistics
Estimating satisfactoriness of selectional restriction from corpus without a thesaurus

ACM Transactions on Asian Language Information Processing (TALIP)
Enhancing the control and efficiency of the covering process [logic verification]

HLDVT '03 Proceedings of the Eighth IEEE International Workshop on High-Level Design Validation and Test Workshop
Experiments on unsupervised learning for extracting relevant fragments from spoken dialog corpus

CoNLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Variant transduction: a method for rapid development of interactive spoken interfaces

SIGDIAL '01 Proceedings of the Second SIGdial Workshop on Discourse and Dialogue - Volume 16
A transformational-based learner for dependency grammars in discharge summaries

BioMed '02 Proceedings of the ACL-02 workshop on Natural language processing in the biomedical domain - Volume 3
Image Segmentation by Networks of Spiking Neurons

Neural Computation
Using co-composition for acquiring syntactic and semantic subcategorisation

ULA '02 Proceedings of the ACL-02 workshop on Unsupervised lexical acquisition - Volume 9
Improvements in automatic thesaurus extraction

ULA '02 Proceedings of the ACL-02 workshop on Unsupervised lexical acquisition - Volume 9
Speech translation performance of statistical dependency transduction and semantic similarity transduction

S2S '02 Proceedings of the ACL-02 workshop on Speech-to-speech translation: algorithms and systems - Volume 7
Ensemble methods for automatic thesaurus extraction

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Fine-grained proper noun ontologies for question answering

SEMANET '02 Proceedings of the 2002 workshop on Building and using semantic networks - Volume 11
Cross-dataset clustering: revealing corresponding themes across multiple corpora

COLING-02 Proceedings of the 6th conference on Natural language learning - Volume 20
Extracting structural paraphrases from aligned monolingual corpora

PARAPHRASE '03 Proceedings of the second international workshop on Paraphrasing - Volume 16
Identifying events using similarity and context

CoNLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
A general framework for distributional similarity

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Towards supporting on-demand virtual remodularization using program graphs

Proceedings of the 5th international conference on Aspect-oriented software development
Exploring social annotations for the semantic web

Proceedings of the 15th international conference on World Wide Web
Learning accurate and concise naïve Bayes classifiers from attribute value taxonomies and data

Knowledge and Information Systems
A scaleable document clustering approach for large document corpora

Information Processing and Management: an International Journal
Multivariate information bottleneck

Neural Computation
Experiments on the Automatic Induction of German Semantic Verb Classes

Computational Linguistics
Annealing techniques for unsupervised statistical language learning

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Determining the specificity of terms using compositional and contextual information

ACLstudent '04 Proceedings of the ACL 2004 workshop on Student research
The distributional inclusion hypotheses and lexical entailment

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Efficient unsupervised discovery of word categories using symmetric patterns and high frequency words

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Names and similarities on the web: fact extraction in the fast lane

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
A bootstrapping approach to unsupervised detection of cue phrase variants

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Feature vector quality and distributional similarity

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Term aggregation: mining synonymous expressions using personal stylistic variations

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Hidden-variable models for discriminative reranking

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Identifying semantic relations and functional properties of human verb associations

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
A generalized framework for revealing analogous themes across related topics

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Taxonomic knowledge structure discovery from imagery-based data using the neural associative incremental learning (NAIL) algorithm

Information Fusion
On anonymizing query logs via token-based hashing

Proceedings of the 16th international conference on World Wide Web
Ontology learning: state of the art and open issues

Information Technology and Management
An algorithm for unsupervised topic discovery from broadcast news stories

HLT '02 Proceedings of the second international conference on Human Language Technology Research
ProbMap -- A probabilistic approach for mapping large document collections

Intelligent Data Analysis
Clustering for metric and non-metric distance measures

Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
A stopping criterion for active learning

Computer Speech and Language
Finding translations for low-frequency words in comparable corpora

Machine Translation
Active learning and logarithmic opinion pools for HPSG parse selection

Natural Language Engineering
Modeling online reviews with multi-grain topic models

Proceedings of the 17th international conference on World Wide Web
Extracting and ranking viral communities using seeds and content similarity

Proceedings of the nineteenth ACM conference on Hypertext and hypermedia
Fast nearest neighbor retrieval for Bregman divergences

Proceedings of the 25th international conference on Machine learning
Learning query intent from regularized click graphs

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Methods for extracting and classifying pairs of cognates and false friends

Machine Translation
Learning decision trees with taxonomy of propositionalized attributes

Pattern Recognition
c-Means Clustering on the Multinomial Manifold

MDAI '07 Proceedings of the 4th international conference on Modeling Decisions for Artificial Intelligence
From Anomaly Reports to Cases

ICCBR '07 Proceedings of the 7th international conference on Case-Based Reasoning: Case-Based Reasoning Research and Development
Kernel-Based Grouping of Histogram Data

ECML '07 Proceedings of the 18th European conference on Machine Learning
Learning and Generalization with the Information Bottleneck

ALT '08 Proceedings of the 19th international conference on Algorithmic Learning Theory
Automatic acquisition for sensibility knowledge using co-occurrence relation

International Journal of Computer Applications in Technology
The effect of borderline examples on language learning

Journal of Experimental & Theoretical Artificial Intelligence
Clusters, language models, and ad hoc information retrieval

ACM Transactions on Information Systems (TOIS)
Methodological Review: Empirical distributional semantics: Methods and biomedical applications

Journal of Biomedical Informatics
Semantic Clustering for a Functional Text Classification Task

CICLing '09 Proceedings of the 10th International Conference on Computational Linguistics and Intelligent Text Processing
Whole-genome prokaryotic clustering based on gene lengths

Discrete Applied Mathematics
A survey on sentiment detection of reviews

Expert Systems with Applications: An International Journal
Clustering Hungarian verbs on the basis of complementation patterns

ACL '07 Proceedings of the 45th Annual Meeting of the ACL: Student Research Workshop
Information Extraction and Semantic Annotation of Wikipedia

Proceedings of the 2008 conference on Ontology Learning and Population: Bridging the Gap between Text and Knowledge
Named entity recognition in biomedical texts using an HMM model

JNLPBA '04 Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications
Improved large margin dependency parsing via local constraints and Laplacian regularization

CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
Can human verb associations help identify salient features for semantic verb classification?

CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
Superior and efficient fully unsupervised pattern-based concept acquisition using an unsupervised parser

CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Using hidden Markov random fields to combine distributional and pattern-based word clustering

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Translation and extension of concepts across languages

EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Domain adaptation with structural correspondence learning

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Better informed training of latent syntactic features

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Graph-based word clustering using a web search engine

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Disambiguating Tags in Blogs

TSD '09 Proceedings of the 12th International Conference on Text, Speech and Dialogue
Change (Detection) You Can Believe in: Finding Distributional Shifts in Data Streams

IDA '09 Proceedings of the 8th International Symposium on Intelligent Data Analysis: Advances in Intelligent Data Analysis VIII
Minimum Free Energy Principle for Constraint-Based Learning Bayesian Networks

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Bootstrapping distributional feature vector quality

Computational Linguistics
Hierarchical Dirichlet trees for information retrieval

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Unsupervised concept discovery in Hebrew using simple unsupervised word prefix segmentation for Hebrew and Arabic

Semitic '09 Proceedings of the EACL 2009 Workshop on Computational Approaches to Semitic Languages
Learning concept hierarchies from text corpora using formal concept analysis

Journal of Artificial Intelligence Research
Unsupervised methods for determining object and relation synonyms on the web

Journal of Artificial Intelligence Research
The cluster-abstraction model: unsupervised learning of topic hierarchies from text data

IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
Automatic fine-grained semantic classification for domain adaptation

STEP '08 Proceedings of the 2008 Conference on Semantics in Text Processing
Combining open-source with research to re-engineer a hands-on introductory NLP course

TeachCL '08 Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics
Affinity measures based on the graph Laplacian

TextGraphs-3 Proceedings of the 3rd Textgraphs Workshop on Graph-Based Algorithms for Natural Language Processing
Clustering heterogeneous data using clustering by compression

ICCOMP'09 Proceedings of the WSEAS 13th international conference on Computers
Paraphrastic grammars

TextMean '04 Proceedings of the 2nd Workshop on Text Meaning and Interpretation
Identifying synonyms among distributionally similar words

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
A concept-relationship acquisition and inference approach for hierarchical taxonomy construction from tags

Information Processing and Management: an International Journal
Cross-lingual predicate cluster acquisition to improve bilingual event extraction by inductive learning

UMSLLS '09 Proceedings of the Workshop on Unsupervised and Minimally Supervised Learning of Lexical Semantics
A two-stage method for active learning of statistical grammars

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Automatic thesaurus construction based on grammatical relations

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Context sensitive synonym discovery for web search queries

Proceedings of the 18th ACM conference on Information and knowledge management
Strictly lexical dependency parsing

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Similarity search on Bregman divergence: towards non-metric indexing

Proceedings of the VLDB Endowment
A metric-based framework for automatic taxonomy induction

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Efficient Text Classification Using Term Projection

AIRS '09 Proceedings of the 5th Asia Information Retrieval Symposium on Information Retrieval Technology
Enhancement of lexical concepts using cross-lingual web mining

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
A Kernel-based feature weighting for text classification

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
A non-negative tensor factorization model for selectional preference induction

GEMS '09 Proceedings of the Workshop on Geometrical Models of Natural Language Semantics
Classifying Japanese polysemous verbs based on fuzzy C-means clustering

TextGraphs-4 Proceedings of the 2009 Workshop on Graph-based Methods for Natural Language Processing
A new method for clustering heterogeneous data: clustering by compression

WSEAS Transactions on Computers
Query reformulation using anchor text

Proceedings of the third ACM international conference on Web search and data mining
A graph-theoretic framework for semantic distance

Computational Linguistics
Finding Related Search Engine Queries by Web Community Based Query Enrichment

World Wide Web
Discriminative clustering

Neurocomputing
Supervised feature selection by clustering using conditional mutual information-based distances

Pattern Recognition
Neural based approach to keyword extraction from documents

ICCSA'03 Proceedings of the 2003 international conference on Computational science and its applications: PartI
Building clusters of related words: an unsupervised approach

PRICAI'06 Proceedings of the 9th Pacific Rim international conference on Artificial intelligence
Automatic taxonomy generation: issues and possibilities

IFSA'03 Proceedings of the 10th international fuzzy systems association World Congress conference on Fuzzy sets and systems
Selection restrictions acquisition for parsing improvement

INAP'01 Proceedings of the Applications of prolog 14th international conference on Web knowledge management and decision support
Risk context effects in inductive reasoning: an experimental and computational modeling study

CONTEXT'07 Proceedings of the 6th international and interdisciplinary conference on Modeling and using context
Inducing classes of terms from text

TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
Learning with click graph for query intent classification

ACM Transactions on Information Systems (TOIS)
A clustering scheme for large high-dimensional document datasets

ISICA'07 Proceedings of the 2nd international conference on Advances in computation and intelligence
An unsupervised model for exploring hierarchical semantics from social annotations

ISWC'07/ASWC'07 Proceedings of the 6th international The semantic web and 2nd Asian conference on Asian semantic web conference
Construction of a probabilistic hierarchical structure based on a Japanese corpus and a Japanese thesaurus

LKR'08 Proceedings of the 3rd international conference on Large-scale knowledge resources: construction and application
A computational model of risk-context-dependent inductive reasoning based on a support vector machine

LKR'08 Proceedings of the 3rd international conference on Large-scale knowledge resources: construction and application
Learning and generalization with the information bottleneck

Theoretical Computer Science
Clustering for metric and nonmetric distance measures

ACM Transactions on Algorithms (TALG)
Universal multi-dimensional scaling

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
A composite kernel for named entity recognition

Pattern Recognition Letters
Multi-prototype vector-space models of word meaning

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Contextual information improves OOV detection in speech

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Word representations: a simple and general method for semi-supervised learning

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Improving the use of pseudo-words for evaluating selectional preferences

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
HMMs, GRs, and n-grams as lexical substitution techniques: are they portable to other languages?

MCTLLL '09 Proceedings of the Workshop on Natural Language Processing Methods and Corpora in Translation, Lexicography, and Language Learning
Long distance bigram models applied to word clustering

Pattern Recognition
From frequency to meaning: vector space models of semantics

Journal of Artificial Intelligence Research
Paraphrasing invariance coefficient: measuring para-query invariance of search engines

Proceedings of the 3rd International Semantic Search Workshop
Statistical parsing with a context-free grammar and word statistics

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
A mixture model with sharing for lexical semantics

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Metaphor identification using verb and noun clustering

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Grouping product features using semi-supervised learning with soft-constraints

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
A novel metrics based on information bottleneck principle for face retrieval

PCM'10 Proceedings of the 11th Pacific Rim conference on Advances in multimedia information processing: Part I
Cause identification from aviation safety incident reports via weakly supervised semantic lexicon construction

Journal of Artificial Intelligence Research
Learning word meanings by instruction

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
Distributional lexical semantics: Toward uniform representation paradigms for advanced acquisition and processing tasks

Natural Language Engineering
A non-negative tensor factorization model for selectional preference induction

Natural Language Engineering
Clustering product features for opinion mining

Proceedings of the fourth ACM international conference on Web search and data mining
Search with synonyms: problems and solutions

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
A flexible, corpus-driven model of regular and inverse selectional preferences

Computational Linguistics
Cluster based symbolic representation and feature selection for text classification

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications - Volume Part II
Generating phrasal and sentential paraphrases: A survey of data-driven methods

Computational Linguistics
Dissimilarity based feature selection for text classification: a cluster based approach

Proceedings of the International Conference & Workshop on Emerging Trends in Technology
Polysemous verb classification using subcategorization acquisition and graph-based clustering

LTC'09 Proceedings of the 4th conference on Human language technology: challenges for computer science and linguistics
Propositionalized attribute taxonomies from data for data-driven construction of concise classifiers

Expert Systems with Applications: An International Journal
Language models as representations for weakly-supervised NLP tasks

CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
A graph model for mutual information based clustering

Journal of Intelligent Information Systems
Efficient food retrieval techniques considering relative frequencies of food related words

ICHIT'11 Proceedings of the 5th international conference on Convergence and hybrid information technology
"Nut case: what does it mean?": understanding semantic relationship between nouns in noun compounds through paraphrasing and ranking the paraphrases

Proceedings of the 1st international workshop on Search and mining entity-relationship data
Probabilistic latent semantic analysis

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Multivariate information bottleneck

UAI'01 Proceedings of the Seventeenth conference on Uncertainty in artificial intelligence
A comparative study on chinese word clustering

ICCPOL'06 Proceedings of the 21st international conference on Computer Processing of Oriental Languages: beyond the orient: the research challenges ahead
Distribution based stemmer refinement

PReMI'05 Proceedings of the First international conference on Pattern Recognition and Machine Intelligence
A nearest-neighbor method for resolving PP-Attachment ambiguity

IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
Detection of incorrect case assignments in paraphrase generation

IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
Paraphrastic sentence compression with a character-based metric: tightening without deletion

MTTG '11 Proceedings of the Workshop on Monolingual Text-To-Text Generation
A comparative study on feature reduction approaches in Hindi and Bengali named entity recognition

Knowledge-Based Systems
Traffic models for community-based ranking and navigation

WINE'05 Proceedings of the First international conference on Internet and Network Economics
Making senses: bootstrapping sense-tagged lists of semantically-related words

CICLing'06 Proceedings of the 7th international conference on Computational Linguistics and Intelligent Text Processing
A neural network model of metaphor understanding with dynamic interaction based on a statistical language analysis

ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
Extracting semantic taxonomies of nouns from a korean MRD using a small bootstrapping thesaurus and a machine learning approach

NLDB'05 Proceedings of the 10th international conference on Natural Language Processing and Information Systems
Improving text categorization using domain knowledge

NLDB'05 Proceedings of the 10th international conference on Natural Language Processing and Information Systems
A supervised clustering method for text classification

CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing
Multinomial event model based abstraction for sequence and text classification

SARA'05 Proceedings of the 6th international conference on Abstraction, Reformulation and Approximation
Probabilistic models of similarity in syntactic context

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Unsupervised dependency parsing without gold part-of-speech tags

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Acquiring synonyms from monolingual comparable texts

IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing
Finding taxonomical relation from an MRD for thesaurus extension

IJCNLP'05 Proceedings of the Second international joint conference on Natural Language Processing
A collaborative filtering similarity measure based on singularities

Information Processing and Management: an International Journal
A web-based novel term similarity framework for ontology learning

ODBASE'06/OTM'06 Proceedings of the 2006 Confederated international conference on On the Move to Meaningful Internet Systems: CoopIS, DOA, GADA, and ODBASE - Volume Part I
Automatic word clustering for text categorization using global information

AIRS'04 Proceedings of the 2004 international conference on Asian Information Retrieval Technology
Unsupervised feature selection for text data

ECCBR'06 Proceedings of the 8th European conference on Advances in Case-Based Reasoning
Unsupervised word categorization using self-organizing maps and automatically extracted morphs

IDEAL'06 Proceedings of the 7th international conference on Intelligent Data Engineering and Automated Learning
Discrete component analysis

SLSFS'05 Proceedings of the 2005 international conference on Subspace, Latent Structure and Feature Selection
Computational models of inductive reasoning using a statistical analysis of a Japanese corpus

Cognitive Systems Research
Learning semantics and selectional preference of adjective-noun pairs

SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
Regular polysemy: a distributional model

SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
Rediscovering ACL discoveries through the lens of ACL anthology network citing sentences

ACL '12 Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries
Structuring e-commerce inventory

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Investigating the semantics of frame elements

EKAW'12 Proceedings of the 18th international conference on Knowledge Engineering and Knowledge Management
An efficient approach to suggesting topically related web queries using hidden topic model

World Wide Web
Statistical metaphor processing

Computational Linguistics
QUBiC: An adaptive approach to query-based recommendation

Journal of Intelligent Information Systems
Predicting part-of-speech tags and morpho-syntactic relations using similarity-based technique

SLSP'13 Proceedings of the First international conference on Statistical Language and Speech Processing
A Graph Analytical Approach for Topic Detection

ACM Transactions on Internet Technology (TOIT)
Control-flow integrity principles, implementations, and applications

ACM Transactions on Information and System Security (TISSEC)
Mutual information evaluation: A way to predict the performance of feature weighting on clustering

Intelligent Data Analysis

Abstract

We describe and experimentally evaluate a method for clustering words according to their distribution in particular syntactic contexts. Words are represented by the relative frequency distributions of the contexts in which they appear, and relative entropy between those distributions is used as the similarity measure for clustering. Clusters are represented by average context distributions derived from the given words according to their probabilities of cluster membership. In many cases, the clusters can be thought of as encoding coarse sense distinctions. Deterministic annealing is used to find lowest-distortion sets of clusters: as the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word cooccurrence, and the models are evaluated with respect to held-out test data.
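The soft-clustering step summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that every word carries equal weight, that membership in a cluster is proportional to exp(-beta * D(p_w || p_c)) with D the relative entropy, and that beta is held fixed rather than raised along an annealing schedule. The function names, the toy distributions, and the starting centroids are all hypothetical.

```python
import math


def relative_entropy(p, q):
    """D(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def memberships(p, centroids, beta):
    """Soft membership of one word: proportional to exp(-beta * D(p || centroid))."""
    d = [relative_entropy(p, c) for c in centroids]
    d_min = min(d)  # shift exponents for numerical stability
    weights = [math.exp(-beta * (di - d_min)) for di in d]
    z = sum(weights)
    return [w / z for w in weights]


def update_centroids(word_dists, member):
    """Each centroid is the membership-weighted average of the word distributions
    (assumption: uniform weight per word)."""
    n_clusters = len(member[0])
    n_contexts = len(word_dists[0])
    centroids = []
    for c in range(n_clusters):
        total = sum(m[c] for m in member)
        centroids.append([
            sum(m[c] * p[k] for m, p in zip(member, word_dists)) / total
            for k in range(n_contexts)
        ])
    return centroids


def soft_cluster(word_dists, centroids, beta, iterations=50):
    """Alternate membership and centroid updates at a fixed beta."""
    for _ in range(iterations):
        member = [memberships(p, centroids, beta) for p in word_dists]
        centroids = update_centroids(word_dists, member)
    # Recompute memberships so they match the returned centroids.
    member = [memberships(p, centroids, beta) for p in word_dists]
    return centroids, member


# Hypothetical toy data: four words, each a distribution over three contexts.
word_dists = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
]
# Slightly different starting centroids so that a split can develop.
initial_centroids = [[0.4, 0.3, 0.3], [0.3, 0.3, 0.4]]

if __name__ == "__main__":
    for beta in (0.1, 10.0):
        centroids, member = soft_cluster(word_dists, initial_centroids, beta)
        print("beta =", beta)
        print("  centroids:", centroids)
        print("  memberships:", member)
```

On this toy data, a small beta should leave the two centroids essentially identical, while a larger beta should separate the first two words from the last two. That loosely mirrors the cluster splitting described in the abstract, but a full deterministic annealing run would raise beta gradually and add centroids as existing ones split.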