Retrieving collocations from text: Xtract

Authors:
Frank Smadja
Affiliations:
Columbia University
Venue:
Computational Linguistics - Special issue on using large corpora: I
Year:
1993



Abstract

Natural languages are full of collocations, recurrent combinations of words that co-occur more often than expected by chance and that correspond to arbitrary word usages. Recent work in lexicography indicates that collocations are pervasive in English; apparently, they are common in all types of writing, including both technical and nontechnical genres. Several approaches have been proposed to retrieve various types of collocations from the analysis of large samples of textual data. These techniques automatically produce large numbers of collocations along with statistical figures intended to reflect the relevance of the associations. However, none of these techniques provides functional information along with the collocation. Also, the results produced often contain improper word associations that reflect some spurious aspect of the training corpus rather than true collocations.

In this paper, we describe a set of techniques based on statistical methods for retrieving and identifying collocations from large textual corpora. These techniques produce a wide range of collocations and are based on some original filtering methods that allow the production of richer and higher-precision output. These techniques have been implemented and resulted in a lexicographic tool, Xtract. The techniques are described and some results are presented on a 10-million-word corpus of stock market news reports. A lexicographic evaluation of Xtract as a collocation retrieval tool has been made, and the estimated precision of Xtract is 80%.
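The abstract does not spell out the statistics involved, so the following is only a minimal Python sketch of the general idea of statistical collocation retrieval: count how often word pairs co-occur within a short window, then keep the collocates of a word whose frequency stands well above the average for that word. The function names, the five-word window, and the z-score threshold are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter
from statistics import mean, pstdev


def cooccurrence_counts(sentences, window=5):
    """Count pairs (w, v) where v appears within `window` tokens after w.

    `sentences` is an iterable of token lists. The window size is an
    assumption made for this sketch.
    """
    counts = Counter()
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for v in tokens[i + 1 : i + 1 + window]:
                counts[(w, v)] += 1
    return counts


def strong_collocates(counts, word, min_z=1.0):
    """Return collocates of `word` whose frequency z-score is >= min_z.

    The z-score is computed over the frequencies of all collocates of
    `word`; the threshold is a hypothetical value chosen for illustration.
    """
    freqs = {v: c for (w, v), c in counts.items() if w == word}
    if len(freqs) < 2:
        return []
    mu = mean(freqs.values())
    sigma = pstdev(freqs.values())
    if sigma == 0:
        return []  # all collocates equally frequent: nothing stands out
    return sorted(v for v, c in freqs.items() if (c - mu) / sigma >= min_z)
```

On a toy corpus of stock market sentences, a filter of this kind would be expected to keep a frequent pairing such as "stock ... prices" and drop pairs seen only once. A real system would need a far larger corpus and further filtering stages of the sort the abstract alludes to.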