The Ngram Statistics Package (NSP) is a flexible and easy-to-use software tool that supports the identification and analysis of Ngrams, that is, sequences of N tokens in online text. We designed and implemented NSP to be easy to customize to particular problems while remaining general enough to serve a broad range of needs. This paper introduces NSP, raises some general issues in Ngram analysis, and summarizes several applications in which NSP has been successfully employed. NSP is written in Perl and is freely available under the GNU General Public License.
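To make the notion of an Ngram concrete, the following is a minimal Python sketch of counting Ngrams and scoring bigrams with one common association measure, pointwise mutual information. It is not NSP's code or interface (NSP itself is written in Perl); the function names `tokenize`, `count_ngrams`, and `bigram_pmi`, the simplistic tokenizer, and the choice of measure are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens.

    Assumption: a deliberately simple tokenizer, chosen only for illustration.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def count_ngrams(tokens, n):
    """Count every contiguous sequence of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bigram_pmi(tokens):
    """Score each bigram by pointwise mutual information.

    PMI(x, y) = log2(P(x, y) / (P(x) * P(y))), where the marginals are taken
    from the first and second positions of the observed bigrams.
    """
    bigrams = count_ngrams(tokens, 2)
    total = sum(bigrams.values())
    if total == 0:
        return {}
    left = Counter()   # how often each token starts a bigram
    right = Counter()  # how often each token ends a bigram
    for (x, y), c in bigrams.items():
        left[x] += c
        right[y] += c
    return {
        (x, y): math.log2(c * total / (left[x] * right[y]))
        for (x, y), c in bigrams.items()
    }


if __name__ == "__main__":
    tokens = tokenize("The cat sat on the mat and the cat ran")
    print(count_ngrams(tokens, 2).most_common(3))
    scores = bigram_pmi(tokens)
    for pair in sorted(scores, key=scores.get, reverse=True)[:3]:
        print(pair, round(scores[pair], 3))
```

Separating counting from scoring means the same counts can feed different association measures. That split is a design choice of this sketch only.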