Testing whether differences in metrics such as recall, precision, and balanced F-score are statistically significant is a necessary part of empirical natural language processing. Unfortunately, we find in a set of experiments that many commonly used tests often underestimate significance and are therefore less likely to detect differences that do exist between techniques. This underestimation stems from an independence assumption that is often violated. We point out some useful tests that do not make this assumption, including computationally intensive randomization tests.
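As an illustration of the kind of randomization test mentioned above, here is a minimal sketch of a paired approximate randomization test on the difference in balanced F-score. It is not taken from the paper. It assumes that both systems were run on the same test items and that each system's output is summarized per item as a hypothetical `(tp, fp, fn)` tuple of true positives, false positives, and false negatives. The function names, the number of trials, and the two-sided absolute-difference statistic are all choices made for this example.

```python
import random


def f_score(counts):
    """Balanced F-score from a list of per-item (tp, fp, fn) tuples."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def randomization_test(a, b, trials=10000, seed=0):
    """Approximate p-value for the F-score difference between systems a and b.

    a and b are equal-length lists of per-item (tp, fp, fn) tuples, aligned
    so that a[i] and b[i] describe the same test item (an assumption of this
    sketch). Under the null hypothesis the two systems are interchangeable,
    so swapping their outputs on any item should not matter.
    """
    if len(a) != len(b):
        raise ValueError("both systems must be scored on the same items")
    rng = random.Random(seed)
    observed = abs(f_score(a) - f_score(b))
    at_least_as_large = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        for x, y in zip(a, b):
            # Swap the two systems' outputs on this item with probability 0.5.
            if rng.random() < 0.5:
                x, y = y, x
            shuffled_a.append(x)
            shuffled_b.append(y)
        if abs(f_score(shuffled_a) - f_score(shuffled_b)) >= observed:
            at_least_as_large += 1
    # Add-one smoothing keeps the estimated p-value above zero.
    return (at_least_as_large + 1) / (trials + 1)
```

Because the metric is recomputed on the whole shuffled test set in every trial, this sketch does not need to treat the per-item contributions to the F-score as independent. The cost is one full metric computation per trial, which is why such tests are computationally intensive.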