Choosing the optimal threshold for collocation extraction remains a manual task performed by experts. To date, no in-depth study has explored how to automate the learning of this threshold in the field of statistical terminology. In this paper, the authors shed light on this problem by first surveying the performance evaluation techniques used in other scientific areas, such as biomedicine and biometrics, and then applying them to statistical terminology. The experimental study gives promising results. First, it shows the effectiveness of standard techniques, such as ROC and Precision-Recall curves, which are used to evaluate the performance of binary classification systems. Second, it provides a practical solution for automatically estimating optimal thresholds for collocation extraction systems.
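The abstract does not say which criterion the authors use to read a threshold off the ROC curve. As an illustration only, the sketch below assumes one common choice, Youden's J statistic (true positive rate minus false positive rate), and applies it to association scores of candidate word pairs. The function names, the scores, and the expert labels are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float, float]]:
    """Return (threshold, TPR, FPR) for every distinct score used as a cut-off.

    A candidate is accepted as a collocation when its score >= threshold.
    labels: 1 = true collocation (per a hypothetical expert annotation), 0 = not.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, tp / positives, fp / negatives))
    return points


def best_threshold_youden(scores: List[float], labels: List[int]) -> float:
    """Pick the threshold maximising Youden's J = TPR - FPR (an assumed criterion)."""
    threshold, _, _ = max(roc_points(scores, labels), key=lambda p: p[1] - p[2])
    return threshold


if __name__ == "__main__":
    # Made-up association scores for eight candidate pairs and their labels.
    scores = [9.1, 7.4, 6.8, 5.2, 3.9, 2.5, 1.1, 0.4]
    labels = [1, 1, 0, 1, 0, 0, 0, 0]
    print(best_threshold_youden(scores, labels))
```

On Precision-Recall curves, a comparable selection rule would maximise the F-score instead of J. Which criterion suits a given extraction system depends on the relative cost of missed collocations and false candidates.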