Statistical approaches to natural language processing generally estimate their parameters by maximum likelihood estimation (MLE). MLE may nevertheless perform poorly on difficult tasks, because the estimation process takes neither discrimination nor robustness into account. Motivated by that concern, this paper proposes a discrimination- and robustness-oriented learning algorithm aimed at minimizing the error rate. When the robust learning procedure is evaluated on a corpus of 1,000 sentences, 64.3% of the sentences are assigned their correct syntactic structures, whereas the MLE approach attains an accuracy rate of only 53.1%.

In addition, parameters are usually estimated poorly when the training data are sparse, so smoothing the parameters is important in the estimation process. Accordingly, we use a hybrid approach that combines the robust learning procedure with the smoothing method; this approach attains an accuracy rate of 69.8%. Finally, a parameter tying scheme is proposed that ties together highly correlated but unreliably estimated parameters, so that they can be better trained in the learning process. With this tying scheme, the number of parameters is reduced by a factor of about 2,000 (from 8.7 × 10^8 to 4.2 × 10^5), and the accuracy rate for parse tree selection improves to 70.3% when the robust learning procedure is applied to the tied parameters.
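The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract. The relative error reduction measure and all identifier names are assumptions introduced here for illustration, and the abstract itself does not report that measure.

```python
# Accuracy rates (percent) for parse tree selection, as quoted in the abstract.
# The dictionary keys are descriptive labels chosen for this sketch.
accuracy = {
    "MLE": 53.1,
    "robust learning": 64.3,
    "robust learning + smoothing": 69.8,
    "robust learning on tied parameters": 70.3,
}

# Parameter counts before and after tying, as quoted in the abstract.
params_before_tying = 8.7e8
params_after_tying = 4.2e5


def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Fraction of the baseline's errors removed by the new method.

    Assumption: this is a common way to compare accuracy rates; it is not
    a figure given in the abstract.
    """
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return (baseline_err - new_err) / baseline_err


# Ratio of parameter counts; the abstract rounds this to "a factor of 2,000".
reduction_factor = params_before_tying / params_after_tying

if __name__ == "__main__":
    print(f"parameter reduction factor: {reduction_factor:.0f}")
    for name, acc in accuracy.items():
        if name == "MLE":
            continue
        rer = relative_error_reduction(accuracy["MLE"], acc)
        print(f"{name}: {acc:.1f}% accuracy, "
              f"{100 * rer:.1f}% relative error reduction vs. MLE")
```

The ratio of the two parameter counts comes out slightly above 2,000, which agrees with the rounded factor stated in the abstract.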