We study disambiguation problems in natural language, focusing on gene vs. protein name disambiguation in biological text and also considering context-sensitive spelling error correction. We introduce a new family of classifiers based on ordering and weighting the feature vectors obtained from word counts and word co-occurrence in the text, and examine several concrete classifiers from this family. The most accurate predictions are obtained when weighting by the positions of the words in the context. On the gene/protein name disambiguation problem, this classifier outperforms both the Naive Bayes and SNoW baseline classifiers. We also study the effect on classification accuracy of smoothing techniques for the Naive Bayes classifier, collocation features, and context length, and show that setting the context length correctly is important and problem-dependent.
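To make the idea of weighting by word position concrete, the following is a minimal sketch only, not the classifier evaluated in the paper. The abstract does not specify the weighting scheme, so the `1/distance` weight, the add-one smoothing, the default window of three words, and the function names `train`, `position_weight`, and `classify` are all illustrative assumptions. The sketch scores each class by a position-weighted sum of smoothed log-probabilities of the context words around the ambiguous name.

```python
import math
from collections import Counter, defaultdict


def context_positions(tokens, idx, window):
    """Indices of the words within `window` positions of the target, excluding it."""
    lo = max(0, idx - window)
    hi = min(len(tokens), idx + window + 1)
    return [j for j in range(lo, hi) if j != idx]


def train(examples, window=3):
    """Count context words per class.

    `examples` is an iterable of (tokens, target_index, label) triples.
    Returns a dict mapping each label to a Counter of context words.
    """
    counts = defaultdict(Counter)
    for tokens, idx, label in examples:
        for j in context_positions(tokens, idx, window):
            counts[label][tokens[j].lower()] += 1
    return counts


def position_weight(distance):
    # Assumed weighting: words closer to the target count more.
    return 1.0 / distance


def classify(counts, tokens, idx, window=3):
    """Return the label with the highest position-weighted log-score."""
    vocab = set()
    for word_counts in counts.values():
        vocab.update(word_counts)
    vocab_size = len(vocab) + 1  # one extra slot for unseen words

    scores = {}
    for label, word_counts in counts.items():
        total = sum(word_counts.values())
        score = 0.0
        for j in context_positions(tokens, idx, window):
            word = tokens[j].lower()
            # Add-one smoothing (an assumption, chosen for simplicity).
            prob = (word_counts[word] + 1) / (total + vocab_size)
            score += position_weight(abs(j - idx)) * math.log(prob)
        scores[label] = score
    return max(scores, key=scores.get)
```

With all weights set to 1, the score reduces to a Naive Bayes log-likelihood without the class prior, so the position weights are the only change from that baseline in this sketch. The `window` parameter plays the role of the context length, which the abstract reports to be important and problem-dependent.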