The nature of statistical learning theory
The nature of statistical learning theory
Boosting support vector machines for text classification through parameter-free threshold relaxation
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
LIBLINEAR: A Library for Large Linear Classification
The Journal of Machine Learning Research
Improving machine translation quality with automatic named entity recognition
EAMT '03 Proceedings of the 7th International EAMT workshop on MT and other Language Technology Tools, Improving MT through other Language Technology Tools: Resources and Tools for Building MT
Infobox suggestion for Wikipedia entities
Proceedings of the 21st ACM international conference on Information and knowledge management
Hi-index | 0.00 |
In this paper, a method is presented to recognize multilingual Wikipedia named entity articles. This method classifies multilingual Wikipedia articles using a variety of structured and unstructured features and is aided by cross-language links and features in Wikipedia. Adding multilingual features helps boost classification accuracy and is shown to effectively classify multilingual pages in a language independent way. Classification is done using Support Vectors Machine (SVM) classifier at first, and then the threshold of SVM is adjusted in order to improve the recall scores of classification. Threshold adjustment is performed using beta-gamma threshold adjustment algorithm which is a post learning step that shifts the hyperplane of SVM. This approach boosted recall with minimal effect on precision.