Artificial intelligence: a modern approach
Artificial intelligence: a modern approach
The nature of statistical learning theory
The nature of statistical learning theory
A survey of multilingual text retrieval
A survey of multilingual text retrieval
Making large-scale support vector machine learning practical
Advances in kernel methods
Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
Cross-language text classification
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
An EM Based Training Algorithm for Cross-Language Text Categorization
WI '05 Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence
Cross language text categorization by acquiring multilingual domain models from comparable corpora
ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts
Towards a universal wordnet by learning from combined evidence
Proceedings of the 18th ACM conference on Information and knowledge management
Using Nearest Neighbor Information to Improve Cross-Language Text Classification
MICAI '09 Proceedings of the 8th Mexican International Conference on Artificial Intelligence
Using information from the target language to improve crosslingual text classification
IceTAL'10 Proceedings of the 7th international conference on Advances in natural language processing
Constructing and utilizing wordnets using statistical methods
Language Resources and Evaluation
Hi-index | 0.00 |
In this paper, we investigate strategies for automatically classifying documents in different languages thematically, geographically or according to other criteria. A novel linguistically motivated text representation scheme is presented that can be used with machine learning algorithms in order to learn classifications from pre-classified examples and then automatically classify documents that might be provided in entirely different languages. Our approach makes use of ontologies and lexical resources but goes beyond a simple mapping from terms to concepts by fully exploiting the external knowledge manifested in such resources and mapping to entire regions of concepts. For this, a graph traversal algorithm is used to explore related concepts that might be relevant. Extensive testing has shown that our methods lead to significant improvements compared to existing approaches.