Text categorization is the automatic learning of a categorization model from a training set of preclassified documents, on the basis of their contents, and the subsequent assignment of unclassified documents to appropriate categories. Most existing text categorization techniques deal with monolingual documents (i.e., documents written in the same language) both when learning the categorization model and when assigning (or predicting) categories for unclassified documents. However, with the globalization of business environments and advances in Internet technology, an organization or individual may generate documents in one language, organize them into categories, and subsequently archive documents written in other languages into those existing categories, which necessitates cross-lingual text categorization (CLTC). Specifically, CLTC deals with learning a text categorization model from a set of training documents written in one language (e.g., L1) and then classifying new documents written in a different language (e.g., L2). Motivated by the significance of this demand, this study designs a CLTC technique with two different category assignment methods, namely individual-based and cluster-based. Using monolingual text categorization as a performance reference, our empirical evaluation results demonstrate the cross-lingual capability of the proposed CLTC technique. Moreover, the classification accuracy achieved by the cluster-based category assignment method is statistically significantly higher than that attained by the individual-based method.
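The CLTC setting described above, with its two category assignment methods, can be illustrated with a minimal sketch. The abstract does not specify how the technique bridges languages, which classifier it uses, or how clusters are formed, so everything here is an assumption made for illustration: a hypothetical toy L2-to-L1 dictionary (`DICTIONARY`), a nearest-centroid classifier over word counts, a single-pass threshold clustering of the L2 documents, and a majority vote per cluster. It is not the authors' method.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy L2 -> L1 dictionary (Spanish -> English); an assumption for illustration.
DICTIONARY = {"mercado": "market", "acciones": "stocks", "banco": "bank",
              "partido": "match", "gol": "goal", "equipo": "team"}

def vectorize(text):
    """Bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def train(labeled_docs):
    """Learn one term-count centroid per category from L1 training documents."""
    centroids = {}
    for text, label in labeled_docs:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    return centroids

def translate(text):
    """Map L2 terms to L1 terms, dropping terms missing from the dictionary."""
    return " ".join(DICTIONARY[t] for t in text.lower().split() if t in DICTIONARY)

def assign_individual(centroids, doc_l2):
    """Individual-based assignment: classify each L2 document on its own."""
    vec = vectorize(translate(doc_l2))
    return max(sorted(centroids), key=lambda c: cosine(vec, centroids[c]))

def cluster(docs, threshold=0.3):
    """Single-pass clustering of documents in their own (L2) term space."""
    clusters = []  # each entry: (running term-count centroid, member indices)
    for i, text in enumerate(docs):
        vec = vectorize(text)
        best = max(clusters, key=lambda c: cosine(vec, c[0]), default=None)
        if best is not None and cosine(vec, best[0]) >= threshold:
            best[0].update(vec)
            best[1].append(i)
        else:
            clusters.append((Counter(vec), [i]))
    return [members for _, members in clusters]

def assign_by_cluster(centroids, docs_l2, threshold=0.3):
    """Cluster-based assignment: every document takes its cluster's majority label."""
    labels = [None] * len(docs_l2)
    for members in cluster(docs_l2, threshold):
        votes = Counter(assign_individual(centroids, docs_l2[i]) for i in members)
        winner = votes.most_common(1)[0][0]
        for i in members:
            labels[i] = winner
    return labels

# Training documents in L1 (English) and new documents in L2 (Spanish).
TRAIN_L1 = [("market stocks bank rally", "finance"), ("bank market shares", "finance"),
            ("team goal match win", "sports"), ("match team coach", "sports")]
NEW_L2 = ["mercado acciones banco", "banco acciones", "partido gol equipo", "equipo gol"]

centroids = train(TRAIN_L1)
individual_labels = [assign_individual(centroids, d) for d in NEW_L2]
cluster_labels = assign_by_cluster(centroids, NEW_L2)
```

On this toy data both methods label the first two L2 documents `finance` and the last two `sports`. The sketch only shows where the two methods differ structurally: the cluster-based variant pools evidence from similar L2 documents before committing to a category, whereas the individual-based variant decides document by document.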