This paper presents a study in which feature selection algorithms were evaluated with the aim of improving document classification performance. The study was carried out within the DEEPSIA project (IST project No. 1999-20283), funded by the European Union. Better document recognition was needed to raise the overall performance of the framework for Internet data collection based on intelligent agents that is used within the project. The framework is briefly described and the learning techniques used are presented. The paper focuses on the feature selection algorithms; the most relevant work was the use of Conditional Mutual Information, estimated using genetic algorithms, since the computational complexity of CKN ruled out an iterative approach. Methods, techniques and comparative results are presented in detail.