Identifying Script on Word-Level with Informational Confidence
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
A confidence paradigm for classification systems with out-of-library considerations
Intelligent Decision Technologies
Classifier combination has turned out to be a powerful tool for achieving high recognition rates, especially in fields where developing a powerful single classifier system requires considerable effort. However, the intensive investigation of multiple classifier systems has not yet produced a convincing theoretical foundation. Lacking proper mathematical concepts, many systems still rely on empirical heuristics and ad hoc combination schemes. My paper presents an information-theoretical framework for combining confidence values generated by different classifiers. The main idea is to normalize each confidence value so that it equals its informational content. Based on Shannon's notion of information, I measure information by means of a performance function that estimates the classification performance for each confidence value on an evaluation set. Having equated each confidence value with the information it actually conveys, I can use the elementary sum-rule to combine the confidence values of different classifiers. Experiments on combined on-line/off-line Japanese character recognition show clear improvements over the best single recognition rate.
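The pipeline described above (estimate a performance function on an evaluation set, map each raw confidence to its informational content, then add the results across classifiers) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact formulas, so the binned accuracy estimate, the Laplace smoothing, and the mapping `-log(1 - p)` are all assumptions, and the function names `fit_performance`, `informational_confidence`, and `combine_sum_rule` are hypothetical.

```python
import math
from collections import defaultdict


def fit_performance(confidences, correct, n_bins=10):
    """Estimate classification performance per confidence bin.

    `confidences` are raw classifier outputs in [0, 1] on an evaluation
    set; `correct` flags whether each output was right.
    ASSUMPTION: a simple binned accuracy stands in for the paper's
    performance function.
    """
    hits = [0] * n_bins
    totals = [0] * n_bins
    for c, ok in zip(confidences, correct):
        b = min(int(c * n_bins), n_bins - 1)
        totals[b] += 1
        hits[b] += 1 if ok else 0
    # Laplace smoothing keeps every estimate strictly inside (0, 1),
    # so the logarithm below is always finite.
    return [(h + 1) / (t + 2) for h, t in zip(hits, totals)]


def informational_confidence(c, performance):
    """Map a raw confidence to an information value.

    ASSUMPTION: information is taken as -log(1 - p), where p is the
    estimated performance for the bin containing c.
    """
    n_bins = len(performance)
    b = min(int(c * n_bins), n_bins - 1)
    return -math.log(1.0 - performance[b])


def combine_sum_rule(candidate_lists, performances):
    """Add informational confidences per label across classifiers."""
    scores = defaultdict(float)
    for candidates, perf in zip(candidate_lists, performances):
        for label, c in candidates:
            scores[label] += informational_confidence(c, perf)
    best = max(scores, key=scores.get)
    return best, dict(scores)


# Toy evaluation data (invented for illustration only).
perf_a = fit_performance([0.95, 0.95, 0.95, 0.95, 0.25, 0.25],
                         [True, True, True, True, False, False])
perf_b = fit_performance([0.95, 0.95, 0.95, 0.95],
                         [True, False, True, False])

# Both classifiers report 0.95, but A has been more reliable at that level.
best, scores = combine_sum_rule(
    [[("x", 0.95), ("y", 0.25)], [("y", 0.95)]],
    [perf_a, perf_b],
)
```

In this toy setup the same raw value 0.95 carries more information from classifier A than from classifier B, because A was right more often at that confidence level on the evaluation data. Once both classifiers are expressed on this common scale, plain addition is enough to rank the candidate labels.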