This paper proposes an open recognition corpus, HIT-OR3C, and its construction toolkit to facilitate unconstrained online Chinese handwritten text recognition. The characters in HIT-OR3C are collected with a handwriting pad and are recorded and labeled automatically by the proposed handwriting document collection software, the OR3C Toolkit. HIT-OR3C consists of five subsets: GB1, GB2, Letter, Digit, and Document. The first four subsets contain 6,825 categories produced by 122 writers, for 832,650 samples in total. The Document subset corresponds to 10 news articles and contains 2,442 categories produced by 20 writers, for 77,168 samples in total. HIT-OR3C can be used for training and evaluating character recognition algorithms. The OR3C Toolkit provides an efficient, device-independent, and unconstrained platform for building large-scale handwriting corpora.
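The sample counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, and it assumes (the abstract does not say so explicitly) that each writer of the four isolated-character subsets wrote every category exactly once.

```python
# Figures quoted in the abstract.
ISOLATED_CATEGORIES = 6825   # GB1 + GB2 + Letter + Digit categories
ISOLATED_WRITERS = 122
ISOLATED_SAMPLES = 832650
DOCUMENT_WRITERS = 20
DOCUMENT_SAMPLES = 77168


def expected_isolated_samples(categories: int, writers: int) -> int:
    """Sample count if every writer wrote every category exactly once (assumption)."""
    return categories * writers


def mean_samples_per_writer(samples: int, writers: int) -> float:
    """Average number of samples contributed by one writer."""
    return samples / writers


if __name__ == "__main__":
    predicted = expected_isolated_samples(ISOLATED_CATEGORIES, ISOLATED_WRITERS)
    print("isolated subsets consistent:", predicted == ISOLATED_SAMPLES)
    print("document samples per writer:",
          mean_samples_per_writer(DOCUMENT_SAMPLES, DOCUMENT_WRITERS))
```

Under that assumption the product of categories and writers matches the reported total for the first four subsets. The Document subset is running text, so its count does not factor the same way, and only a per-writer average is computed.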