Inner-block operations on compressed images
Proceedings of the third ACM international conference on Multimedia
CVEPS - a compressed video editing and parsing system
MULTIMEDIA '96 Proceedings of the fourth ACM international conference on Multimedia
Closed-Loop MPEG Video Rendering
ICMCS '97 Proceedings of the 1997 International Conference on Multimedia Computing and Systems
Compressed Domain Transcoding of MPEG
ICMCS '98 Proceedings of the IEEE International Conference on Multimedia Computing and Systems
Manipulation and compositing of MC-DCT compressed video
IEEE Journal on Selected Areas in Communications
Automatic Closed Caption Detection and Font Size Differentiation in MPEG Video
VISUAL '02 Proceedings of the 5th International Conference on Recent Advances in Visual Information Systems
Manipulating lossless video in the compressed domain
MM '09 Proceedings of the 17th ACM international conference on Multimedia
Caption processing (as in cinema captions), which adds descriptive text to a sequence of frames, is an important video manipulation function that a video editor should support. This paper proposes an efficient MC-DCT compressed-domain approach for inserting a caption into an MPEG-compressed video stream. It adds the DCT blocks of the caption image to the corresponding DCT blocks of the input frames, one by one, in the MC-DCT domain, as in [5]. However, the strength of the caption image is adjusted in the DCT domain to prevent the resulting DCT coefficients from exceeding the maximum value allowed in MPEG. Adjusting the strength of the caption image adaptively requires the exact pixel values of the input image, which are difficult to obtain in the DCT domain. We therefore propose an approximation scheme in which the DC value of a block is used as the expected value of every pixel in that block. Although this approximation may introduce some errors in the caption area, it still provides relatively high image quality in the non-caption area, while processing is about 4.9 times faster than the decode-captioning-reencode approach.
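The DC-based approximation can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes an orthonormal 8x8 DCT (so the DC coefficient equals 8 times the block mean), 8-bit samples, and a simple headroom rule that bounds the expected pixel value rather than the DCT coefficients themselves. The names `expected_pixel`, `caption_strength`, `add_caption_block`, `flat_block`, and the `caption_peak` parameter are hypothetical and chosen only for illustration.

```python
BLOCK = 8          # MPEG uses 8x8 DCT blocks
PIXEL_MAX = 255.0  # assumption: 8-bit sample range


def expected_pixel(dct_block):
    """Approximate every pixel in the block by the block mean.

    Assumes an orthonormal 8x8 DCT, for which DC = 8 * mean.
    """
    return dct_block[0][0] / BLOCK


def caption_strength(frame_block, caption_peak):
    """Hypothetical rule: largest scale in [0, 1] such that
    expected pixel + scale * caption_peak stays within PIXEL_MAX."""
    if caption_peak <= 0:
        return 1.0
    headroom = PIXEL_MAX - expected_pixel(frame_block)
    return max(0.0, min(1.0, headroom / caption_peak))


def add_caption_block(frame_block, caption_block, caption_peak):
    """Add the scaled caption block to the frame block, coefficient by coefficient."""
    s = caption_strength(frame_block, caption_peak)
    return [[f + s * c for f, c in zip(frow, crow)]
            for frow, crow in zip(frame_block, caption_block)]


def flat_block(pixel_value):
    """DCT of a constant 8x8 block: only the DC term is non-zero."""
    block = [[0.0] * BLOCK for _ in range(BLOCK)]
    block[0][0] = BLOCK * pixel_value
    return block


if __name__ == "__main__":
    bright = flat_block(200.0)   # bright background block
    caption = flat_block(100.0)  # caption block whose peak pixel value is 100
    out = add_caption_block(bright, caption, caption_peak=100.0)
    print(expected_pixel(out))   # stays at (approximately) the 8-bit ceiling
```

Because the DCT is linear, scaling and adding coefficients is equivalent to scaling and adding pixels, which is why the addition can stay in the compressed domain. The block mean is only an estimate, so individual pixels inside a textured block may still clip, which is consistent with the errors in the caption area that the abstract mentions.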