Sense beauty via face, dressing, and/or voice

Authors:
Tam V. Nguyen;Si Liu;Bingbing Ni;Jun Tan;Yong Rui;Shuicheng Yan
Affiliations:
National University of Singapore, Singapore, Singapore;National Laboratory of Pattern Recognition, Beijing, China;Advanced Digital Sciences Center, Singapore, Singapore;National University of Defense Technology, Hunan, China;Microsoft Research Asia, Beijing, China;National University of Singapore, Singapore, Singapore
Venue:
Proceedings of the 20th ACM international conference on Multimedia
Year:
2012



Abstract

Discovering the secret of beauty has been the pursuit of artists and philosophers for centuries. Computational models for beauty estimation are now being actively explored in the computer science community, though mainly with a focus on facial features. In this work, we perform a comprehensive study of female attractiveness conveyed by single or multiple modalities of cues, i.e., face, dressing, and/or voice, and aim to uncover how the different modalities individually and collectively affect the human sense of beauty. To this end, we collect the Multi-Modality Beauty (M2B) dataset, the first such dataset for female attractiveness study, which is thoroughly annotated with attractiveness levels converted from manual k-wise ratings and with semantic attributes of the different modalities. A novel Dual-supervised Feature-Attribute-Task (DFAT) network is proposed to jointly learn the beauty estimation models for single/multiple modalities together with the attribute estimation models. The DFAT network is distinguished by its supervision at both the attribute and task layers. We report several interesting observations on beauty sense over single/multiple modalities, and extensive experimental evaluations on the collected M2B dataset demonstrate the effectiveness of the proposed DFAT network for female attractiveness estimation.