Modelling and recognition of the linguistic components in American Sign Language
Image and Vision Computing
Spatial and temporal pyramids for grammatical expression recognition of American sign language
Proceedings of the 11th international ACM SIGACCESS conference on Computers and accessibility
Systems that analyze and recognize American Sign Language (ASL) sentences require that motion patterns and handshapes be represented independently of one another. In this paper, we define an algorithm that first recovers the 3D shape of each of the linguistically significant fingers (known as "selected" fingers in the linguistic literature), and then represents the motion of the hand as a three-dimensional trajectory in the camera's coordinate system. In doing so, we address the problems caused by self-occlusions and localization errors. The 3D shape is obtained with an affine structure-from-motion algorithm based on the linear fitting of matrices with missing data. To recover the 3D motion of the hand, we define a robust algorithm that selects the most stable solution from the pool of all solutions given by the three-point resection problem. To be robust to localization errors in the tracked hand feature points, we define a simple algorithm that searches for the combination of coordinates yielding the largest number of stable solutions in our system of equations. To validate the approach, we present experimental results on video sequences of ASL signs.
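The core numerical idea, recovering affine structure from a measurement matrix with missing entries (here caused by self-occlusion), can be illustrated with a minimal sketch. This is not the paper's exact linear-fitting formulation; it is the standard rank-3 affine factorization combined with simple iterative imputation of the occluded entries, on synthetic data standing in for tracked hand feature points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup (hypothetical data standing in for tracked finger points):
# P 3D points observed by F affine cameras, stacked into a 2F x P matrix W.
F, P = 8, 20
X = rng.normal(size=(3, P))                  # 3D shape (3 x P)
W = np.zeros((2 * F, P))                     # measurement matrix (2F x P)
for f in range(F):
    A = rng.normal(size=(2, 3))              # affine projection for frame f
    t = rng.normal(size=(2, 1))              # per-frame translation
    W[2 * f:2 * f + 2] = A @ X + t

# Simulate self-occlusion: hide ~20% of the entries.
mask = rng.random(W.shape) > 0.2

def rank3_complete(W, mask, iters=500):
    """Fill missing entries so that the translation-centered matrix is rank 3."""
    # Initialize missing entries with per-row means of the observed data.
    obs = np.where(mask, W, 0.0)
    row_mean = (obs.sum(1) / np.maximum(mask.sum(1), 1))[:, None]
    Wf = np.where(mask, W, np.broadcast_to(row_mean, W.shape))
    for _ in range(iters):
        mu = Wf.mean(1, keepdims=True)       # remove translations
        U, s, Vt = np.linalg.svd(Wf - mu, full_matrices=False)
        low = U[:, :3] * s[:3] @ Vt[:3] + mu  # rank-3 reconstruction
        Wf = np.where(mask, W, low)          # keep observed entries fixed
    return low

W_hat = rank3_complete(W, mask)
err = np.abs(W_hat - W)[mask].max()          # residual on observed entries
```

At convergence the rank-3 model reproduces the observed entries and interpolates the occluded ones; the motion (cameras) and shape factors can then be read off the SVD, exactly as in noise-free affine factorization.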