In this paper, the authors examined hand gestures made by a weather person narrating in front of a weather map. Because the gestures are embedded in natural narration, they had plenty of data from an uncontrolled environment with which to study the interaction between speech and gesture.
They implemented a continuous HMM-based gesture recognition framework. First they classified gestures as pointing, area, or contour gestures, and then chose a parameter space. To capture the hand motions, they performed color segmentation on the incoming video stream and then computed the distance between the hands and the face and the hands' angle from the vertical. They modeled the output probability distribution at every state of each HMM as a mixture of two multivariate Gaussians. In testing, continuous gesture recognition had a lower recognition rate than isolated recognition.
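As a minimal sketch of the kind of per-frame features described (distance from the face and angle from the vertical), assuming the face and hand centroids are already available from the color segmentation step; the function name and coordinate convention below are my own, not taken from the paper:

```python
import numpy as np

def gesture_features(face_xy, hand_xy):
    """Per-frame feature vector: distance between the hand and face centroids,
    and the hand's angle measured from the vertical axis through the face.
    face_xy, hand_xy: (x, y) centroids in image coordinates (y grows downward)."""
    dx = hand_xy[0] - face_xy[0]
    dy = hand_xy[1] - face_xy[1]
    distance = np.hypot(dx, dy)
    # 0 rad when the hand is directly below the face, positive toward the right.
    angle = np.arctan2(dx, dy)
    return np.array([distance, angle])

# Example: hand below and to the right of the face
print(gesture_features(face_xy=(320, 100), hand_xy=(400, 300)))
```

A sequence of such two-dimensional vectors would then be fed to the HMMs, with each state's emission modeled by the two-Gaussian mixture mentioned above.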
They also did a co-occurrence analysis of the different gestures with spoken keywords, and they were able to use that analysis to improve the continuous gesture recognition results.
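The paper's exact formulation isn't reproduced here, but as a rough illustration of how keyword co-occurrence statistics could be used to rescore the HMM output (the function names, the log-linear weighting, and the smoothing constant are all my own assumptions):

```python
import math
from collections import Counter

def cooccurrence_table(labeled_segments):
    """Estimate P(keyword | gesture class) from time-aligned training data.
    labeled_segments: iterable of (gesture_class, keyword) pairs."""
    pairs = list(labeled_segments)
    counts = Counter(pairs)
    totals = Counter(g for g, _ in pairs)
    return {(g, k): c / totals[g] for (g, k), c in counts.items()}

def rescore(hmm_log_likelihoods, keyword, cooc, weight=0.5):
    """Pick the gesture class that maximizes the HMM score plus a weighted
    co-occurrence term for the keyword heard during the same segment.
    hmm_log_likelihoods: {gesture_class: log-likelihood} from the recognizer."""
    return max(hmm_log_likelihoods,
               key=lambda g: hmm_log_likelihoods[g]
               + weight * math.log(cooc.get((g, keyword), 1e-6)))
```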
-------------------------------------------------------
Discussion
I think this is a good study of how speech and gestures are related. It sounds promising, but the limited case studies could be expanded upon. It still reads as a proof of concept, which is probably why there was no real user study.
It was an interesting paper in that it tries to bring naturalness into HCI interfaces.