Comments
Franck
Summary
The paper is about designing a system that can recognize gestures interactively and learn new gestures from only a few training examples.
Their approach is automated generation and iterative training of a set of Hidden Markov Models. These models are assumed to be discrete. For gesture recognition, the process involves reducing the data from the glove to a sequence of discrete observable symbols.
The procedure for interactive learning:
- The user makes a series of gestures.
- The system segments the data stream into separate gestures and tries to classify each one. If it is certain of the classification, it performs the action associated with the gesture. If not, it asks the user to confirm.
- The system adds the encoded gesture to the example list.
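The steps above can be sketched as a small loop. This is only an illustration of the control flow, not the authors' implementation: the names `interactive_step`, `classify`, and `confirm` are hypothetical, and the toy classifier stands in for the real HMM-based one.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Symbols = Sequence[int]


def interactive_step(
    gesture: Symbols,
    classify: Callable[[Symbols], Tuple[str, bool]],
    confirm: Callable[[Symbols, str], str],
    examples: Dict[str, List[Symbols]],
) -> str:
    """Classify one segmented gesture, ask the user if unsure, then store it."""
    label, certain = classify(gesture)
    if not certain:
        # The user either accepts the guess or supplies the right label.
        label = confirm(gesture, label)
    # The encoded gesture joins the example list for its class.
    examples.setdefault(label, []).append(gesture)
    return label


# Toy stand-ins, purely for illustration (assumptions, not from the paper).
def toy_classify(gesture: Symbols) -> Tuple[str, bool]:
    guess = "A" if gesture[0] == 0 else "B"
    # "Certain" only when every symbol in the gesture is identical.
    return guess, len(set(gesture)) == 1


def toy_confirm(gesture: Symbols, guess: str) -> str:
    return "C"  # pretend the user corrects the guess


examples: Dict[str, List[Symbols]] = {}
stream = [[0, 0, 0], [1, 1], [0, 1, 2]]
labels = [interactive_step(g, toy_classify, toy_confirm, examples) for g in stream]
```

In the real system, the stored examples would then be used to retrain the matching HMM.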
They use the Baum-Welch algorithm to automatically update the HMM with each new example; recognition itself comes from evaluating the gesture against every HMM.
They focused on gesture recognition and on how to generate new HMMs. To build a new HMM, they begin with one or a small number of examples and run Baum-Welch until it converges. They then add more examples iteratively, updating the model with Baum-Welch after each one.
To process the signal, they needed to represent each gesture as a sequence of discrete symbols. They decided to view the hand data as a one-dimensional sequence of symbols, and they chose vector quantization as the algorithm to preprocess the data.
The vector quantizer encodes a vector by returning the index of a vector in the codebook that is closest to the vector. The preprocessor is coded as a data filter, so a symbol is sent to the recognition system as soon as enough data has been read to generate it.
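A minimal sketch of that encoding step, assuming squared Euclidean distance as the closeness measure (the summary does not say which distance the authors used). The codebook values and the function names `quantize` and `encode_stream` are made up for illustration.

```python
from typing import Iterable, Iterator, Sequence

Vector = Sequence[float]


def quantize(vector: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook vector closest to `vector`."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def encode_stream(vectors: Iterable[Vector], codebook: Sequence[Vector]) -> Iterator[int]:
    """Act as a filter: emit one symbol as soon as each input vector is read."""
    for v in vectors:
        yield quantize(v, codebook)


# Hypothetical 2-D codebook with three entries.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
readings = [(0.1, 0.2), (0.9, 0.8), (0.1, 0.9)]
symbols = list(encode_stream(readings, codebook))
```

Writing the encoder as a generator mirrors the data-filter idea: the recognizer can consume symbols one at a time without waiting for the whole gesture.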
After the data is preprocessed, it is evaluated by all HMMs for the recognition process and is then used to update the parameters of the proper HMM. They used 5-state Bakis HMMs, where a transition from a state can only go to that same state or to one of the next two states. They defined a confidence measure: if the returned value is negative, the classification is correct; if it is between -1 and 1, the classification may be wrong; and if it is less than -2, the classification is certain.
They tested on 14 letters of the sign language alphabet. They did not take 6D measurements of the hand into account, so they chose gestures that would not be confused with one another. The results show that their algorithm is reliable.
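To make the Bakis structure and the evaluation step concrete, here is a rough sketch. It builds a 5-state transition matrix with the left-to-right constraint and scores a symbol sequence with the standard scaled forward algorithm. The uniform initial probabilities, the 4-symbol alphabet, and the function names are my assumptions. The paper's confidence measure is not reproduced here.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def bakis_transitions(n_states: int = 5, max_jump: int = 2) -> Matrix:
    """Transition matrix where state i can reach only i, i+1, ..., i+max_jump."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        allowed = range(i, min(i + max_jump, n_states - 1) + 1)
        for j in allowed:
            A[i][j] = 1.0 / len(allowed)  # uniform start (assumption)
    return A


def log_likelihood(symbols: Sequence[int], A: Matrix, B: Matrix, pi: Sequence[float]) -> float:
    """Scaled forward algorithm: log P(symbols | model) for a non-empty sequence."""
    n = len(A)
    alpha = [pi[i] * B[i][symbols[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(symbols)):
        if t > 0:
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][symbols[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_p += math.log(scale)
    return log_p


A = bakis_transitions()
B = [[0.25] * 4 for _ in range(5)]  # uniform emissions over 4 symbols (assumption)
pi = [1.0, 0.0, 0.0, 0.0, 0.0]      # always start in the first state
score = log_likelihood([0, 3, 1], A, B, pi)
```

For recognition, each gesture's symbol sequence would be scored this way against every trained HMM, and the scores compared.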
Discussion
I think their method is much better than the one I used in my project. However, I think they took the easy way out in their testing by using only the letters of the alphabet that would be easiest to recognize. I would have liked to see how it works with all of the letters.