Saturday, May 15, 2010

Ethnography / dancing ideas

Ethnography
I think that for an ethnography, one could study how people whose views differ from a forum's majority (for example, a liberal posting in a far-right forum) are treated in the replies to their posts. Within a thread, one could measure how quickly minority opinions are answered with hateful remarks. This is very similar to a project that Sarah and I did in the CHI class a few years ago.

Salsa Dancing
Other than an idea similar to TIKL, where students learn the correct moves through vibrotactile sensors that vibrate when they move off beat or fail to closely match the teacher's movements, I don't really see how teaching students to dance can be improved by a haptic device.

Friday, May 14, 2010

XWand: UI for Intelligent Spaces

Comments
Franck
Manoj

Summary
The authors created the XWand wireless sensor package, which "enables styles of natural interaction with intelligent environments." If users want to control a device, they just point at it and perform simple gestures. The system relies on the intelligence of the environment to determine the user's intention.

The device itself has the following components (a toy orientation sketch based on these sensors follows the list):
  • 2-axis MEMS accelerometer
  • 3-axis magnetoresistive permalloy magnetometer
  • 1-axis piezoelectric gyroscope
  • FM transceiver
  • flash-programmable microcontroller
  • Infra-red LED
  • green and red visible LEDs
  • pushbutton
  • 4 AAA batteries
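
To make the pointing idea a bit more concrete to myself, here is a tiny Python sketch (my own, not the authors' algorithm) of how a wand's pointing direction could be roughly estimated from an accelerometer and magnetometer, assuming ideal, calibrated 3-axis readings and a wand held without much roll:

    import numpy as np

    def wand_pointing_angles(accel, mag):
        """Very rough pointing estimate: pitch from gravity, yaw from the
        horizontal magnetometer components. Assumes calibrated sensors and a
        roughly level wand; this is an illustration, not the XWand algorithm."""
        a = accel / np.linalg.norm(accel)
        pitch = np.arcsin(-a[0])            # tilt up/down from the gravity direction
        yaw = np.arctan2(-mag[1], mag[0])   # compass heading when held level
        return pitch, yaw

The real XWand only has a 2-axis accelerometer and fuses these readings more carefully, so treat this purely as an illustration of the sensor roles.
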
In the user study, pointing accuracy increased when audio feedback was added. The results suggest that users are comfortable when tracking is good, and that audio feedback helps compensate when tracking is not as good.

Analysis
I don't know much about the hardware aspects of the device. However, it does seem intuitive; we like to point at objects to control them.

Webcam Mouse Using Face and Eye Tracking in Various Illumination Environments

Comments
Franck
Manoj

Summary
In this study, the authors present an illumination recognition technique that combines a K-Nearest Neighbor (KNN) classifier with an adaptive skin model to provide real-time tracking. Their accuracy reached over 92% in different environments. The system tracks face and eye features at 15 fps on standard notebook platforms. It permits users to define and add their favorite environments to the nearest-neighbor classifier, and it comes with 5 initial environments. They ran the Webcam Mouse system on a laptop PC with a Logitech webcam; it takes up about 45% of system resources.
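
To picture how the KNN part might work, here is a small Python sketch under my own assumptions (the lighting feature, the scikit-learn KNeighborsClassifier, and all the sample values are mine, not the paper's):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def lighting_feature(frame):
        """Crude lighting descriptor: mean R, G, B over the frame
        (my assumption, not the paper's actual feature)."""
        return frame.reshape(-1, 3).mean(axis=0)

    # Five initial environments, each with a stored lighting feature and a label
    # that selects a matching skin-color model. All values are made up.
    env_feats = np.array([
        [200, 200, 200],   # bright office
        [120, 110, 100],   # dim indoor
        [230, 210, 180],   # warm lamp
        [ 90, 100, 130],   # bluish screen glow
        [160, 170, 160],   # overcast daylight
    ], dtype=float)
    env_labels = ["office", "dim", "warm", "bluish", "daylight"]

    def pick_skin_model(frame, user_envs=()):
        """Return the label of the stored environment closest to the current
        frame. Extra (feature, label) pairs mimic the paper's idea of letting
        users add their favorite environments."""
        feats = list(env_feats) + [np.asarray(f, float) for f, _ in user_envs]
        labels = env_labels + [l for _, l in user_envs]
        knn = KNeighborsClassifier(n_neighbors=1).fit(np.array(feats), labels)
        return knn.predict([lighting_feature(frame)])[0]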

Analysis
I think this requires more usability tests to see whether their method is robust enough for different environments. I am also not convinced the KNN classifier is especially useful in this area.

Real-Time Hand Tracking as a User Input Device

Comments
Franck
Manoj

Summary
The authors sought to create an easy-to-use and inexpensive system that facilitates 3D articulated user input using the hands. The system optically tracks an ordinary cloth glove imprinted with a specific pattern, which simplifies the pose-estimation problem. First, they construct a database of synthetically generated hand poses. These images are normalized and downsampled to a small size. They do the same for a query image from the camera and use it to look up the nearest-neighbor pose in the database, defining a distance metric between two tiny images; they chose a Hausdorff-like distance for this.
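
To understand the lookup step, here is a minimal sketch of a tiny-image nearest-neighbor search. The paper uses a Hausdorff-like distance; I substitute a plain sum-of-squared-differences just to illustrate the database lookup, and all shapes are my assumptions:

    import numpy as np

    def nearest_pose(query_tiny, db_tinies, db_poses):
        """Return the hand pose whose tiny image is closest to the query.

        query_tiny: (H, W) normalized, downsampled image of the gloved hand
        db_tinies:  (N, H, W) tiny images rendered from synthetic hand poses
        db_poses:   length-N list of the corresponding joint configurations

        SSD is used here as a stand-in for the authors' Hausdorff-like metric.
        """
        diffs = db_tinies - query_tiny[None, ...]
        dists = np.einsum('nij,nij->n', diffs, diffs)   # SSD per database entry
        return db_poses[int(np.argmin(dists))]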

Analysis
I agree this is a cost-effective method of tracking the hand. I am uncertain about the distance metric they chose; since I don't know anything about Hausdorff-like distances, I cannot judge how efficient they are.

That one there! Pointing to establish device identity

Comments
Franck
Manoj

Summary
The authors sought to identify devices using a pointing gesture, with custom tags and a stylus called the gesturePen. They chose a two-way line-of-sight communications link to interact with devices in a dense computing environment. The pen has an IrDA-compliant infrared transceiver, and they developed tags that can be fixed to active and passive devices. The communication flow between the pen, a tag, and the associated device is as follows (a toy sketch of this exchange appears after the list):
  • the user points the pen towards a tag and presses a button on the pen
  • the tag receives the ping message, blinks the light, and sends its identity information to the pen.
  • the pen receives the ID, validates it, and sends it to the attached device.
  • Information is transferred over the network to the other device.
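
Here is a toy Python model of that exchange; the class and message names are invented for illustration (the real pen and tags talk over an IrDA infrared link):

    # Toy model of the pen/tag exchange described above.
    class Tag:
        def __init__(self, device_id):
            self.device_id = device_id

        def on_ping(self):
            print(f"tag {self.device_id}: blink LED")    # step 2: blink and reply
            return self.device_id                        # identity goes back to the pen

    class Host:
        def connect(self, device_id):
            print(f"host: opening network connection to {device_id}")  # step 4

    class Pen:
        def __init__(self, host):
            self.host = host                     # the handheld the pen is attached to

        def point_and_click(self, tag):
            device_id = tag.on_ping()            # steps 1-2: ping the tag it points at
            if device_id:                        # step 3: validate the ID
                self.host.connect(device_id)     # steps 3-4: hand it to the device

    Pen(Host()).point_and_click(Tag("printer-42"))
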
They tested their device with user studies. The first was a cognitive-load task, where users played a jigsaw puzzle on a handheld computer using the gesturePen as a normal stylus. Each participant was then interrupted to choose a tag, either by reading its IP address label and choosing it from a list, or by pointing at the tag with the pen and clicking a button. The next study was a mobile-environment task, where the participant was required to select a target device; the method choices were the same as in the first test. For the most part, the gesturePen was well suited to dynamic ubiquitous computing environments. Users learned the system quickly and soon became comfortable using it.

Analysis
It seems like an innovative way of selecting a device, and I'm surprised it is not more common now. The only problems I see are the limited range and possible confusion if two devices are close together. In any case, it seems like you would need to be close for this device to be accurate.

Human/Robot Interfaces

Comments
Franck

Summary
The paper is essentially about designing a system that can recognize gestures interactively and learn new gestures from only a few training examples.

Their approach is the automated generation and iterative training of a set of Hidden Markov Models (HMMs), which are assumed to be discrete. For gesture recognition, the process involves reducing the data from the glove to a sequence of discrete observable symbols.

The procedure for interactive learning:
  1. the user makes a series of gestures.
  2. The system segments the data stream into separate gestures and tries to classify each one. If it is confident, it carries out the action associated with the gesture; if not, it asks the user to confirm.
  3. The system adds the encoded gesture to the example list.

They use the Baum-Welch algorithm to train the gesture models and to automatically update the HMMs.

They decided to focus on gesture recognition and on how to generate new HMMs. They solve the new-HMM problem by beginning with one or a small number of examples, running Baum-Welch until it converges, then iteratively adding more examples and updating the model with Baum-Welch after each one (a rough sketch of this loop follows).
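
The iterative training loop might look something like the sketch below, assuming the hmmlearn library (its CategoricalHMM handles discrete observation symbols). Note that it simply refits on the growing example list rather than truly updating the existing model in place, so it only approximates the paper's procedure:

    import numpy as np
    from hmmlearn import hmm

    def train_gesture_model(examples, n_states=5):
        """Refit one gesture's discrete HMM as new examples arrive.

        examples: list of 1-D integer arrays, each a VQ symbol sequence for
        one instance of the gesture. This refit-from-scratch loop is only a
        rough stand-in for the paper's incremental Baum-Welch updates.
        """
        model = None
        for i in range(1, len(examples) + 1):
            seqs = examples[:i]
            X = np.concatenate(seqs).reshape(-1, 1)
            lengths = [len(s) for s in seqs]
            model = hmm.CategoricalHMM(n_components=n_states, n_iter=20,
                                       random_state=0)
            model.fit(X, lengths)    # Baum-Welch (EM) on all examples so far
        return model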

To process the signal, they needed to represent each gesture as a sequence of discrete symbols, so they chose to view the hand data as a one-dimensional sequence of symbols. They chose vector quantization as the algorithm for preprocessing the data.

The vector quantizer encodes a vector by returning the index of the closest vector in the codebook. The preprocessor is coded as a data filter, so a symbol is sent to the recognition system as soon as enough data has been read to generate it.
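
A minimal version of that encoder, with a made-up codebook, could look like this:

    import numpy as np

    def vq_encode(sample, codebook):
        """Return the index of the codebook vector nearest to the sample,
        i.e. the discrete symbol that gets sent to the HMMs."""
        dists = np.linalg.norm(codebook - sample, axis=1)
        return int(np.argmin(dists))

    # Example with a made-up 4-entry codebook over 3 glove features
    codebook = np.array([[0.0, 0.0, 0.0],
                         [1.0, 0.0, 0.5],
                         [0.2, 0.9, 0.1],
                         [0.8, 0.8, 0.8]])
    print(vq_encode(np.array([0.9, 0.1, 0.4]), codebook))   # prints 1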

After the data is preprocessed, it is evaluated by all of the HMMs for recognition and is used to update the parameters of the appropriate HMM. They used 5-state Bakis HMMs, in which a transition from a state can only go to that state or to the next two states. They also defined a confidence measure on the classification: if the returned value is negative, the chosen model is correct; if it is between -1 and 1, the classification may be wrong; and if it is less than -2, the classification is certain.
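
The Bakis (left-to-right) topology is easy to write down; here is a small sketch of an initial 5-state transition matrix in which each state can only stay put or move ahead by at most two states (the uniform probabilities are just placeholders to be re-estimated by Baum-Welch):

    import numpy as np

    def bakis_transition_matrix(n_states=5, max_skip=2):
        """Left-to-right transition matrix: state i may go only to states
        i..i+max_skip (clipped at the last state)."""
        A = np.zeros((n_states, n_states))
        for i in range(n_states):
            j_max = min(i + max_skip, n_states - 1)
            A[i, i:j_max + 1] = 1.0 / (j_max - i + 1)
        return A

    print(bakis_transition_matrix())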

They tested on 14 letters of the sign-language alphabet. They did not worry about 6-D measurements for the hand, so they chose gestures that would not be confused with one another. The results show that their algorithm is reliable.

Discussion
I think their method is much better than the one I used in my project. However, I think they somewhat took the easy way out in their testing by using only the letters of the alphabet that would be easiest to distinguish. I would have liked to see how it would work with all the letters.

Real-Time Robust Body Part Tracking for Augmented Reality Interface

This paper seeks to provide an interface that tracks body parts without limiting the user's freedom. The system can recognize whether the user is wearing long or short sleeves. They use a calibrated camera to obtain images of the hands, head, and feet, and transform the detected 2D body parts into an approximate 3D posture. Their algorithm is the following (pretty much copied here; a rough sketch of the skin-blob step appears after the list):
  • obtain a foreground image by deleting the background and shadow of the user from the original image
  • using the face texture, detect a face from the image.
  • extract contour features
  • track the user's head using a particle filter
  • detect and track the two hands by segmenting the skin-blob image
  • using the contour of the lower body, detect and track the feet and estimate the 3D body pose
  • extract meaningful gestures with the position of the right hand
  • visualize all augmented objects and the user
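
For the hand-detection step, a generic skin-blob segmentation (not the authors' exact method; the YCrCb thresholds and minimum blob area are textbook values I picked) might look like this with OpenCV 4:

    import cv2
    import numpy as np

    def skin_blobs(frame_bgr, min_area=300):
        """Return centroids of reasonably large skin-colored blobs,
        a rough stand-in for the hand-detection step above."""
        ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
        mask = cv2.inRange(ycrcb, (0, 135, 85), (255, 180, 135))
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        blobs = []
        for c in contours:
            if cv2.contourArea(c) >= min_area:
                m = cv2.moments(c)
                blobs.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
        return blobs
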
They performed an experiment evaluating 2D tracking performance with short and long sleeves separately, using a BeNature system that recognizes simple gestures. They measured an error of 5.48 pixels when the user wears long sleeves and 9.16 pixels with short sleeves.

Analysis
I think this is quite a unique method of tracking the human body, or at least the hands, feet, and head. I like the robustness of the system in detecting whether the user is wearing long or short sleeves. However, I would like to see more user studies to see if this can be used for other purposes.