Friday, March 19, 2010

Office Activity Recognition using Hand Posture Cues

Comments
Manoj
Drew

Summary
This paper focused primarily on determining whether hand postures can be used to help identify the objects a user interacts with. A secondary goal was to determine how much hand postures for the same interaction vary across users. They used the CyberGlove device, sampling at 10 readings per second. 8 users participated in the experiment, performing 12 interactions 5 times each. They averaged the values of each of the 22 sensors over an interaction to form the input to the classifier, and they chose the 1-nearest-neighbor algorithm as the classifier. In the user-independent system, they performed leave-one-out cross-validation across all users. The average accuracy was 62.5%, ranging from 41.7% to 81.7%. In the user-dependent system, they trained and tested the classifier on a single user: they first chose one random example of each interaction to train the classifier and tested on the remaining four examples, then repeated the process with two, three, and four training examples. Average accuracy ranged from 78.9% with one training example to 94.2% with four. They concluded that the user-dependent system was better for recognizing user interactions in a natural, unconstrained manner.
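
As I read it, the pipeline boils down to averaging the glove readings and doing a nearest-neighbor lookup. The sketch below is my own illustration of that idea, not the authors' code; the array shapes and function names are assumptions.

  import numpy as np

  def to_feature(samples):
      # samples: (num_readings, 22) CyberGlove values captured at ~10 Hz
      # during one interaction; the feature is the per-sensor average
      return samples.mean(axis=0)

  def classify_1nn(feature, train_feats, train_labels):
      # Euclidean 1-nearest-neighbor over the training feature vectors
      dists = np.linalg.norm(train_feats - feature, axis=1)
      return train_labels[int(np.argmin(dists))]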

--------------------------------------------------------
Commentary
I think this paper proves their goals of determining whether or not hand postures can determine an interaction and seeing the variability in hand postures for the same interaction across different users. However, since the experiment used the CyberGlove, I don't see how this could be useful in practice, since I don't think all office workers would agree to being required to wear a glove. I think for most practical purposes (like security mentioned in the related works), a vision based system is more helpful.

--------------------------------------------------
Brandon Paulson, Tracy Hammond. Office Activity Recognition using Hand Posture Cues. The British Computer Society 2007.

$3 Gesture Recognizer

Summary
Since the assigned paper was a brief version of their longer paper, I'll summarize the longer one. Their work is based on the $1 algorithm by Wobbrock, extended to handle 3D acceleration data. There is no exact positioning, since acceleration data is clouded by noise and drift error. Their algorithm does not require library support, needs only minimal parameter adjustment and training, and provides a good recognition rate.

First, they determine the change in acceleration by computing acceleration deltas between successive readings. Summing the deltas gives the gesture trace in 3D, which can also be projected into 2D. To match a gesture class, they compare each point i of the input trace against the traces of all training gestures in the library and build a score table from these comparisons. For resampling, they settled on 150 points. They also rotate the trace along the indicative angle and scale it to fit in a normalized cube of 100^3 units to compensate for differences in gesture size.
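
A minimal sketch of this preprocessing, under my own assumptions about the data layout (this is not the authors' code, and the indicative-angle rotation step is omitted):

  import numpy as np

  def build_trace(accel):
      # accel: (n, 3) acceleration samples; successive deltas are summed
      # to produce the 3D gesture trace
      deltas = np.diff(accel, axis=0)
      return np.cumsum(deltas, axis=0)

  def normalize(trace, n_points=150, size=100.0):
      # Resample to 150 points by interpolating over the sample index
      # (equidistant resampling along the path would be closer to the paper)
      idx = np.linspace(0, len(trace) - 1, n_points)
      resampled = np.column_stack(
          [np.interp(idx, np.arange(len(trace)), trace[:, d]) for d in range(3)])
      resampled -= resampled.mean(axis=0)      # move the centroid to the origin
      span = (resampled.max(axis=0) - resampled.min(axis=0)).max()
      return resampled * (size / span)         # fit into the 100^3 reference cube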

The scoring heuristic reduces the number of false positives. They determine a threshold score: if the highest score is greater than 1.1 times this threshold, they return that gesture's ID; if 2 out of the top 3 results belong to the same gesture class and score higher than 0.95 times the threshold, they return the ID of that class.
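
The decision logic, as described, can be sketched like this; the 1.1 and 0.95 factors come from the paper, while the function and variable names are mine:

  def decide(scored, threshold):
      # scored: list of (gesture_id, score) pairs sorted best-first
      best_id, best_score = scored[0]
      if best_score > 1.1 * threshold:
          return best_id
      top3 = [gid for gid, s in scored[:3] if s > 0.95 * threshold]
      for gid in set(top3):
          if top3.count(gid) >= 2:             # 2 of the top 3 agree
              return gid
      return None                              # reject as a likely false positive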

They evaluated the algorithm on twelve participants, who each performed 10 unique gesture classes, entering each class 15 times on the Wiimote. The recognition algorithm took the first 5 entries of each class as training data and was tested on the remaining 10. The recognition rate was between 58% and 98%, with an average of 80%. The scoring heuristic worked, since only 8% of all detected gestures were false positives.

One limitation of the algorithm is that only gestures with an explicitly marked start and end are recognized. Also, the size of the library is a limiting factor, since the computational overhead grows as the library gets larger.
----------------------------------
Commentary
I think this is a good followup to the $1 algorithm. It could use some improvement, since 80% seems a bit low, but since this is 3D recognition, there may be additional problems involved. Overall, I think this is useful.

----------------------------------------
Sven Kratz, Michael Rohs. A $3 Gesture Recognizer - Simple Gesture Recognition for Devices Equipped with 3D Acceleration Sensors. IUI 2010.

Thursday, March 18, 2010

$1 Recognizer

Summary
This paper seeks to create a gesture recognizer that would allow novice programmers to incorporate gestures into their UI. The $1 algorithm is easy, cheap, and usable anywhere. It involves only basic geometry and trigonometry and requires about 100 lines of code. It supports configurable rotation, scale, and position invariance, does not require feature selection or training examples, is resilient to variations in input sampling, and supports high recognition rates.

The algorithm has 4 steps.
  1. Resample the Point Path. It resamples the gestures by splitting the path into N equidistant points.
  2. Rotate Once Based on the Indicative Angle. The indicative angle is the angle formed between the centroid of the gesture and the gesture's first point. The gesture is rotated so that this angle is 0 degrees.
  3. Scale and Translate. The gesture is scaled to a reference square and then translated to a reference point (the origin of the frame).
  4. Find the Optimal Angle for the Best Score. A candidate is compared to each stored template to find the average distance between corresponding points.
The recognizer cannot distinguish gestures whose identities depend on orientation, aspect ratio, or location. Horizontal and vertical lines are distorted by non-uniform scaling. Also, the recognizer does not distinguish gestures based on time.
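
For illustration, here is a rough Python sketch of the four steps under my own simplifications; the published pseudocode differs in details (for example, step 4 searches over candidate rotation angles with a golden section search, which I leave out):

  import numpy as np

  def resample(pts, n=64):
      # Step 1: resample the path into n equidistantly spaced points
      d = np.cumsum(np.r_[0, np.linalg.norm(np.diff(pts, axis=0), axis=1)])
      t = np.linspace(0, d[-1], n)
      return np.column_stack([np.interp(t, d, pts[:, 0]),
                              np.interp(t, d, pts[:, 1])])

  def rotate_to_zero(pts):
      # Step 2: rotate so the centroid-to-first-point angle becomes 0 degrees
      c = pts.mean(axis=0)
      a = np.arctan2(pts[0, 1] - c[1], pts[0, 0] - c[0])
      rot = np.array([[np.cos(-a), -np.sin(-a)],
                      [np.sin(-a),  np.cos(-a)]])
      return (pts - c) @ rot.T + c

  def scale_and_translate(pts, size=250.0):
      # Step 3: scale to a reference square, then move the centroid to the origin
      lo, hi = pts.min(axis=0), pts.max(axis=0)
      pts = (pts - lo) / (hi - lo) * size
      return pts - pts.mean(axis=0)

  def path_distance(a, b):
      # Step 4 (simplified): average distance between corresponding points
      return np.mean(np.linalg.norm(a - b, axis=1))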

The user study consisted of 10 subjects using a Pocket PC with a stylus. They were given a series of gestures to perform at slow, medium, and fast speeds. They compared the recognizer with the Rubine recognizer and Dynamic Time Warping. $1 had 99.02% accuracy. The number of templates affected the recognition error rate: $1 improved as more templates were added, with the error rate dropping from 2.73% with 1 template to 0.45% with 9 templates. Slow and fast gestures produced more errors than medium-speed ones. $1 took 1.6 minutes to run 14,400 tests on 160 gestures.
---------------------------------------
Commentary
I like this algorithm. It is easy and fast. However, there are some drawbacks (i.e., the limitations listed). I think there have to be some drawbacks for a "simple" algorithm: in order to simplify things, you have to leave some things out, otherwise it would be too complicated. I liked the paper since it went into both the quantitative and qualitative aspects of the experiment.

--------------------------------------------------------
Jacob O. Wobbrock, Andrew D. Wilson, Yang Li. Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes. UIST 2007.

American Sign Language Recognition in Game Development for Deaf Children

Summary
The authors of this paper created CopyCat, an American Sign Language game that uses gesture recognition to help young deaf children learn sign language. One of the main problems in this area is continuous, user-independent sign language recognition in classroom settings. They implemented a Wizard of Oz version of CopyCat and collected data from deaf children who used the system. They attempted to overcome the problems in continuous signing, such as clothing and skin tone differences and changes in illumination in the classroom. The dataset consisted of 541 phrase samples and 1959 individual sign samples.

Their solution used color histogram adaptation for hand segmentation and tracking. The children wore colored gloves with wireless accelerometers, and both data streams were used to train hidden Markov models for recognition. The game is interactive: it has tutorial videos to demonstrate the correct signs, live video, and an animated character mimicking what the child is doing. They evaluated the results using the leave-one-out technique, iterating through the children, removing one from the data set, training on the other four children, and testing on the remaining child's data. They achieved between 73% and 92% accuracy.
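
The evaluation is essentially leave-one-subject-out cross-validation. Here is a sketch of the loop, with placeholder train/evaluate functions standing in for the actual HMM training:

  def leave_one_out(samples_by_child, train_models, evaluate):
      # samples_by_child: {child_id: [labeled sign/phrase samples]}
      accuracies = {}
      for held_out, test_samples in samples_by_child.items():
          train_samples = [s for child, ss in samples_by_child.items()
                           if child != held_out for s in ss]
          models = train_models(train_samples)   # train on the other children
          accuracies[held_out] = evaluate(models, test_samples)
      return accuracies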

Commentary
I think this is a great application for the research area. It is helpful for teaching deaf children sign language, especially if their parents aren't fluent in it either. I think future work will improve the interface and allow the programmers to get away from the Wizard of Oz setup. Also, the dataset was filtered to include only the clear signs; in everyday use, that would not be possible. The software also needs to deal with individuality in signs, since not everyone performs a sign the exact same way. There are always variations. Overall, though, I enjoyed this paper. A very useful application for HCI.
---------------------------------------------
Helene Brashear, Valerie Henderson, Kwang-Hyun Park, Harley Hamilton, Seungyon Lee, Thad Starner. American Sign Language Recognition in Game Development for Deaf Children. ASSETS 2006.

An Empirical Evaluation of Touch and Tangible Interfaces for Tabletop Displays

Comments
Franck Norman
Murat

Summary
The authors wanted to conduct a performance study on tangible interfaces and determine whether there is an actual improvement in tangible interfaces compared to other kinds. They compared speed and error rates for a touch interface and a tangible interface. They built a top-projection tabletop system that can support both a touch and a tangible interface. A camera mounted above the table detects tagged objects placed on it using ARTag, and the table can also track multiple fingers. The experiment used a shelf and a wall. The touch interface had a toolbar containing items to drag and drop into the work area, while the tangible interface used real tangible objects. The interface provided several interaction methods: addition, lasso selection, translation, etc. The user study took 40 students, and each had to build a series of 40 layouts using both interfaces. The experiment showed that users were faster with the tangible interface. They also concluded that manipulating the tangible shelves was much easier than the tangible walls. The user preference section indicated that users found the tangible interface easier to use, but had more "fun" with the touch interface. Also, users stated that they were more stressed and irritated by the touch interface.

-----------------------------------
Discussion
I think that this is a pretty good intro study into tangible interfaces. The tangible interface is intuitive and, as the results show, faster. Speaking of results, this paper was also good since it went into depth about the results of the experiment. There were quantitative results (error rate and completion time) and qualitative results (fatigue, ease of use, fun).

---------------------------------------
Aurelien Lucchi, Patrick Jermann, Guillaume Zufferey, Pierre Dillenbourg. An Empirical Evaluation of Touch and Tangible Interfaces for Tabletop Displays. TEI 2010.

Wednesday, March 17, 2010

Non-contact Method for Producing Tactile Sensation Using Airborne Ultrasound

Comments
Franck Norman
Drew

Summary
The authors sought to create a new method of interacting with 3D objects with tactile feedback. Although previous methods have been implemented, one of them, the CyberGlove, is not optimal because the glove touching the skin provides tactile feedback at all times. They proposed using ultrasound to provide tactile feedback. Their method is based on acoustic radiation pressure: when airborne ultrasound is applied to the surface of the skin, almost 99% of the incident acoustic energy is reflected, which removes the need to place an ultrasound-reflective medium on the skin. Their prototype device consists of an annular array of airborne ultrasound transducers, a 12-channel amplifier circuit, and a PC. It was designed to produce a single focal point along the center axis perpendicular to the radiation surface.

They measured the total force using an electronic balance; the measured force was 0.8 gf (gram-force) at 250 mm and 2.9 gf at 0 mm. To measure the spatial resolution, a microphone probe was attached to an XYZ stage. They found that the focal point was about 20 mm in diameter and that the maximum intensity decreases as the distance of the focal point from the array increases. When they measured the temporal properties of the radiation pressure, they noticed that the radiation pressure decreases each half period after its onset.

They did a user study in which participants said that they felt vibrations when the radiation pressure was modulated; when it was constant, they could only feel the pressure switching on and off.

----------------------------------
Discussion
I feel that this experiment is not going to be very useful in future applications. Although the idea is innovative, ultrasound doesn't seem very useful for current interactions. There would also need to be a way to detect the position of the hand without sensors on the hand itself. Also, I would have liked to see more of a user study in the paper; it gives no statistics about what the study was. I assume the participants just put their hand over the array and felt something.

-----------------------------
Takayuki Iwamoto, Mari Tatezono, Hiroyuki Shinoda. Non-contact Method for Producing Tactile Sensation Using Airborne Ultrasound. EuroHaptics 2008. pp. 504-513

Tuesday, March 16, 2010

FreeDrawer - A Free-Form Sketching System on the Responsive Workbench

Comments
Franck Norman
Manoj

Summary
The authors of this paper sought to provide 3D tools for curve drawing and deformation techniques for curves and surfaces. In their setup, the user draws in a virtual environment using a tracked stylus as an input device. According to a workshop on the needs of designers in virtual environments, the modeler should follow certain guidelines:
  1. be useful as a combined tool for the conceptual phase up to a certain degree of elaboration
  2. hide the mathematical complexity of object representations
  3. direct and real time interaction
  4. full scale modeling, large working volume
  5. be intuitive, easy to learn
Their modeler requires the designer to have some drawing skills, which is not too different from the traditional method. They support direct drawing of space curves and of 2D curves projected onto a virtual plane. The modeler's features also include creating curve networks, changing a curve within a network, filling in surfaces, smoothing, sharpening, and dragging curves, sculpting surfaces, and creating surface patches. Their interface uses a hand-held 3D widget consisting of a set of virtual pointers starting at the stylus. Each pointer has a specific function such as copy, move, delete, smooth, or sharpen; the user touches the object with the tip of the corresponding pointer and then presses the stylus button. Their user study consisted of one user who had prior experience with the program. The user drew a seat.
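
The paper's curves are spline-based, so this is only a stand-in, but a simple Laplacian smoothing pass over sampled curve points conveys the kind of operation the "smooth" pointer performs:

  def smooth(points, iterations=1, alpha=0.5):
      # points: list of [x, y, z] samples along a curve; endpoints stay fixed,
      # interior points move part way toward the average of their neighbors
      pts = [list(p) for p in points]
      for _ in range(iterations):
          pts = [pts[0]] + [
              [p[k] + alpha * ((a[k] + b[k]) / 2.0 - p[k]) for k in range(3)]
              for a, p, b in zip(pts[:-2], pts[1:-1], pts[2:])
          ] + [pts[-1]]
      return pts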

Discussion
I think this application is okay for the specific audience they were going for: designers. It is understandable that one of the constraints of using the product well is having drawing skills; if this product is aimed at product designers, those people should already have that skill. I think the user interface could be better. From what I understand, the user moves the stylus and must touch the curve with the proper pointer, which seems quite difficult, especially if there are a lot of curves in the area. Also, they only tested the product on one user. I would like to see stats from multiple users to see how different designers would perform the same task.

______________________________
Source: Gerold Wesche and Hans-Peter Seidel. FreeDrawer - A Free-Form Sketching System on the Responsive Workbench. VRST 2001.