Saturday, May 15, 2010

Ethnography / dancing ideas

Ethnography
I think that for an ethnography, one could study how people whose ideas differ from a forum's majority (e.g., a liberal posting in a far-right forum) are treated in their posts. For example, within a thread, one could measure how quickly minority opinions are answered with hateful remarks. This is very similar to a project that Sarah and I did in the CHI class a few years ago.

Salsa Dancing
Other than an idea similar to TIKL, where students learn the correct moves from vibrotactile actuators that vibrate when they move off beat or fail to closely match the teacher's movement, I don't really see how teaching students to dance can be improved by a haptic device.

Friday, May 14, 2010

XWand: UI for Intelligent Spaces

Comments
Franck
Manoj

Summary
The authors created the XWand, a wireless sensor package that "enables styles of natural interaction with intelligent environments." If users want to control a device, they just point at it and perform simple gestures. The system relies on the intelligence of the environment to determine the user's intention.

The device itself has the following components:
  • 2-axis MEMS accelerometer
  • 3-axis magnetoresistive permalloy magnetometer
  • 1-axis piezoelectric gyroscope
  • FM transceiver
  • flash-programmable microcontroller
  • infrared LED
  • green and red visible LEDs
  • pushbutton
  • 4 AAA batteries
In the user study, pointing accuracy increased when audio feedback was added. The study suggests that users are comfortable when tracking is good, and that audio feedback helps when it is not.
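The paper's sensor fusion is beyond what I can reproduce here, but a rough sketch of the underlying idea of turning raw readings into a pointing direction might look like the following. It assumes full 3-axis accelerometer and magnetometer readings (a simplification of the XWand's actual 2-axis accelerometer, 3-axis magnetometer, and 1-axis gyro) and uses a standard tilt-compensated compass calculation; it is not the authors' algorithm.

```python
import math

def pointing_direction(accel, mag):
    """Rough pitch/yaw estimate from a gravity vector (accelerometer) and a
    magnetic-field vector (magnetometer). Illustrative only."""
    ax, ay, az = accel
    mx, my, mz = mag

    # Tilt from gravity
    pitch = math.atan2(-ax, math.hypot(ay, az))
    roll = math.atan2(ay, az)

    # Tilt-compensated compass heading (yaw)
    mxh = mx * math.cos(pitch) + mz * math.sin(pitch)
    myh = (mx * math.sin(roll) * math.sin(pitch)
           + my * math.cos(roll)
           - mz * math.sin(roll) * math.cos(pitch))
    yaw = math.atan2(-myh, mxh)

    return math.degrees(pitch), math.degrees(yaw)

# Device held level, pointing roughly along the horizontal field direction
print(pointing_direction((0.0, 0.0, 1.0), (0.4, 0.0, -0.3)))   # pitch ~ 0 deg, yaw ~ 0 deg
```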

Analysis
I don't know much about the hardware side of the device. However, it does seem intuitive; we naturally like to point at objects to control them.

Webcam Mouse Using Face and Eye Tracking in Various Illumination Environments

Comments
Franck
Manoj

Summary
In this study, the authors present an illumination recognition technique that combines a K-nearest-neighbor classifier with an adaptive skin model to provide real-time tracking. Their accuracy reached over 92% in different environments. The system tracks face and eye features at 15 fps on a standard notebook platform. It comes with 5 initial environments and lets users define and add their favorite environments to the nearest-neighbor classifier. They ran the Webcam Mouse system on a laptop PC with a Logitech webcam; it takes up about 45% of the system's resources.
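To make the KNN step concrete, here is a minimal sketch. The illumination features (mean R, G, B and brightness of a frame) and the environment labels are made up for illustration; the paper's actual features, training data, and adaptive skin model are not reproduced here.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical per-frame illumination features: mean R, G, B and overall brightness.
X_train = np.array([
    [200, 190, 180, 190],   # bright daylight office
    [120, 110, 100, 110],   # office in the evening
    [ 90, 120, 150, 120],   # bluish fluorescent light
    [220, 180, 140, 180],   # warm incandescent light
    [ 60,  60,  60,  60],   # dim room
])
y_train = ["daylight", "evening", "fluorescent", "incandescent", "dim"]

clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

frame_feature = np.array([[195, 185, 175, 185]])   # features of a new camera frame
print(clf.predict(frame_feature))                   # -> ['daylight']
# The predicted environment would then select/adapt the skin-color model.
```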

Analysis
I think this requires more usability tests to see if their method is robust enough for different environments. I also was not convinced that the KNN classifier was the right tool for this area.

Real-Time Hand Tracking as a User Input Device

Comments
Franck
Manoj

Summary
The authors sought to create an easy-to-use and inexpensive system that facilitates 3D articulated user input using the hands. The system optically tracks an ordinary cloth glove that has a specific pattern on it, which simplifies the pose-estimation problem. First, they construct a database of synthetically generated hand poses, normalized and downsampled to small "tiny" images. They do the same for a query image from the camera and look it up against the database using a distance metric defined between two tiny images, returning the nearest-neighbor pose. They chose a Hausdorff-like distance for this.
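A toy version of that lookup might look like the sketch below. It uses a plain symmetric Hausdorff distance between the non-zero pixels of two small binary images; the paper's actual distance and database encoding are more elaborate, and the random 8x8 "tiny images" here are placeholders.

```python
import numpy as np

def hausdorff(a_pts, b_pts):
    """Symmetric Hausdorff distance between two 2-D point sets."""
    d = np.linalg.norm(a_pts[:, None, :] - b_pts[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def tiny_image_distance(img_a, img_b):
    """Hausdorff-like distance between two small binary pattern images."""
    a_pts = np.argwhere(img_a > 0).astype(float)
    b_pts = np.argwhere(img_b > 0).astype(float)
    return hausdorff(a_pts, b_pts)

def nearest_pose(query, database):
    """database: list of (tiny_image, pose). Return the pose whose image is closest."""
    return min(database, key=lambda entry: tiny_image_distance(query, entry[0]))[1]

# Toy database of five random 8x8 "tiny images" with labeled poses
rng = np.random.default_rng(0)
db = [(rng.integers(0, 2, (8, 8)), f"pose_{i}") for i in range(5)]
query = db[3][0].copy()
print(nearest_pose(query, db))   # -> "pose_3"
```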

Analysis
I agree this is a cost-effective way of tracking the hand. I am uncertain about their choice of distance metric, though I don't know enough about such metrics to judge how efficient they are.

That one there! Pointing to establish device identity

Comments
Franck
Manoj

Summary
The authors sought to identify devices with a pointing gesture, using custom tags and a stylus called the gesturePen. They chose a two-way line-of-sight communication link to interact with devices in a dense computing environment. The pen has an IrDA-compliant infrared transceiver, and the authors developed tags that can be fixed to active and passive devices. The communication flow between the pen, a tag, and the associated device is as follows (a rough sketch in code appears after the list):
  • the user points the pen towards a tag and presses a button on the pen
  • the tag receives the ping message, blinks its light, and sends its identity information to the pen.
  • the pen receives the ID, validates it, and sends it to its attached device.
  • Information is transferred over the network to the other device.
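As a toy illustration of that flow (hypothetical class and method names, not the authors' software), the round trip could be sketched as:

```python
class Tag:
    """Hypothetical tag fixed to a device; stores that device's identity."""
    def __init__(self, device_id):
        self.device_id = device_id

    def on_ping(self):
        print("tag: blink")               # visible feedback to the user
        return self.device_id             # identity sent back over the IR link

class HostDevice:
    def connect_to(self, device_id):
        # Actual data then flows over the normal network, not the IR link
        print(f"host: opening network connection to {device_id}")

class GesturePen:
    """Hypothetical pen: pings a tag, then hands the ID to its host device."""
    def __init__(self, host_device):
        self.host = host_device

    def point_and_click(self, tag):
        device_id = tag.on_ping()         # IR ping + reply
        if device_id:                     # validate the returned ID
            self.host.connect_to(device_id)

pen = GesturePen(HostDevice())
pen.point_and_click(Tag("printer-42"))
```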
They tested the device with some user studies. The first was a cognitive-load task, where users played a jigsaw puzzle on a handheld computer using the gesturePen as a normal stylus. A participant would then be interrupted and asked to choose a tag either by reading its IP address label and choosing it from a list, or by pointing at the tag with the pen and clicking a button. The next study was a mobile-environment task, where the participant had to select a target device; the method choices were the same as in the first test. For the most part, the gesturePen was well suited to dynamic ubiquitous computing environments. Users learned the system quickly and became comfortable using it fast.

Analysis
It seems like an innovative way of selecting a device, and I'm surprised it is not more common now. The only problems I see are the limited range and possible confusion if two devices are close together. Either way, it seems like you would need to be close for this device to be accurate.

Human/Robot Interfaces

Comments
Franck

Summary
The paper is essentially about designing a system that can recognize gestures interactively and learn new gestures from only a few training examples.

Their approach is automated generation and iterative training of a set of Hidden Markov Models. These models are assumed to be discrete. For gesture recognition, the process involves reducing the data from the glove to a sequence of discrete observable symbols.

The procedure for interactive learning:
  1. the user makes a series of gestures.
  2. The system segments the data stream into separate gestures and tries to classify each one. If it is certain of the classification, it performs the associated action; if not, it asks the user to confirm.
  3. The system adds the encoded gesture to the example list.

They use the Baum-Welch algorithm to train the HMMs and to automatically update them as new examples arrive.

They decided to focus on gesture recognition and on how to generate new HMMs. They solve the new-HMM problem by beginning with one or a small number of examples, running Baum-Welch until it converges, then iteratively adding more examples and updating the model with Baum-Welch after each one.

To process the signal, they needed to represent the gesture as a sequence of discrete symbols, so they treat the glove data as a one-dimensional sequence of symbols. They chose vector quantization as the algorithm to preprocess the data.

The vector quantizer encodes a vector by returning the index of the codebook vector closest to it. The preprocessor is implemented as a data filter, so a symbol is sent to the recognition system as soon as enough data has been read to generate it.
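A minimal sketch of that codebook lookup, with a made-up 3-D feature space and codebook standing in for the real glove features:

```python
import numpy as np

def quantize(sample, codebook):
    """Return the index of the codebook vector closest to the sample.
    A minimal vector-quantization step; the paper's codebook and glove
    features are of course different."""
    dists = np.linalg.norm(codebook - sample, axis=1)
    return int(np.argmin(dists))

# Toy codebook of 4 prototype "hand shapes" in a 3-D feature space
codebook = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
])

glove_frames = np.array([[0.1, 0.05, 0.0], [0.9, 0.1, 0.05], [0.9, 1.1, 0.9]])
symbols = [quantize(f, codebook) for f in glove_frames]
print(symbols)   # e.g. [0, 1, 3] -- the discrete symbol stream fed to the HMMs
```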

After the data is preprocessed, it is evaluated by all HMMs for the recognition process and is used to update the parameters of the proper HMM. They used 5-state Bakis HMMs, in which a transition from a state can only go to that state or the next 2 states. They also defined a confidence measure on the returned value: if it is negative, the model is considered correct; if it is between -1 and 1, the classification may be wrong; and if it is less than -2, the classification is certain.
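The Bakis (left-to-right) constraint is easy to picture as a transition matrix in which state i can only reach states i, i+1, and i+2. Below is a small sketch that builds such a matrix with uniform placeholder values; the emission model, the confidence measure, and Baum-Welch re-estimation are omitted.

```python
import numpy as np

def bakis_transition_matrix(n_states=5, max_jump=2):
    """Left-to-right (Bakis) transition matrix: from state i you may stay
    in i or move at most `max_jump` states forward. Uniform initial values;
    Baum-Welch training would re-estimate them."""
    A = np.zeros((n_states, n_states))
    for i in range(n_states):
        allowed = list(range(i, min(i + max_jump + 1, n_states)))
        A[i, allowed] = 1.0 / len(allowed)
    return A

print(bakis_transition_matrix())
```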

They tested on 14 letters of the sign-language alphabet. Since they did not use full 6D measurements of the hand, they chose gestures that would not be confused with one another. The results show that their algorithm is reliable.

Discussion
I think their method is much better than the one I used in my project. However, I think they somewhat took the easy way out in their testing by using only the letters of the alphabet that would be easiest to distinguish. I would have liked to see how it works with all the letters.

Real-Time Robust Body Part Tracking for Augmented Reality Interface

This paper seeks to provide an interface that tracks body parts without limiting the user's freedom. The system can recognize whether the user wears long or short sleeves. They use a calibrated camera to obtain images of the hands, head, and feet, and transform the detected 2D body parts into an approximate 3D posture. Their algorithm is the following (roughly paraphrased here):
  • obtain a foreground image by deleting the background and the user's shadow from the original image
  • detect a face in the image using the face texture
  • extract contour features
  • track the user's head using a particle filter
  • detect and track the two hands by segmenting the skin-blob image
  • detect and track the feet using the contour of the lower body, and estimate the 3D body pose
  • extract meaningful gestures from the position of the right hand
  • visualize all augmented objects and the user
They performed an experiment evaluating 2D tracking performance with short and long sleeves separately, using a BeNature system that recognizes simple gestures. They calculated an error of 5.48 pixels when the user wears long sleeves and 9.16 pixels with short sleeves.

Analysis
I think this is quite a unique method of tracking the human body, or at least the hands, feet, and head. I like the robustness of the system in detecting whether the user is wearing long or short sleeves. However, I would like to see more user studies to see if this can be used for other purposes.

Liquids, Smoke, and Soap Bubbles - Reflections on Materials for Ephemeral User Interfaces

This paper looked at using bubbles as a form of human-computer interaction. The authors call these ephemeral user interfaces, whose properties can be used for "novel playful and emotionally engaging interactions." The system includes a round transparent tabletop surface about 20 inches in diameter with a thin layer of dark liquid on top. After soap bubbles are blown onto the surface, their position and movement can be tracked by a camera; the bubbles leave a visible ring on the surface of the glass plate. A user can blow on a bubble or push it with a hand. The authors suggest this type of system could be used in the home and for entertainment, for example as "buttons on demand" and ambient displays.

Analysis
I don't really see how this interface can be useful; it seems too short-lived. Moving the bubbles is a big hassle, and accuracy (blowing on the bubbles to move them) seems to be lacking, while the other method requires a bit of luck and finesse. I don't think this will be a good method of interaction.

Thursday, May 13, 2010

Recent Developments and Applications of Haptic Devices

Summary
Haptic feedback is becoming more prevalent in society as technology advances. The author starts out by defining some vocabulary. Force feedback links the user to the computer by applying forces on the user. Tactile feedback is sensed by human receptors lying near the surface of the skin. "Haptic feedback" is now widely used to include both tactile and force feedback. Degrees of freedom refers to the number of rotations and translations the device utilizes. Actuators allow the device to exert force on the user. He then lists the different types of devices that exist:
  • desktop devices
  • tactile devices
  • 2 and 3 degree of freedom desktop devices: includes many game accessories.
  • 5-7 degree of freedom desktop devices: pen devices
  • haptic feedback gloves
  • arm exoskeleton haptic devices
  • workbenches
  • human-scale haptic devices
  • motion platforms
  • locomotion interfaces
Discussion
There are a lot of haptic feedback devices. I never knew I was using a haptic device until I read this list, mainly the 2 and 3 DOF devices.
Summary
The paper describes 2 studies that analyze the effect of design choices on the kinds of interactions performed and on learning opportunities. One of the studies used an interactive tabletop with an LED-illuminated glass surface. Many plastic objects were used as input devices; each has a marker on its underside and is tracked when placed on the table. Different digital effects are generated on the screen when different objects are recognized. The other study used Wiimotes, with the goal of exploring the use of tangible "exertion interfaces" to understand the concepts of motion and acceleration through body-based interaction. As the Wiimote moves around, a different effect is generated. Technically, users don't have to point the Wiimote at the screen, but to see the visual feedback they need to look at it.

The results show how the different interactions with the system influenced the kinds of learning opportunities promoted. The location of the representations was found to have a direct impact on where the children focused and on how aware they were of each other's actions. Observing and directing peers also contributed to learning.

Analysis
I like the idea of tangible interfaces; I think we read a paper dealing with them earlier in the semester. However, I would like to see more applications of tangible interfaces in learning.

EyeDraw: Enabling Children with Severe Motor Impairments to Draw with their Eyes

Summary
The authors wanted to help children with disabilities draw. Although these children are physically handicapped, they can still move their eyes. Using eye-tracking technology, the authors built two versions of EyeDraw and tested both non-disabled and disabled people with each. The first version had a minimal set of features: tools for drawing lines and circles, an undo button, a grid of dots to help the user dwell at a chosen location, and a facility to save and retrieve drawings. Version 2 was implemented after feedback from the first test and added new features; for example, users can stop the grid of dots so they can look around without accidentally issuing commands. Although both groups enjoyed using both versions, the disabled group had more success with the second one.

Analysis
I feel this is an interesting way to help handicapped people draw. However, my experience with eye trackers suggests it would not be very accurate; of course, I do not know how precise the eye tracker in their lab is. It also seems that after a while, users will get tired or get a headache.

Coming to Grips with the Objects We Grasp

Summary
The authors wanted to create a wrist-worn sensor that can read nearby RFID tags and detect the wearer's gestures in order to identify the interaction. They used a Porcupine sensor, an accelerometer-based module that allows power-efficient capture of inertial data and includes a real-time clock and calendar chip. To read the RFID tags, the M1-Mini from SkyeTek was chosen. They performed a box test to evaluate wrist-worn RFID antennas and found that a reading rate of 1 Hz balances capturing tags against saving power. They tested the sensing in an hour-long gardening session. They also tested how long the battery would last: after charging overnight, the longest continuous log lasted 18 hours, and the battery was never depleted. They estimate being able to run the device for at least 2 days continuously.

Commentary
Although this seems like an interesting way to detect objects and interactions, it still seems awkward to wear the wrist sensors. Also, this only works if everything has an RFID tag, which I'm not too sure will happen.

---------------------------------------------------
E. Berlin, J. Liu, K. van Laerhoven, and B. Schiele. Coming to Grips with the Objects We Grasp: Detecting Interactions with Efficient Wrist-Worn Sensors. 2010

Monday, May 10, 2010

Natural Gesture/Speech HCI

Summary
The authors in this paper examined hand gestures made by a weather person narrating in front of a weather map. Since the gestures are embedded in the narration, they have plenty of data from an uncontrolled environment to study the interaction between speech and gesture.

They implemented a continuous HMM-based gesture recognition framework. First they classified gestures as pointing, area, or contour gestures, then chose a parameter space. Since this requires capturing the hand motions, they performed color segmentation on the video input stream. They then determine the distance between the face and the hands and the angle from the vertical, and use two multivariate Gaussians to model the output probability distribution at every state of each HMM. In testing, continuous gesture recognition had a lower recognition rate than isolated recognition.
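A simplified sketch of the two per-frame features described (hand-to-face distance and angle from the vertical), with image coordinates and example values that are purely illustrative:

```python
import math

def hand_features(face_xy, hand_xy):
    """Two per-frame features like those described: distance between the
    face and a hand, and the hand's angle from the vertical through the
    face. A simplified sketch of the feature extraction only."""
    dx = hand_xy[0] - face_xy[0]
    dy = hand_xy[1] - face_xy[1]
    distance = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dx, dy))   # 0 deg = straight below the face
    return distance, angle

print(hand_features(face_xy=(320, 100), hand_xy=(420, 260)))
```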

They also did a co-occurrence analysis of different gestures with spoken keywords. With this, they were able to improve the continuous gesture recognition result based on the analysis of gestures with keywords.

-------------------------------------------------------
Discussion
I think this is a good study of how speech and gestures are related. It sounds promising, but the limited case studies could be expanded upon. This is still a proof of concept, so there was no real user study.

The Wiimote with Multiple Sensor Bars

Summary
The authors wanted to develop a tracked virtual-reality controller that can be used in low-budget, non-technical settings. They used 5 wireless sensor bars with 2 IR sources each, a Wiimote controller, and a Nunchuk, placing the bars vertically along a 2-display VR environment. The constraints include keeping 2 IR sources visible at all times, keeping 2 sensor bars visible when moving from one bar to another, and having no more than 4 sources visible at one time. For each iteration (a rough sketch of the grouping step follows the list):
  1. The software groups the tracked IR points to determine which pairs come from the same bar, then creates a new dynamic model.
  2. It determines whether the new model confirms or contradicts the existing model.
  3. The software aligns the dynamic model with the static model, using the previous alignment as a guide.
  4. The software derives the scaling function using the IR points and their corresponding screen coordinates.
  5. It computes the cursor location.
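The grouping in step 1 can be imagined as pairing up IR points whose separation matches the known spacing between a bar's two emitters. The sketch below does exactly that with made-up coordinates and a tolerance I picked arbitrarily; it is not the authors' actual grouping or model-fitting code.

```python
import itertools
import math

def pair_ir_points(points, bar_spacing, tolerance=0.2):
    """Group tracked IR points into candidate sensor-bar pairs by looking
    for pairs whose separation is close to the known emitter spacing."""
    pairs = []
    used = set()
    for (i, p), (j, q) in itertools.combinations(enumerate(points), 2):
        if i in used or j in used:
            continue
        d = math.dist(p, q)
        if abs(d - bar_spacing) / bar_spacing < tolerance:
            pairs.append((p, q))
            used.update((i, j))
    return pairs

# Four visible IR points: two bars roughly 0.2 units apart in camera space
pts = [(0.10, 0.50), (0.30, 0.50), (0.70, 0.52), (0.90, 0.52)]
print(pair_ir_points(pts, bar_spacing=0.20))
```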
They performed 3 tests. The first was aimed at determining whether the visual disruption of the MSB array in front of the display would be distracting. The study had 12 participants, who were given a navigation task of locating characters in response to audio cues and a manipulation task of stacking boxes on top of each other using the game's physics engine. No one said the bars were distracting.

The second phase was aimed at establishing the usability of the complete MSB interface with a demanding FPS game; the authors wanted to see how often the tracking reset. 8 participants completed two early levels from Half-Life 2. Some participants had trouble with vertical aiming and with resetting.

The third phase involved moving the seat further back inside the coverage area. Some of the participants came from the second study and had had high reset-error rates. They were asked to replay the first level plus a custom level where most of the enemies were at the horizon. The reset error rate dropped for all participants.

Discussion
I like this style of FPS interaction. The way they used the Wiimote was very intuitive, since it is like holding the gun. However, two-handed guns may not feel right, since the other hand must stay free for movement.

Sunday, May 9, 2010

The PepperMill

Summary
The authors of this paper sought to create a device that can be powered by the physical effort required to operate it. Their system uses a small DC motor as a rotary input sensor that can also create a temporary 3.3 V power supply. The first stage of the circuit determines the direction of the input. The second stage rectifies the output of the motor via a diode bridge. The third stage uses a pair of resistors as a voltage divider, reducing the variable output voltage to a level that can be sampled directly by an analog-to-digital converter in a microcontroller. The final stage uses a 3.3 V low-dropout regulator to stabilize the variable voltage to a level that is readily usable by the microcontroller.
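The third stage is just a resistive divider; as a worked example with made-up component values (not the ones in the paper):

```python
def divided_voltage(v_in, r_top, r_bottom):
    """Output of a simple resistive voltage divider, as used to bring the
    variable motor voltage into ADC range. Component values are illustrative."""
    return v_in * r_bottom / (r_top + r_bottom)

# e.g. a fast turn producing ~12 V, scaled down below a 3.3 V ADC reference
print(divided_voltage(12.0, r_top=33_000, r_bottom=10_000))   # ~2.79 V
```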

They created a prototype device called the Peppermill. When the user turns the knob, the microcontroller powers up and samples the inputs from the supply circuit and the states of the 3 buttons, then transmits this as a single wireless packet. They tested it with a simple video browsing and playback application, used much like a remote control. They found that users liked the Peppermill. Some would turn the knob too slowly at first, but they instinctively knew to turn faster until it worked.

Discussion
I think this is a good idea; there are times where generating power physically would be more practical than relying on a battery. However, I doubt there would be much use for this, since it seems to work only for appliances that don't require large amounts of power. Also, I would have liked to see a better user study than the one provided.

------------------------------------------------------------------------------
Nicolas Villar and Steve Hodges. The Peppermill: A Human-Powered User Interface Device. TEI 2010.

User-Defined Gestures for Surface Computing

Summary
The authors wanted to see what gestures non-technical users would produce when shown the effect of a gesture before being asked to perform one for it. They collected 1080 gestures from 20 participants, paired with 27 commands performed with 1 and 2 hands. They found that users do not really care about the number of fingers they employ, that 1 hand is preferred to two, that desktop idioms strongly influence users' mental models, and that there is a need for on-screen widgets, since for some commands it is hard to come up with a common gesture. They used a Microsoft Surface prototype with a C# application to present recorded animations and speech.

Discussion
I think this is an interesting method of interaction. Although I don't know much about the Microsoft Surface, this looks promising. Also, I wonder how this study would look if they used children or people from other cultures.
------------------------------------------------
Jacob O. Wobbrock, Meredith Ringel Morris, Andrew D. Wilson. User-Defined Gestures for Surface Computing. CHI 2009.

Whack Gestures

Summary
The authors introduce Whack Gestures so that people can interact with devices with minimal attention and without taking the device out. They introduce a small vocabulary of gestures for interacting with a small mobile device. To counter the possibility of false positives, the user must use a pair of whacks to frame the gesture to be recognized. There were 3 different gestures: whack-whack, whack-whack-whack, and whack-wiggle-whack.
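A whack registers as a sharp spike in acceleration magnitude, so a naive detector might just threshold the signal with a short refractory period, as in the sketch below. The threshold and timing values are invented for illustration; the actual system used a trained recognizer on its wearable sensor data.

```python
import numpy as np

def detect_whacks(accel_mag, threshold=3.0, refractory=5):
    """Indices of samples whose acceleration magnitude exceeds a threshold,
    with a refractory period so one whack is not counted twice."""
    whacks, last = [], -refractory
    for i, a in enumerate(accel_mag):
        if a > threshold and i - last >= refractory:
            whacks.append(i)
            last = i
    return whacks

signal = np.ones(60)
signal[[10, 18, 45, 52]] = 5.0          # four sharp spikes: whack-whack ... whack-whack
print(detect_whacks(signal))            # -> [10, 18, 45, 52]
```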

They used a Mobile Sensor Platform, which is small enough to be attached to the waist. They tested the system with 11 users, who wore it for 2 hours each and then performed the 3 gestures 3 times each. The results showed a 97% true positive rate.

Discussion
I think this is an interesting way to solve the problem of interacting with a device without taking it out. However, whacking the device may not be quiet enough. I wonder if the device is sensitive to tapping; if the point was to interact discreetly, tapping would be quieter than whacking.

--------------------------
Scott E. Hudson, Chris Harrison, Beverly Harrison, Anthony LaMarca. Whack Gestures: Inexact and Inattentive Interaction with Mobile Devices. TEI 2010.

Gameplay Issues in the Design of 3D Gestures for Video Games

Summary
The authors sought to identify points to consider in the design of 3D gestures in space as a means of interacting with video games. They tested 4 game scenarios:
  • tilt: They used the game Neverball, where the goal is to send the ball to the exit by tilting a wireless controller.
  • Alarm: using data from the accelerometer, the alarm demonstrator will emit a loud ringing sound should an acceleration threshold be exceeded.
  • Heli: In a 2D helicopter game, the user must move the helicopter up and down to avoid boulders by shaking the controller.
  • Battle of the Wizards: The user uses the wireless controller to gesture runes in the air for offensive and defensive spells.
The test involved 2 people: one male with lots of gaming experience and one female with limited experience. The results showed that users prefer games with simple gestures so they can play immediately instead of learning the gestures first.

Discussion
I think this is a decent look at some user-interaction issues, though I would have preferred a much more in-depth user study. I agree that simple gestures are much better overall for games in the marketplace: not everyone will be willing to learn complex gestures, and a game built only around complex gestures would alienate a large segment of the market.

-----------------------------------------------------------------
John Payne, et al. Gameplay Issues in the Design of 3D Gestures for Video Games. CHI 2006.

3D Gesture Recognition for Game Play Input

The authors sought to use gestures to provide intuitive and natural input mechanics for games. They created a game called Wiizard, where the goal is to cause as much damage to an enemy as possible while taking as little as possible. Users cast spells by performing gestures; the gestures are placed in a queue, which is released when the user casts the spell. They created a user interface with a bar showing the state of all gestures available to the user, the playing field, and the queue for each player. The software uses a Wii controller, the gesture recognition system, and a graphical game implementation. Each gesture is a collection of observations taken from the Wii controller's accelerometer data. They created a separate HMM for each gesture to be recognized; the probability of a gesture is computed from the distribution over the observations and the hidden states.

The user study consisted of 7 users performing the gestures from the game over 40 times each. Recognition was 90% with 10 states and over 93% with 15.

Discussion
I think this is a unique method for game interaction, similar to where our project is going. However, 40 repetitions per gesture seems a bit excessive. Also, they said the user interface shows both players' queues, which may be confusing. I wonder how this would work with less training data.

----------------------------------------------------------
Louis Kratz, Matthew Smith, Frank J. Lee. Wiizards: 3D Gesture Recognition for Game Play Input. FuturePlay 2007.

Hidden Markov Models

Summary
The authors focused on identifying the elements necessary to use an HMM system regardless of the sensor device being used; it works as long as the device provides information about the 3 axes of motion. In a hidden Markov model, a sequence is modeled as the output of a stochastic process progressing through discrete time steps, with a symbol from the alphabet emitted at each step; only the sequence of emitted symbols is observed. Their setup uses a 1:1 ratio of states to alphabet symbols. They carved the acceleration space into sub-cubes, which gave alphabet sizes of 27, 64, and 125. They chose 27, since recognition time decreased and they were able to recognize 800 gestures per second. They found that 250 samples in a training set gives good detection results.
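The sub-cube alphabet is easy to sketch: each 3-axis sample falls into one of 3x3x3 = 27 cells, and the cell index is the emitted symbol. The bin range below is a placeholder, not the paper's calibration.

```python
import numpy as np

def to_symbol(sample, lo=-1.0, hi=1.0, bins=3):
    """Map a 3-axis accelerometer sample to one of bins**3 symbols
    (27 for 3 bins per axis), i.e. the sub-cube it falls into."""
    idx = np.clip(((np.asarray(sample) - lo) / (hi - lo) * bins).astype(int), 0, bins - 1)
    return int(idx[0] * bins * bins + idx[1] * bins + idx[2])

stream = [(-0.9, 0.1, 0.8), (0.0, 0.0, 0.0), (0.7, -0.6, 0.2)]
print([to_symbol(s) for s in stream])   # discrete symbol stream fed to the HMM
```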

In testing, the user would press a button, perform the gesture, and let go. Gestures of fewer than 27 symbols were considered short, and longer ones were considered long. They determined that left-hand and right-hand data sets were different enough to throw off the results.

Discussion
I still don't know much about HMMs; it would take a while to fully understand them. That's the reason I didn't use them in my robotics project.
------------------------------------------------------
Anthony Whitehead, Kaitlyn Fox. Device Agnostic 3D Gesture Recognition using Hidden Markov Models. GDC Canada 2009.

Gesture-Based Control in Multi-Robot Systems

Summary
The authors designed a way to use hand gestures to control a multi-robot system. Hidden Markov models are used to recognize gestures from a CyberGlove. There were 6 gestures: opening, opened, closing, pointing, waving left, and waving right. They added states for each of these, plus a wait state. They also use a gesture spotter, which selects the gesture corresponding to the last state with the highest score, or the wait state.

They ran some tests on the HMM. Using codewords with gestures and non-gestures, the HMM with the wait state recognized gestures with 96% accuracy and 1.6 false positives per 1000.

To control the robots, there are 2 modes of interaction. Local robot control lets the user control the robot from the robot's point of view: if the user points forward, the robot moves forward, regardless of its orientation. Global robot control lets the user point at where he wants the robot to go.

Discussion
This work is similar to what I am doing for my robotics project: I am focusing on single-robot control and the global robot control described here, and I only use a 1-nearest-neighbor classifier to recognize gestures. This paper is very interesting. I would have liked to expand my work to use HMMs and a 6D tracking system so I could get more accurate readings, but HMMs were too difficult to learn in time and the Flock did not work.

------------------------------------------------------------
Soshi Iba, J. Michael Vande Weghe, Christiaan J. J. Paredis, and Pradeep K. Khosla. An Architecture for Gesture-Based Control of Mobile Robots.

Saturday, May 8, 2010

HCI with Documents

Summary
The authors of this paper wanted to make a new interface for working with documents on a computer that would allow the user to be immersed. They accomplished this by letting users interact naturally with gestures and postures, and by creating a program that also lets users teach it the gestures to be recognized.

According to the paper, users who encounter environments that resemble the real world can use their natural ability to remember spatial layout and to navigate in 3D environments, which allows them to multitask. The program has multiple visualization methods for the documents. In PlaneMode, users can type multiple search queries into several panels. Documents that match more search queries are moved closer to the user, important documents pulse to catch the user's attention, and the documents' colors indicate their categories.

In ClusterMode, the most relevant documents are moved to the front and center of the plane, and documents are clustered with like colors from the search queries. In one variation, the clusters are rings in which the documents rotate. Two possible ways to connect the rings are to connect rings of one color to rings that also contain that color (e.g., blue only to blue and red), or to connect clusters that have the same colors except for one additional color by a line of that color.

Relations between documents can be shown by drawing semi-transparent green boxes around related documents.

To interact with documents, users wear a P5 data glove and perform gestures that have actions associated with them. Since the data can be noisy due to the P5's low cost, a filter is needed to make sure the data is accurate. A gesture needs to be held for 300 to 800 milliseconds to be recognized. There is also a gesture manager, which keeps track of known postures and provides the ability to manipulate the gesture database.
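The 300-800 ms hold requirement is essentially dwell-based filtering; a minimal sketch (hypothetical class, modeling only the 300 ms lower bound) might look like this:

```python
import time

class DwellRecognizer:
    """Fires a posture only after it has been held continuously for at
    least `min_hold` seconds, to filter out noisy glove readings."""
    def __init__(self, min_hold=0.3):
        self.min_hold = min_hold
        self.current = None
        self.since = None

    def update(self, posture, now=None):
        now = time.monotonic() if now is None else now
        if posture != self.current:
            self.current, self.since = posture, now
            return None
        if now - self.since >= self.min_hold:
            return posture          # held long enough: trigger its action
        return None

r = DwellRecognizer()
print(r.update("fist", now=0.0))    # None: just started
print(r.update("fist", now=0.4))    # "fist": held for 400 ms
```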

Discussion
I think this is a unique way to interact with computer documents. Some of the gestures do not seem especially intuitive, but even so, users can change the gestures to something they feel more comfortable with. Nevertheless, I think the idea is a good way to improve the way we organize files on the computer.
-------------------------------------------------------------------------------------
Andreas Dengel, Stefan Agne, Bertin Klein, Achim Ebert, Matthias Deller. Human-Centered Interaction with Documents. HCM'06.