Wednesday, September 28, 2011

Paper Reading #13: Combining Multiple Depth Cameras and Projectors for Interactions On, Above, and Between Surfaces

References
Combining Multiple Depth Cameras and Projectors for Interactions On, Above, and Between Surfaces by Andrew D. Wilson and Hrvoje Benko.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.


Author Bios

  • Andrew D. Wilson (Andy) holds a BA from Cornell University and an MS and PhD from the MIT Media Laboratory.  He currently works as a senior researcher for Microsoft, focusing on applying sensing techniques to enable new styles of HCI. 
  • Hrvoje Benko is currently a researcher at Adaptive Systems and Interaction at Microsoft Research.  He holds a PhD from Columbia University for work on augmented reality projects that combine immersive experiences with interactive tabletops.  
Summary
Hypothesis
The LightSpace system can overcome current limitations of physical confinement and allow for more in-depth interaction through the space of an entire room.
Methods
A LightSpace prototype was showcased at a convention over the course of three days.  During this time several hundred people had access to it and were given leave to explore it as they wished.  The researchers observed the interactions and recorded their observations.
Results
The researchers noted that, although there is no technical limit to the number of people who could use a single room at a time, the practical cap is 6 people.  Realistically, even 2 or 3 can slow down processing significantly, and as more people become involved it gets harder to actually discern between them.  In addition, although users had no trouble with basic interaction and manipulation, it took most of them some practice to be able to pick up and 'hold' objects.
Contents

The authors present LightSpace, an interactive system that essentially takes the concept of a touch screen and applies it over an entire room.  They emphasize several functional themes such as "Surface everywhere: all physical surfaces should be interactive displays", "The room is the computer", and "Body as display: graphics may be projected onto the user’s body to enable interactions in mid-air".  LightSpace supports four interactions, namely Simulated Interactive Surfaces, Picking Up Objects, Through-Body Transitions Between Surfaces, and Spatial Menus.
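To make the room-as-computer idea a little more concrete, here is a minimal sketch of the calibration step a system like this depends on: every depth camera and projector is registered to a shared room coordinate system, so a 3D point seen by one camera can be lit up by any projector.  This is my own illustration, not the authors' code, and the matrices and hand position below are hypothetical placeholders.

# Minimal sketch of mapping a depth-camera point into a projector's image
# via a shared room coordinate frame. All calibration values are made up.
import numpy as np

# 4x4 rigid transform taking depth-camera coordinates to room coordinates
# (would come from a one-time calibration step).
T_cam_to_room = np.eye(4)

# 3x4 projection matrix for one projector (intrinsics * extrinsics),
# mapping homogeneous room coordinates to projector pixel coordinates.
P_room_to_projector = np.hstack([np.eye(3), np.zeros((3, 1))])

def camera_point_to_projector_pixel(p_cam):
    """Map a 3D point from a depth camera into a projector's image."""
    p_cam_h = np.append(p_cam, 1.0)              # homogeneous coordinates
    p_room = T_cam_to_room @ p_cam_h             # into the shared room frame
    uvw = P_room_to_projector @ p_room           # into projector image space
    return uvw[:2] / uvw[2]                      # perspective divide -> pixel

# Example: project graphics at the 3D location of a tracked hand.
hand_cam = np.array([0.12, -0.30, 1.85])         # meters, in camera space
print(camera_point_to_projector_pixel(hand_cam))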

Discussion
Overall, I was very impressed with the idea and construction of the project, and I think that they did a very good and thorough job of research and implementation.  However, I am disappointed by the lack of a user study in this paper.  It may be that the authors wanted to focus on the features and design of the system itself rather than on testing simply because they had not yet had time to do much of it.  However, it would have been beneficial to have more of a breakdown of user reactions to some of the specific elements mentioned earlier in the paper.  

Monday, September 26, 2011

Paper Reading #12: Enabling Beyond-Surface Interactions for Interactive Surface with an Invisible Projection

References
Enabling Beyond-Surface Interactions for Interactive Surface with an Invisible Projection by Li-Wei Chan, Hsiang-Tao Wu, Hui-Shan Kao, Ju-Chun Ko, Home-Ru Lin, Mike Y. Chen, Jane Hsu, Yi-Ping Hung.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.


Author Bios:  
Li-Wei Chan is currently a PhD student at the National Taiwan University.  He holds a Master's degree in Computer Science from the same university and a bachelor's degree from Fu Jen Catholic University.  
Hsiang-Tao Wu, Hui-Shan Kao, and Home-Ru Lin are students at the National Taiwan University.  
Ju-Chun Ko is also a PhD student at the National Taiwan University in the Computer and Information Networking Center. 
Mike Chen is a professor in the Department of Computer Science at the National Taiwan University.  
Jane Hsu is a professor of Computer Science and Information Engineering at National Taiwan University. 
Yi-Ping Hung is a professor in the Graduate Institute of Networking and Multimedia at National Taiwan University. He also holds a Master's and PhD from Brown University.


Summary
Hypothesis
Using a programmable infrared technique, it is possible to support interaction with a system beyond the simple surface of the display.
Methods
The tabletop system design consists of several components.  There is an invisible light projector, which allows them to display invisible content for use in realizing 3D localization.  The projector was converted to infrared from a standard DLP projector.  Another component is the table surface, which consists of both a diffuser layer and a glass layer.  The diffuser layer is placed on top of the touch-glass layer in order to obtain the best quality projections from the table.  Additionally, there are several techniques used to enhance the functionality.  For example, printed markers cannot adapt to the 3D positions of the cameras, so the projected markers are resized according to each camera's observing position, letting every camera see markers of optimal size during interaction.  
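As a rough illustration of that last point, the sketch below (my own, not the paper's implementation) scales a projected marker with the observing camera's distance so that the marker stays a roughly constant size in the camera image.  The focal length and target size are made-up numbers.

# Adaptive marker sizing sketch: pick a physical marker size that appears
# roughly target_pixels wide in a camera at the given distance, using the
# pinhole approximation pixels ~= focal_length * size / distance.
def marker_side_mm(camera_distance_mm, target_pixels=60,
                   focal_length_px=550.0):
    """Return a marker side length (mm) for a camera at this distance."""
    return target_pixels * camera_distance_mm / focal_length_px

# A camera held close to the table gets small markers, one held far above
# the table gets large ones.
print(marker_side_mm(300))    # ~33 mm when 30 cm away
print(marker_side_mm(1200))   # ~131 mm when 1.2 m away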
Results
Users were quick to note that they could only see the bottom part of a building and wanted to be able to lift the view to see the upper parts.  However, the i-m-View would get lost in the space above the table system.  Some users would flip the i-m-View on end to obtain a portrait view of the map scene, but this was not actually supported by the system.  Overall, users reported positive feedback regarding the i-m-Lamp, which was the more stationary fixture.  However, some users reported that they would like to be able to use the i-m-Flashlight as a sort of remote mouse with the ability to drag the map directly.  
Contents
The paper set out to present a tabletop display that gave users a unique 3D level of interaction with the system.  Using an infrared projector they were able to display the image and track user interaction.


Discussion
In my opinion, the authors of the paper did manage to pull off a very good proof of concept as far as demonstrating the 3D infrared interaction.  There were some shortcomings with the overall system, such as the i-m-View getting 'lost' or the i-m-Flashlight not behaving entirely to the users' satisfaction, but these are relatively minor compared to how compelling the resulting product is.  I hope that some sort of at-home version of this technology becomes available in the near future, because I think it could have application in all sorts of fields.  Medicine, education, entertainment, and many other areas could benefit from it.  

Paper Reading #11: Multitoe: High-Precision Interaction with Back-Projected Floors Based on High-Resolution Multi-Touch Input

References: Multitoe: High-Precision Interaction with Back-Projected Floors Based on High-Resolution Multi-Touch Input by Thomas Augsten, Konstantin Kaefer, Rene Meusel, Caroline Fetzer, Dorian Kanitz, Thomas Stoff, Torsten Becker, Christian Holz, and Patrick Baudisch.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.  
Author Bios: 
Thomas Augsten and Konstantin Kaefer are currently working towards a Master's degree in IT Systems at the University of Potsdam in Germany. Christian Holz is a PhD student in HCI, also at the University of Potsdam.
Patrick Baudisch is a professor in Computer Science at the Hasso Plattner Institute, and Rene Meusel, Caroline Fetzer, Dorian Kanitz, Thomas Stoff, and Torsten Becker are all students at the Hasso Plattner Institute.

Summary
Hypothesis
Current touchscreen interfaces and devices are limited in size and, therefore, in content.  The authors believe that a solution lies in creating an interface with a larger surface area and using feet as the interaction agents.
Methods
The first part of the study was designed to determine how users interact with the floor and the best method for distinguishing between intentional user action and regular walking or standing.  The participants were told to activate "buttons" and their methods and techniques were observed and recorded.  The activation techniques were also applied to the idea of invoking a menu.  The next user study was to understand the idea of stepping and how users could select controls.  Participants stepped onto a floor with a honeycomb grid and were asked to state which honeycomb 'buttons' should be depressed based on their foot position.  The third part of the study was to determine how users would perceive a "hotspot" of limited area.  Participants were asked to place their foot onto the crosshairs such that the foot's hotspot was located directly on target.  Each participant had four different conditions on what area of the foot should act as the hotspot: "free choice", the ball of the foot, the tip of the shoe, and the big toe.  The last user study was designed to determine the lower bound on the size of object that a user can interact with.  Participants had to type out a few words using their feet on various keyboard set-ups.
Results
For the first part of the study, the authors found that even though there were a broad variety of techniques employed by the participants, the four most useful strategies for activating a button were tapping, stomping, jumping, and double tapping.  The authors determined that jumping was the most easily recognized method for invoking a menu.  For the second part of the study, they found that 18 of the 20 users felt that the entire area under the shoe, including the arch, should be included in the selection.  Only two users felt that the arch should be excluded.  Results varied when considering the cells around the outline of the shoe.  For the third part of the study the researchers found that there was substantial disagreement between where users perceived the hotspot should be in the free choice condition.  In the final study they found, as expected, that error rate increases with decreasing button sizes.
Contents 
This paper focuses on the research and development of an intuitive interaction with touch-based technology through people's feet.  It explores some of the particular points of interaction, such as how a user thinks they should be able to "point and click" and where the interface has parallels in current touch screen technologies.  It also experiments with how users move and how they perceive their own feet.  
Discussion
I believe that the authors managed to achieve the goals outlined, as far as researching people's interactions with their feet and applying it to touch technology.  I personally found it very fascinating, but I'm afraid it won't be applicable in much except for gaming.  I don't see that this type of interaction provides any sort of gain in productivity or effective communication.  However, I do think that it could be applied to some extremely interesting games and even exercise applications.  Perhaps there would be some market in rehabilitation areas, but it seems like most of the value would come from entertainment.

Thursday, September 22, 2011

Paper Reading #10: Sensing foot gestures from the pocket

References: Sensing foot gestures from the pocket by Jeremy Scott, David Dearman, Koji Yatani, and Khai N. Truong.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.
Author Bios:  Jeremy Scott received his B.Sc., M.Sc., and Ph.D. in Pharmacology & Toxicology from the University of Western Ontario.  Dr. Scott is currently part of the Faculty of Medicine at the University of Toronto.  David Dearman is currently a PhD student at the University of Toronto in the Department of Computer Science.  His research bridges HCI, Ubiquitous Computing, and Mobile Computing.  Koji Yatani is a PhD candidate in the University of Toronto under Professor Khai N. Truong.  He is interested in HCI and ubiquitous computing with emphasis on hardware and sensing technology.  Khai N. Truong is an assistant professor in computer science at the University of Toronto.  He holds a Bachelor of Science degree in Computer Engineering from the School of Electrical and Computer Engineering at the Georgia Institute of Technology. 
Summary
Hypothesis
Foot motion can be used as an effective alternative to traditional hand motions for computer input.
Methods
The authors first wanted to study how people used their feet in selection tasks and find out what motions were easiest and most effective for users.  Participants were asked to perform a target selection task with their foot in three different motions: dorsiflexion, plantar flexion, and heel & toe rotation.   They used six motion capture cameras to log the movements of the foot, and participants were given a wireless mouse that they used to indicate the start and end of a selection and to respond to the experiment software’s prompts. 
Results

Users were able to select targets more quickly when located near the center of the range of motion than when they were on the outer edge of the range of motion.  In all exercises except dorsiflexion, participants tended to overshoot the small angular targets more than the large angular targets.  By contrast, participants tended to significantly undershoot targets in the dorsiflexion trials.  Participants identified heel rotation as the most comfortable.
Contents
The paper begins with an overview of their goal to find out if feet really can be an acceptable and useful alternative to hands, and then they begin discussing the experiment to determine comfortable range of motion.  Once they summarized all of the results mentioned above, they discuss some of the limitations such as distinguishing between deliberate and accidental motions.  In the future they plan to implement a classifier on a mobile device and build a real-time foot gesture recognition system.  They are also interested in examining the performance of foot gesture recognition and the acceptability of these foot gestures in more naturalistic settings.
Discussion
This was an intriguing topic, and I appreciate the out-of-the box thinking.  I feel that they did a good job researching how people use their feet and determining exactly how feasible this physical motion would be.  On the other hand, I think it would be very difficult to market something like this for two reasons:  First, people are already used to using their hands, and it can be extremely difficult for people to accept new ideas.  Second, there is not really anything new to be gained through this method of input, except perhaps for people with special needs.  

Tuesday, September 20, 2011

Book Reading #1: Gang Leader For A Day

I really liked this book.  Although the subject matter was not overwhelmingly interesting to me, it was written well enough to be engaging and interesting.  I feel like I was able to learn a little bit and gain some appreciation for how truly difficult the situation in the projects can be, although I cannot say that it had an overwhelming impact on my outlook.
I know that J.T. is really the only character that he follows from the beginning of the book to the end, so it is unfair to say that I find him the most interesting of the characters.  However, he strikes me as someone who understands that he can change his situation if he is smart, lucky, and knows what he wants.  Even from the beginning he had plans, and they didn't involve being stuck in the slums forever.  I really would have appreciated it if Sudhir had actually made a sort of biography about J.T. because I think he had a really good grasp on life and how the world works.  True, his money handling skills were not exactly great, but he had the brains and drive to try and make a way for himself.
As for the rest of the folks in the projects, they could stand to learn a few things from J.T. about how to get what you want in the world.  Unfortunately for most of them, they don't usually have that kind of maturity and understanding until they are too old and stuck to do anything about it.  Trouble and hardships come early in life in the projects, so it is easy to adopt it as a permanent way of life that cannot be changed.  I suppose the same is true about any sort of culture or lifestyle, and it takes a certain amount of luck early in life to keep from getting completely trapped before you are old enough to make your own way.
On a slightly unrelated note, while I appreciate the whole idea of trying not to change the people you're studying, I also find it slightly amusing that it is well accepted that this principle is impossible to achieve in practice.  Rather than fighting a losing battle, it would probably be more effective to agree to open communication all the time and make it more of a simple, relaxed interaction.  I think it would be more interesting and beneficial if both sides can discuss why they do things the way that they do and, when appropriate, adopt those practices which really do make sense.

Sunday, September 18, 2011

Paper Reading #9: Jogging Over a Distance Between Europe and Australia

References: Jogging Over a Distance Between Europe and Australia by Florian Mueller, Frank Vetere, Martin Gibbs, Darren Edge, Stefan Agamanolis, Jennifer Sheridan.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.
Author Bios:  Florian Mueller is currently researching interactive technology as a Fulbright Visiting Scholar at Stanford University.  He conducted his PhD research in the Interaction Design Group in the Department of Information Systems at the University of Melbourne.  Frank Vetere is a senior lecturer in the Department of Information Systems at the University of Melbourne.  His research interests are in HCI and Interaction Design and he works with colleagues in the Interaction Design Group to investigate the role and use of emerging Information and Communication Technologies.  Martin Gibbs is also a lecturer in the Department of Information Systems at the University of Melbourne. He is currently investigating how people use a variety of interactive technologies (video games, community networks, mobile phones, etc.) for convivial and sociable purposes in a variety of situations.  Darren Edge is currently a researcher in the HCI group for Microsoft Research.  He spent seven years at the University of Cambridge, earning both an undergraduate degree and a PhD.  Stefan Agamanolis works mainly in the area of digital media and communication technologies and is currently Associate Director of the Rebecca D. Considine Research Institute at Akron Children's Hospital.  He holds MS and PhD degrees in Media Arts and Sciences from the Massachusetts Institute of Technology, as well as a Bachelor of Arts degree in computer science from Oberlin College.  Jennifer Sheridan is currently the Senior User Experience Consultant and Director of User Experience at BigDog Interactive.  She holds a PhD in Computer Science from Lancaster University. 
Summary:
Hypothesis
Jogging and other physical activities can be made more enjoyable with the ability to communicate with other people anywhere in the world.
Methods
Participants engaged in jogging runs that lasted from 25 to 45 minutes and communicated with a friend for the duration.  Afterwards participants were interviewed for up to two hours and asked to respond to very open-ended questions about their experience.  There were 17 participants in total, who provided data over a total of 14 runs.
Results
Initial findings indicated that Jogging Over a Distance can facilitate a social experience.  Joggers appreciated the function of volume change to indicate relative position but were also glad that the volume never diminished to a point that actually hindered communication.  Their results also agreed with the idea that social presence can benefit from spatial properties, and the emergence of a virtual spatial environment contributed to the social aspect of the activity.
Contents
In this paper the authors state that jogging can be a good social activity, but that sometimes people who would like to jog and chat are restricted by their physical distance from each other.  The authors note that physical activity can provide an excellent means of bonding and developing relationships, so it would be useful for people to be able to communicate with each other while they are exercising.  It is important to note that the paper focuses more on how to communicate and make the exercise enjoyable, not on how to improve actual physical performance.
Discussion
I think the authors managed to achieve their goal of expanding the current knowledge of social interaction and how it can relate to physical activity.  I found their results to be quite interesting, particularly the area where they discuss how the spatial aspect affects the interaction.  I think the paper is significant because it contributes to the growing (and very important) field of merging our technologies with physical activity.  It is becoming increasingly important to motivate people to get more exercise in daily life, and this kind of research is exactly what we need.  This particular product could even be expanded upon to encompass a more competitive aspect, as well as being applied to other activities.  It would be interesting, for example, to be able to talk to someone when you are skateboarding and practicing tricks.

Wednesday, September 14, 2011

Paper Reading #8: Gesture Search: A tool for fast mobile data access

Reference Information
Gesture Search: A tool for fast mobile data access by Yang Li.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.


Author Bio: Yang Li is currently a Senior Research Scientist working for Google.  He spent time at the University of Washington as a research associate in computer science and engineering.  He holds a PhD in Computer Science from the Chinese Academy of Sciences.


Summary
Hypothesis
Yang Li presents several individual hypotheses to test under the primary goal of demonstrating that Gesture Search is a superior tool for accessing data in a mobile device.  One specific hypothesis is that GUI-oriented touch input should have less variation in its trajectory than gestures. 
Another hypothesis presented was that Gesture Search would provide a quick and less stressful way for users to access mobile data.
Methods
To test the first hypothesis, he collected GUI events on touch screen devices and compared them against gesture data.  Specifically, he asked participants to perform a set of GUI interaction tasks on an Android phone, with the instruction to do it just as they normally would.
For the second hypothesis he made Gesture Search available for download through the company's internal website and asked Android users to test it out and provide feedback after using it for a while.  This was not a controlled laboratory study and users were encouraged to use Gesture Search in whatever capacity was most applicable to their daily life.
Results
From the first study, the results were as expected: GUI touch input had trajectories with much less variation than gestures.  There was one exception to this observation, which occurred for GUI manipulations such as scrolling, flicking, and panning.  These input commands required a larger bounding box than most other touch inputs.
The second study provided data for over 5,000 search sessions, and showed that 84% of searches involved only one gesture, and 98% had two gestures or less.
Contents
Yang Li introduces an application called Gesture Search, which is designed to allow users easier and faster access to elements in their mobile device by reading gestures drawn on the screen.  In particular, he focuses on its application in areas such as searching for a contact or tool.  The application is ideally supposed to recognize gestures that take the form of letters and search within the device for information that matches the gesture.  He noted that there was some ambiguity in distinguishing between the taps and commands associated with preexisting GUI controls, particularly those that involve any sort of dragging motion.  He solves this problem through two techniques: allowing a slight time window to capture whether a tap is actually part of a larger letter, and a test to determine the general area and shape of the gesture. 
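To make those two cues more concrete, here is a rough sketch of how they could work together; this is my own interpretation, not Yang Li's actual code, and the thresholds and function names are invented for illustration.

# A stroke whose bounding box stays tiny is treated as a GUI tap unless,
# within a short time window, another stroke arrives that turns it into
# part of a multi-stroke letter (e.g. the dot of an 'i').
TAP_BOX_PX = 20          # bounding box smaller than this looks like a tap
FOLLOW_UP_WINDOW_S = 0.4 # wait this long for a possible continuation stroke

def bounding_box_size(points):
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return max(xs) - min(xs), max(ys) - min(ys)

def classify_stroke(points, next_stroke_delay=None):
    """Return 'gesture' or 'tap' for a finished stroke.

    next_stroke_delay is the time until the following stroke began, or
    None if no stroke followed within the window.
    """
    w, h = bounding_box_size(points)
    if w > TAP_BOX_PX or h > TAP_BOX_PX:
        return "gesture"                       # large trajectory -> letter
    if next_stroke_delay is not None and next_stroke_delay < FOLLOW_UP_WINDOW_S:
        return "gesture"                       # tiny dot, but part of a letter
    return "tap"                               # hand it to the normal GUI

print(classify_stroke([(10, 10), (12, 11)]))                    # tap
print(classify_stroke([(10, 10), (80, 95), (150, 20)]))         # gesture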
Having developed an application that seemed to pass lab testing, he deployed it to a number of Android users and studied the interactions.  He also collected feedback after a period of time to determine how well it was received.  Overall, user reactions were positive.


Discussion
I would like to start by saying that I don't have a "smart" phone, such as an Android or iPhone, so I am not particularly well-qualified to comment on the potential use and effectiveness of this application.   Having said that, while I think he did a good job developing and presenting his product, I also feel that he is trying to solve a problem that isn't really a problem.  From what I have seen, the user interfaces and controls on the phones are already quite intuitive and easy, and it might be more work for users to remember to use the gesture search in the first place.   There was one point in the study that I didn't like very much, and that was when he stated that user data from people who barely used the application at all was thrown out.  I think it would have been better if he had at least made a point of finding out why they didn't use it.  Perhaps in the future this sort of "gesture searching" technology could partner with the swipe keyboard idea in the previous paper for an entirely new, completely finger-squiggle driven interface.

Paper Reading #7: Performance Optimizations of Virtual Keyboards for Stroke-Based Text Entry on a Touch-Based Tabletop

Reference Information: Performance Optimizations of Virtual Keyboards for Stroke-Based Text Entry on a Touch-Based Tabletop by Jochen Rick.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.
Author Bios:
Jochen Rick is currently a faculty member in the Department of Educational Technology at Saarland University.  He holds a PhD in Computer Science from Georgia Tech and spent 3 years as a research fellow at the Open University working on the ShareIT project.

Summary
Hypothesis
Keyboard layout plays a large role in the effectiveness of shape-writing, or stroke-based text entry.
Methods
A group of participants was asked to stroke through a series of four points.  The author collected data for the time taken, distance, and angle between each point.  He then applied this data to an algorithm to determine the optimum placement for letters in a layout based on frequency and order of use.  
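As a toy illustration of this kind of layout scoring (not the author's actual model), the sketch below estimates the expected stroke time of a layout by weighting the travel time between consecutive letters by how often that letter pair occurs.  The bigram counts, key coordinates, and timing constants are all made up.

# Score a layout: frequent letter pairs placed close together should yield
# a lower expected stroke time than the same letters spread far apart.
import math

def travel_time(p, q, a=0.05, b=0.12):
    """Crude Fitts-style time (seconds) to stroke from key p to key q."""
    dist = math.dist(p, q)
    return a + b * math.log2(dist + 1)

def expected_stroke_time(layout, bigram_counts):
    """layout: letter -> (x, y) key center; bigram_counts: 'ab' -> count."""
    total = sum(bigram_counts.values())
    return sum(count / total * travel_time(layout[pair[0]], layout[pair[1]])
               for pair, count in bigram_counts.items())

compact = {'t': (0, 0), 'h': (1, 0), 'e': (2, 0)}
spread  = {'t': (0, 0), 'h': (5, 3), 'e': (9, 0)}
bigrams = {'th': 120, 'he': 100, 'et': 15}
print(expected_stroke_time(compact, bigrams))   # lower (better) score
print(expected_stroke_time(spread, bigrams))    # higher score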
Results
After applying his mathematical model to a number of existing layouts, he found that wide-set keyboard layouts like Dvorak and Qwerty performed very poorly for stroke-based text entry.  As he pointed out, however, this is to be expected as they were designed with a very different usage in mind.  On the other hand, he presented two optimized layouts, Hexagon OSK and Square OSK, which were markedly faster than their already-existing counterparts.  


Contents
The author begins by stating that there is a need to develop better layouts for stroke-based text entry.  He describes the history of keyboard layouts and goes into detail about the efficiency and reasoning behind some of the more popular ones.  He then conducted a study on how users make use of touch-screen technology when entering text and applied his findings to a mathematical formula to determine an optimum layout for letters in a stroke-based entry system.  The two keyboards he came up with show an improvement in efficiency over all currently existing layouts.
Discussion


I found this article extremely interesting and I am very convinced by his findings.  If I had one disappointment, it is that he did not actually create and test his new key layouts on users.  From what I understand, his calculations of efficiency and effectiveness are based purely on simulations.  I think that this stroke-based text entry system may prove to be a better solution to touch screen text entry than traditional finger-tapping and I am interested to see its progress.

Monday, September 12, 2011

Paper Reading #6: TurKit: human computation algorithms on mechanical turk

Reference Information:
TurKit: human computation algorithms on mechanical turk by Greg Little, Lydia B. Chilton, Max Goldman, Robert C. Miller.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.
Author Bios:
Greg Little attended Arizona State University for a short time before dropping out to work with a video game company.  He eventually went back to ASU and graduated, then was accepted into the PhD program at MIT.  He is currently part of the User Interface Design group under Rob Miller.  Lydia Chilton is currently a computer science graduate student at the University of Washington.  She also attended MIT from 2002 to 2009.  Max Goldman is a graduate student at MIT studying user interfaces and software development.  He also spent time at the Israel Institute of Technology.  Robert C. Miller is an associate professor in the EECS department at MIT and leads the User Interface Design Group.  His research interests include web automation and customization, automated text editing, end-user programming, usable security, and other issues in HCI.
Summary
Hypothesis
The authors believe that their program TurKit, a toolkit for prototyping and exploring algorithmic human computation, can expand on the efficiency and effectiveness of Mechanical Turk.
Methods
The paper describes several examples of possible applications for their toolkit.  The first example described is iterative writing.  Basically, one turker writes a paragraph with a goal, and subsequent turkers try to improve upon the paragraph.  In between iterations the paragraph is subjected to a 'cleaning' to remove parts that are not relevant or useful.   Another iterative task presented as an example was recognizing blurry text.  Over several iterations, new guesses as to the blurred text's meaning are added and changed.  The paper also explores decision theory experimentation using TurKit to simulate human decision making in a random guessing scenario, as well as psychophysics experimentation to have turkers sort and classify various stimuli in an effort to determine salient dimensions among those stimuli.
Results
From the first example, the authors noted that most paragraph improvements involve making the paragraph longer.  Additionally, people tend to keep to the original style and formatting.   In the blurry text recognition, the guesses evolved over several iterations and the final result was nearly perfect.   In the decision theory experimentation TurKit was useful in coordinating the iterative nature of the process, but not necessarily very good at simulating actual human behavior.  And lastly, TurKit has proven useful and effective in the area of psychophysics experimentation, since calls to MTurk were embedded within a larger application.  Overall, the TurKit crash-and-rerun programming model made it easy to write simple scripts, but was far from perfect.  Users were often unclear about certain critical details and shortcomings of TurKit, as well as not knowing about the parallel features.
Contents
The paper introduces us to TurKit, a toolkit that is good for prototyping algorithmic tasks on MTurk.  It offers up several concepts and tools, including the idea of 'crash-and-rerun programming'.  This is a programming model suited to long running processes where local computation is cheap and remote work is costly.  It has the benefits of allowing incremental programming, easy implementation, and retroactive print-line-debugging.  The paper goes on to describe certain highlights of TurKit in detail, such as the TurKit script and the web interface.  It also gives several example applications with real-world feedback and discusses some of the user reactions to the tools.
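To give a feel for the crash-and-rerun idea, here is a minimal sketch as I understand it (TurKit itself is JavaScript; this is my own Python illustration, and the file name and task function are hypothetical): the script is re-run from the top after every crash or code change, but results of costly remote steps are recorded in a journal, so on re-execution they are replayed instead of redone.

# Crash-and-rerun sketch: cheap local code re-executes freely, expensive
# remote steps are memoized in a journal file keyed by execution order.
import json, os

JOURNAL = "journal.json"
_journal = json.load(open(JOURNAL)) if os.path.exists(JOURNAL) else []
_step = 0

def once(label, expensive_fn):
    """Run expensive_fn only the first time this step is ever reached."""
    global _step
    if _step < len(_journal):                 # already done in an earlier run
        result = _journal[_step]["result"]
    else:                                     # first time: do the remote work
        result = expensive_fn()
        _journal.append({"label": label, "result": result})
        json.dump(_journal, open(JOURNAL, "w"))
    _step += 1
    return result

# Hypothetical usage: posting a task to workers is costly, local code is cheap.
def post_fake_task():
    print("posting task to workers...")        # imagine a slow remote call
    return "improved paragraph v1"

text = once("iteration-1", post_fake_task)
print("current text:", text)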
Discussion

I am impressed with the contents of this paper, but I cannot say that I fully understand everything.  It sounds like they have created a very useful tool with a lot of potential to grow and become even better.  However, I personally have never dealt with Mechanical Turk and I think that makes it hard for me to really appreciate what exactly the authors have accomplished.  Based on what I read, it seems like they needed to communicate with the users a little bit better regarding how to use TurKit and where it might trip them up.  There were several people who were unaware of the parallel aspect of the program, and the authors mentioned that people expressed some concern over potential problems and dangers that were not immediately obvious.  

Wednesday, September 7, 2011

Paper Reading #5: A Framework for Robust and Flexible Handling of Inputs with Uncertainty

References:
A Framework for Robust and Flexible Handling of Inputs with Uncertainty by Julia Schwarz, Scott E. Hudson, Jennifer Mankoff, and Andrew D. Wilson.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.
Author Bios:

Julia Schwarz is a PhD student at Carnegie Mellon University studying Human-Computer Interaction.  Her research interests are natural interaction and handling ambiguous input.  Scott Hudson is currently a professor in the HCI Institute at Carnegie Mellon University.  He holds a PhD in Computer Science from the University of Colorado.  Jennifer Mankoff is an Associate Professor in the HCI Institute at Carnegie Mellon University.  She holds a PhD in Computer Science from the Georgia Institute of Technology.  Andy Wilson holds a BA from Cornell University and an MS and PhD from the MIT Media Laboratory.  He currently works as a senior researcher for Microsoft, focusing on applying sensing techniques to enable new styles of HCI. 
Summary:
Hypothesis
This paper did not have a hypothesis per se, but aimed to present a framework for handling input with uncertainty in a systematic, extensible, and easy to manipulate fashion.
Methods
To illustrate the framework, the authors present six demonstrations including tiny buttons manipulated by touch, a textbox that handles spoken input, a scrollbar that responds to inexact input, and buttons designed to be easier for those with motor impairments.  The first three demonstrations focused on the ambiguity in determining what the user intends to interact with.  This was shown through window resizing, ambiguous and remote sliders, and tiny buttons for touch input.  Actual determination of intent was based on probability and the generalized location of the input.  The next two examples focused on how text might be entered into a form through either smart text delivery or speech.  Both examples deliver text to a form with multiple fields and divide the process into three phases.  The first phase, selection, occurs as the text boxes return selection scores based on the match between the incoming characters and each box's own regular expression, which determines what sort of input it is looking for.  The second stage, temporary action, shows the text in gray to indicate that it is not yet finalized.  The third and final stage, finalization, occurs if a user clicks on the textbox to explicitly disambiguate which field should be receiving the text or if the text in the textbox matches some finalization criteria in the regular expression.  The sixth and final example, improved GUI pointing for the motor impaired, aimed to treat the input in a way that handles its true uncertainty well and thereby increase the accuracy of user interactions.  This was done by gathering data on real-world interaction of motor-impaired participants and then simulating how the clicks would be interpreted with the uncertainty compensation of the system.
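The sketch below is a much simplified illustration of the scoring idea described above; it is my own, not the authors' framework.  A touch is modeled as a 2D distribution rather than a single point, each button reports how much of that probability mass falls inside it, and a mediator only finalizes the action when one interactor is a clear winner.  The button sizes, spread, and threshold are invented numbers.

# Probabilistic selection among tiny buttons under an uncertain touch point.
import random

def selection_score(button, touch, sigma=8.0, samples=2000):
    """Fraction of the touch distribution that lands inside the button."""
    x0, y0, w, h = button
    hits = 0
    for _ in range(samples):
        sx = random.gauss(touch[0], sigma)
        sy = random.gauss(touch[1], sigma)
        if x0 <= sx <= x0 + w and y0 <= sy <= y0 + h:
            hits += 1
    return hits / samples

def mediate(buttons, touch, finalize_at=0.6):
    """Pick the most likely target; only finalize when it clearly wins."""
    scores = {name: selection_score(rect, touch) for name, rect in buttons.items()}
    best = max(scores, key=scores.get)
    state = "finalized" if scores[best] >= finalize_at else "tentative"
    return best, scores[best], state

tiny_buttons = {"ok": (0, 0, 12, 12), "cancel": (14, 0, 12, 12)}
print(mediate(tiny_buttons, touch=(13, 6)))   # ambiguous press between buttons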
Results

The first set of demonstrations showed that it was easy to adapt to the flexibility of the interface.  There is ambiguity in determining what the user intends to interact with, but the system helps handle the process of deciding.  The authors say that their framework has the potential to enable entirely new forms of interaction, which can adjust their responses based on how likely they are to be pressed.  The second set of demonstrations showed that the framework was very capable of handling multiple interpretations of alternative events with little to no extra development.  Finally, the last demonstration proved that the framework was capable of handling the input robustly, missing only two cases out of over 400.
Contents

This paper begins by expressing the idea that the conventional infrastructure for handling input lacks the ability to manage uncertainty, and goes on to propose a framework that solves this problem.  The authors describe several experiments designed to prove the robustness and usability of the framework, such as how it can handle touch input over tiny buttons and deal with various text entry modes.  Each of the examples shown proved to be a successful demonstration of the capabilities described.
Discussion:
I feel that the paper presented an interesting idea and, by its own contents, a completely successful resulting product.  I completely agree that there is huge room for development in translating imprecise human input into something that a computer can understand.  However, I was disappointed that this paper made no real note of actual user trials, aside from the data gathered for the motor-impaired GUI.  It is impossible to know if, in fact, they did have normal people test any of the software, but I found its absence in the paper disappointing.  I was also a little bit confused as to why they pursued the extended GUI for the motor-impaired.  While I agree that it is definitely something that would have to be considered, it seemed out of place in the general midst of the rest of the demonstrations.  


Paper Reading #4: Gestalt: integrated support for implementation and analysis in machine learning

References:
Gestalt: integrated support for implementation and analysis in machine learning by Kayur Patel, Naomi Bancroft, Steven M. Drucker, James Fogarty, Andrew J. Ko, James A. Landay.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.
Author Bios: Kayur Patel is currently pursuing a PhD in Computer Science at the University of Washington.  His studies focus on machine learning algorithms where behavior is learned from data rather than specified in code.  Naomi Bancroft recently graduated from the University of Washington with degrees in Computer Science and Linguistics.  She is currently employed by Google in Mountain View, CA.  Steven Drucker is a principal researcher at Microsoft Research focusing on human-computer interaction when dealing with large amounts of information.  He holds a PhD from the Computer Graphics and Animation group at the MIT Media Lab.  James Fogarty is an assistant professor of computer science and engineering at the University of Washington.  He holds a PhD from the Human-Computer Interaction Institute at Carnegie Mellon University.  Andrew Ko is an assistant professor at the Information School at the University of Washington.  The goal of his research is for software evolution to be driven more by human need and less by technological constraints.  James Landay is a professor of computer science and engineering at the University of Washington.  He holds a PhD in Computer Science from Carnegie Mellon University.
Summary:
Hypothesis
The authors hypothesized that a structured representation would be most useful when developers first started a project.  They wished to show that it can significantly improve the ability of developers to find and fix bugs in machine learning systems.
Methods
The participants were asked to use a general purpose development environment to create, edit, and execute scripts.  They were given an API that could be used to reproduce all of Gestalt's visualizations.  The baseline condition and Gestalt used the same data table structure to store data, but in the baseline participants had to write code to connect data, attributes, and classifications.  Participants were asked to build solutions for two problems, sentiment analysis and gesture recognition.  The solutions had five bugs built into them and the participants were supposed to locate and fix as many bugs as possible within an hour.
Results

Participants unanimously preferred using Gestalt to the baseline, and were able to find and fix more bugs using Gestalt.  The authors noted a marginal effect of trial on the number of bugs found and fixed, as participants found more in the second trial.
Discussion:
I found the paper to be very effective at communicating what it set out to demonstrate, and intriguing as well.  A good development environment and effective tools can make a huge difference in the efficiency of writing and debugging code, and I think that it was interesting for the authors to target a more effective process rather than just focusing on the output.  There were limitations in the study, but the authors made a point of listing out some of the biggest ones at the conclusion of the article.  In particular, I noticed that although they state that a structured representation is probably more useful when developers first start a project, their study consisted of putting people in front of code that had already been written.  While not exactly counter-productive to their argument, it did lead a little bit away from the original path.  

Monday, September 5, 2011

Paper Reading #3: Pen + Touch = New Tools

References:  Pen + Touch = New Tools by Ken Hinckley, Koji Yatani, Michel Pahud, Nicole Coddington, Jenny Rodenhouse, Andy Wilson, Hrvoje Benko, and Bill Buxton.  Microsoft Research, One Microsoft Way, Redmond WA 98052.  Presented at UIST 2010 on October 3rd - 6th in New York.
Author Bios:
Ken Hinckley received his PhD at the University of Virginia and is currently a Principal Researcher at Microsoft Research.  The primary focus of his research is to enhance the input vocabulary that one can express using common computational devices and user interfaces.  Koji Yatani is a PhD candidate at the University of Toronto.  His primary research interest lies in HCI and ubiquitous computing, with an emphasis on hardware and sensing technologies.  Michel Pahud holds a PhD in parallel computing from the Swiss Federal Institute of Technology and currently works at Microsoft Research in the area of user experience.  More recently he has been working on innovative distributed collaboration experiences, smartphone exploration, and collaborating on pen and touch experiences with Ken Hinckley.  Nicole Coddington holds a bachelor's degree in visual communication from the University of Florida and spent several years working as a designer for Microsoft.  She is currently employed as the senior interaction designer for HTC.  Jenny Rodenhouse has a bachelor's in Industrial Design from Syracuse University.  She is currently an experience designer in the interactive environment division at Microsoft.  Andy Wilson holds a BA from Cornell University and an MS and PhD from the MIT Media Laboratory.  He currently works as a senior researcher for Microsoft, focusing on applying sensing techniques to enable new styles of HCI.  

Hrvoje Benko is currently a researcher at Adaptive Systems and Interaction at Microsoft Research.  He holds a PhD from Columbia University for work on augmented reality projects that combine immersive experiences with interactive tabletops.  Bill Buxton holds a Bachelor of Music Degree from Queen's University, and it was his work on electronic instruments that led him to the field of HCI.  He currently works as a Principal Researcher at Microsoft Research.
Summary:
Hypothesis
The researchers advocated a "division of labor" between the writing utensil and human touch when operating a digital writing system.  Specifically, they hypothesized that a logical interface modeled after human behavior would make the pen exclusively responsible for writing, make touch responsible for manipulation, and use a combination of both for the generation of new tools.
Methods
The researchers first performed an experiment to observe how people manipulate tools such as paper, pen, and a design surface when creating a craft.  They told participants to put together a paper notebook of pasted clippings and observed how the participants used their hands, the pen, and the clippings in a collection of gestures and efficiency techniques that we don't often think about.  Later in the paper the authors describe how they applied their observations and integrated them into the system to be tried out by testers.  The participants were asked to experiment with the resulting product and provide feedback about its usability.
Results

Users showed a positive reaction to the combined pen and touch interaction.  Although it was noted that not all of the gestures and commands would be obvious without an explanation, they were reasonably intuitive and easy to remember and use.  Also, although the system does not fully solve the problem of incidental touch, such as laying the hand on the page, it does mitigate the effects to some extent.  In addition, participants noted that the concept of an object underlies many of the gestures, and that many of the gestures worked the same as they had when making the notebook.
Contents:
The paper goes into great detail about the interaction between a person's hand and pen when manipulating a design.  It begins by describing some of the basic behaviors noted when users were creating the notebook, such as tucking the pen in their fingers and organizing their workspace around themselves.  It then goes on to explore how these observations might be applied in the product, such as which hand is preferred for certain tasks and how the pen+touch system might support either stationary or mobile usage.  Some of the tools developed in the system were the stapler for grouping items into a stack, the x-acto knife for cutting or tearing items, the carbon copy, the ruler as a straightedge, and others.  Touch was used in various manipulations such as navigating, zooming, and page arrangement.
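As a bare-bones sketch of the "division of labor" idea (my own illustration, not the actual system): pen alone writes, touch alone manipulates the page, and the pen used while the other hand holds an object switches into a combined tool mode.  The event names and tool choice below are hypothetical.

# Route input by modality: pen = ink, touch = manipulation, both = new tool.
def route_input(pen_down, touch_down, holding_object=False):
    if pen_down and touch_down and holding_object:
        return "combined tool (e.g. use held item as a straightedge)"
    if pen_down and not touch_down:
        return "ink: write or draw with the pen"
    if touch_down and not pen_down:
        return "manipulate: pan, zoom, or arrange pages"
    return "idle"

print(route_input(pen_down=True, touch_down=False))
print(route_input(pen_down=False, touch_down=True))
print(route_input(pen_down=True, touch_down=True, holding_object=True))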
Discussion:
I thought the paper was very effective in its description, approach, and execution of the problem.  I was impressed with their tactic of observing how people interact naturally and then applying the observations to a design, and found the level of detail very satisfying.  The paper is interesting because it is very easy to see how this system could be adopted almost seamlessly into daily life.  However, I also feel that this isn't particularly novel and could be seen as reinventing the wheel.  We have reasonably effective tablet PCs already, and the iPad has proven exceptionally popular in the past year or two.  Truthfully, this niche in the market already has competition and there isn't much real need for an entirely new system no matter how well thought-out or cool it may be. 

Sunday, September 4, 2011

Paper Reading #2: Hands-On Math: A page-based multi-touch and pen desktop for technical work and problem solving

References:  Hands-On Math: A page-based multi-touch and pen desktop for technical work and problem solving by Robert Zeleznik, Andrew Bragdon, Ferdi Adeputra, Hsu-Sheng Ko.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.
Author Bios:  

Robert Zeleznik has a Bachelor's and Master's Degree in Computer Science from Brown University.  He is currently a director of research at Brown.  Andrew Bragdon received his Bachelor's and Master's at Brown University and worked two internships with Microsoft.  He is currently a PhD student at Brown.  Ferdi Adeputra has been studying Applied Mathematics-Computer Science at Brown University since 2007, and is currently employed with Goldman Sachs as an analyst.  Hsu-Sheng Ko is currently at Brown University as well.
Summary:
Hypothesis

The developers put forth that an electronic paper-and-pen system could be made into a more useful and efficient tool for notation.
Methods
Participants in the study were asked to complete several tasks to explore the functionality of the writing pad.  They were told to create and manipulate pages and to perform a "back-of-the-envelope" calculation, as well as solve a more complex expression.  In addition, they were asked to explore the graphing capabilities by graphing an equation and manipulating the graph, as well as using the PalmPrint to change modes and draw a diagram in different colors.  Lastly, users were asked to try web clipping and to manipulate the contents of a page with TAP gestures and page folding.  Aside from these specific tasks, participants were encouraged to play with the system and ask for help when necessary.
Results:
While there was strong sentiment expressed that the system would be more useful if it were portable, the overall reviews were positive.  Users were able to effectively manipulate the paper and pen on a basic level, but required further instruction on how to access more advanced features.  There were mixed reviews on the gestures and hand strokes, but most participants were able to get the hang of it after a demonstration.  The participants were enthusiastic about the mathematical capabilities, but also noted that it would be better if it were developed even further.
Contents:
The paper focuses on several key aspects of the design.  It discusses page management, which includes a panning bar to display a small panorama of the workspace and the ability to "fold" a page for more space.  It also discusses the gestures incorporated into the program, such as the under-the-rock menu that grows when dragging a term in a mathematical expression, the touch-activated pen gesture, and the palm-print motion that activates when an open hand is placed on the surface, with commands associated with each fingertip.
Discussion:
I find this whole idea very promising, and I hope that it can really take off in the near future.  I feel that the authors did achieve their goals with this system, although there is certainly a lot of room to grow even more.  I can see where it would have excellent application both in school and the workplace and I share the sentiments of the test participants in that I would like to see the mathematical capabilities expanded further.  Something like this could be invaluable in a learning environment to aid understanding and explanation, but could also be used on a daily basis in an office or other work environment with day-to-day calculations.

Thursday, September 1, 2011

Paper Reading #1: Imaginary Interfaces: Spatial Interaction with Empty Hands and without Visual Feedback

References:  Imaginary Interfaces: Spatial Interaction with Empty Hands and without Visual Feedback by Sean Gustafson, Daniel Bierwirth and Patrick Baudisch.  Published in UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.
Author Bios:
Sean Gustafson is currently a PhD student at the Human-Computer Interaction Lab of the Hasso Plattner Institute in Potsdam, Germany.  He holds both a Bachelor's and Master's degree in Computer Science from the University of Manitoba in Canada.  Daniel Bierwirth currently resides in Berlin, Germany and is the cofounder of Matt Hattling & Company UG and the Agentur Richard GbR.  He received his undergraduate degree in Computer Science and Media from Bauhaus University and his Master's degree in IT-Systems Engineering from the Hasso Plattner Institute in Germany.  Patrick Baudisch is a professor in Computer Science at the Hasso Plattner Institute in Potsdam and the chair of the HCI Lab.  He earned a PhD in Computer Science from Darmstadt University of Technology in Germany.
Summary:
Hypothesis
The authors had three separate hypotheses corresponding to their experiments, but the overarching idea was that effective spatial interaction could occur without the need for a screen or other obvious visual feedback.
The specific hypothesis for the first experiment was that participants would perform fewer Graffiti recognition errors than previously reported by Ni and Baudisch.  They expected participants to build up visuospatial memory by watching their hands and completing the shapes.  The hypothesis for the second experiment had two parts.  First, that participants would be able to fully use their visuospatial memory while in a fixed position, but that body rotation would impair that ability.  Second, that in the rotate conditions, there should be lower error in the hand condition than in the none condition.  And finally, the hypotheses for the third experiment were that pointing accuracy would be highest at the fingertips and that pointing accuracy would decrease as the distance from the nearest fingertip increases.
Methods
For the first study, participants were asked to position their left hand like an "L" to mimic a coordinate origin and to reproduce a series of sketches with their right hand.  For the second study, the participants drew a picture and then pointed out one of the vertices that they had drawn.  And for the last experiment, participants were given target points in (thumb, index) length units and told to locate the position in their coordinate system.
Results
For the first experiment, 94.5% of the gestures in the Graffiti task were successfully recognized from the participants' drawings.  The average error for the repeated drawing task was about 2.2cm for the diamond and 3.25cm for the triangle, and the multi-stroke drawing task showed reasonable consistency in scale but less with stroke alignment and whitespace.
Contents
Between the three noted case studies, the goal was roughly the same.  Each study attempted to measure the participants' ability to map and measure drawing in free space, as well as testing their ability to associate the spatial mapping and detail related to location. 

Discussion:
I believe that the authors did achieve the goals that they set forth as far as determining that humans have some capacity for imaginary spatial imaging.  I was not completely convinced that this approach would ever prove to be a preferable solution, or even particularly applicable, however.  It is true that this paper is designed primarily to be a study of human ability rather than to actively put forth any particular ideas for development, and I think that it covered this very successfully.  However, I feel that the paper itself is interesting and significant only as an exercise in creative thinking.  It brings up some intriguing possibilities as far as how we might interface and interact with our technology in the future.