Thursday, September 29, 2011

Paper Reading #13- Combining multiple depth cameras and projectors for interactions on, above and between surfaces

Title: Combining multiple depth cameras and projectors for interactions on, above and between surfaces
Reference Information:
Andrew Wilson and Hrvoje Benko. "Combining multiple depth cameras and projectors for interactions on, above and between surfaces". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY, USA ©2010. ISBN: 978-1-4503-0271-5.
Author Bios:
Andrew Wilson- Andrew is a senior researcher at Microsoft Research. He earned his bachelor's degree at Cornell University and his master's and Ph.D. at the MIT Media Laboratory.
Hrvoje Benko- Hrvoje is a researcher in the Adaptive Systems and Interaction group at Microsoft Research. His research is on novel surface computing technologies and their impact on human-computer interaction. Prior to working at Microsoft, he received his Ph.D. at Columbia University, working on augmented reality projects that combine immersive experiences with interactive tabletops.
Summary:

  • Hypothesis: "The rich, almost analog feel of a
    dense 3D mesh updated in real time invites an important
    shift in thinking about computer vision: rather than struggling
    to reduce the mesh to high-level abstract primitives,
    many interactions can be achieved by less destructive
    transformations and simulation on the mesh directly."
  • Methods: LightSpace lets users interact with virtual objects anywhere in the room. It supports through-body interactions, between-hands interactions, picking objects up, spatial interactions, and other features using depth analysis and custom algorithms. The hardware is suspended from the ceiling and consists of multiple depth cameras and projectors that track movements and interactions within the "smart room" environment. The cameras and projectors must be calibrated so that they all share the same 3D coordinate system (I put a little code sketch of what that calibration buys you after the summary).
  • Results: Via a user trial and feedback session lasting 3 days, the authors found that six users is currently the maximum the "smart room" can handle while still maintaining accurate interaction tracking. The system also seemed to slow down once more than two users were using it simultaneously. Occasionally, an interaction from one user was never registered because another user was blocking a camera, so LightSpace never picked it up. Once they began to interact with the system, however, users felt that it was very natural and few had problems actually interacting.
  • Content: The authors presented a system that lets users inhabit a "virtual environment" and interact spatially with objects within a certain range. The authors explored different spatial interactions with new depth-tracking techniques, and the approach seemed fruitful.
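To wrap my head around what "sharing the same 3D environment" means, I wrote a tiny sketch of the idea. This is my own simplification, not the authors' code: it assumes each depth camera's intrinsics (focal length, principal point) and pose (rotation R, translation t) are already known from calibration, and it just back-projects one depth pixel into the shared room coordinates.

```python
import numpy as np

def depth_pixel_to_world(u, v, depth_m, fx, fy, cx, cy, R, t):
    """Back-project one depth pixel into the shared 'smart room' frame.

    (u, v)          pixel coordinates in the depth image
    depth_m         depth reading in meters
    fx, fy, cx, cy  camera intrinsics (assumed known from calibration)
    R, t            3x3 rotation and 3-vector translation of this camera
                    in the shared world frame (also from calibration)
    """
    # Pinhole back-projection into the camera's own coordinate frame.
    x_cam = (u - cx) * depth_m / fx
    y_cam = (v - cy) * depth_m / fy
    p_cam = np.array([x_cam, y_cam, depth_m])
    # Rigid transform into the world frame shared by all cameras/projectors.
    return R @ p_cam + t

# Example: a pixel seen by one camera with a made-up pose and intrinsics.
R1 = np.eye(3)
t1 = np.array([0.0, 0.0, 2.5])   # e.g., mounted 2.5 m above the floor origin
print(depth_pixel_to_world(320, 240, 1.2, 575.0, 575.0, 320.0, 240.0, R1, t1))
```

Once every camera reports points in this one frame, meshes from different cameras can be merged and projectors can draw onto the same physical spots, which is the whole trick.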
Discussion:
I think this is really cool to have and play around with. But just like recent papers we have read, I fail to see many real-world applications for this technology so far. Their ideas for depth tracking and the algorithms for the different kinds of interactions were clever. I didn't like how easily users could be blocked from being registered, however. All it took was another user getting in the way of a camera angle to completely negate an action. I believe that the authors did achieve their goals. They set out to create a "smart room"- and that's what they got. If I had the resources I would definitely buy one of these for a room in my house and just go nuts on it like I was working on my PC.

Monday, September 26, 2011

Paper Reading #12- Enabling beyond-surface interactions for interactive surface with an invisible projection

Title: Enabling beyond-surface interactions for interactive surface with an invisible projection
Reference Information:
Li-Wei Chan, Hsiang-Tao Wu, Hui-Shan Ko, Ju-Chun Ko, Home-Ru Lin, Mike Chen, Jane Hsu, and Yi-Ping Hung. "Enabling beyond-surface interactions for interactive surface with an invisible projection". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY, USA ©2010.
Author Bios:
Li-Wei Chan- Li-Wei was a Ph.D. student at National Taiwan University when this paper was published.
Hsiang-Tao Wu- Hsiang-Tao is a researcher at Microsoft Research.
Ju-Chun Ko- Ju-Chun is a Ph.D. candidate in the Graduate Institute of Networking and Multimedia at National Taiwan University.
Mike Chen- Mike is a professor in the Computer Science department at National Taiwan University.
Jane Hsu- Jane Hsu is a professor of Computer Science and Information Engineering at National Taiwan University.
Yi-Ping Hung- Yi-Ping Hung received his B.S. in Electrical Engineering from National Taiwan University and his Master's and Ph.D. degrees from Brown University. He is currently a professor in the Graduate Institute of Networking and Multimedia and in the Department of Computer Science and Information Engineering, both at National Taiwan University.
Summary:
  • Hypothesis: If the authors can successfully create a programmable infrared (IR) projection technique that supports interaction beyond the two-dimensional surface, then a spatially-aware multi-display and multi-touch tabletop system is inherently possible.
  • Methods: The authors use a modified DLP projector, mirrors, an IR projection, and a color projection to implement their system. To handle input from multiple devices, the system draws 4 smaller markers around the projection of the closest device and gives that projection priority, while drawing 4 successively larger markers around the remaining projections so it can capture all of them at once. To handle multi-touch input, the marker patterns become dynamic so the system can resize, sharpen, and recognize every input. The markers are what let a handheld device register its view against the table (I put a rough code sketch of that registration step after the summary). The authors conducted a small user study asking participants to use their map application by finding and investigating buildings.
  • Results: By processing one layer at a time and by simulating the background, the authors are able to retrieve the proper inputs, associations, and recognitions. The authors also built the i-m-flashlight as an exploratory tool, the i-m-lamp as a "desk lamp", and an i-m-camera for 3D views. All of these devices interact with one another and are connected to the table. The results from the user study revealed some flaws in the implementation. For example, if a user wanted to explore an entire building, they couldn't; only the bottom portion of the structure was displayed, and when users tried to "scroll" up to see the top using the i-m-lamp, the views went out of range. Users found that the i-m-flashlight was more efficient for short-term tasks while the i-m-lamp was more efficient for long-term tasks. Moving the i-m-lamp around a lot in a short period of time produced less accurate results. Users quickly found that the different pieces of technology were meant for different kinds of interactions.
  • Content: The authors wanted to create something that would sit on a tabletop and handle multiple inputs on and above the actual surface in three dimensions. Using the prototype they built out of modified projectors, screens, cameras, and mirrors, they were able to do that. The system also uses invisible markers for detecting and distinguishing multiple inputs. Three devices, the i-m-lamp, i-m-camera, and i-m-flashlight, all coordinate with each other to handle 3D imaging, processing, and dynamic inputs. The authors conducted an early user-feedback study and used the results to identify ways to improve their creation for future versions.
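The invisible markers are what let a handheld device figure out where it is pointed on the table, so here is my own rough sketch of that registration step (definitely not the authors' code). It assumes the device camera has detected the four marker corners and that their positions on the tabletop are known; a homography then maps anything in the device's view back onto the table. All the numbers are made up.

```python
import numpy as np

def homography_from_4_points(src, dst):
    """Estimate the 3x3 homography H with dst ~ H * src (homogeneous).

    src, dst: four (x, y) point pairs. Here src would be the marker
    corners detected in the device's camera image and dst their known
    positions on the tabletop (hypothetical numbers below)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of A (smallest singular value).
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def to_table(H, x, y):
    """Map an image point into table coordinates using H."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Hypothetical correspondences: image corners -> table corners (in cm).
src = [(102, 88), (530, 95), (515, 400), (110, 390)]
dst = [(0, 0), (40, 0), (40, 30), (0, 30)]
H = homography_from_4_points(src, dst)
print(to_table(H, 300, 240))   # roughly the middle of the marked region
```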
Discussion:
I liked the diagrams and pictures of this new device. These people were really smart to be able to construct, essentially from scratch, something that can do all of these things and support all of these features. I see definite uses for this technology, particularly in the construction and architecture fields. I think the authors achieved, at least partially, what they set out to do. They created a device, got user feedback, and can now improve on it for future releases. At least they were able to catch problems users had with it before they mass-commercialized it. I think this can be a building block, but it seemed a bit complicated to build. I got a little lost when they began to describe the methods for constructing and implementing the device.

Paper Reading #11- Multitoe

Title: Multitoe
Reference Information:
Thomas Augsten, Konstantin Kaefer, Rene Meusel, Caroline Fetzer, Dorian Kanitz, Thomas Stoff, Torsten Becker, Christian Holz, and Patrick Baudisch. "Multitoe". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY ©2010.
Author Bios:
Thomas Augsten- Thomas Augsten is a master's student in IT systems engineering at the Hasso Plattner Institute in Potsdam.
Konstantin Kaefer- Konstantin develops web applications as a node.js adherent and currently works on mapping software at Development Seed. He is a full-time Master's student studying IT systems engineering at Hasso Plattner Institute in Potsdam, Germany.
Rene Meusel- During a nine-month final project, Rene worked with five other students of the Hasso Plattner Institute on the implementation of an interactive floor named Multitoe.
Caroline Fetzer- Caroline has studied software development at the Hasso Plattner Institute in Potsdam since 2007.
Torsten Becker- He is a graduate student at the Hasso Plattner Institute and holds a B.Sc. in IT Systems Engineering. For his Master's he specialized in human-computer interaction and mobile & embedded systems.
Christian Holz- PhD student with Patrick Baudisch since 2009, Human-Computer Interaction (HCI) group, Hasso Plattner Institute (HPI), University of Potsdam, Germany.
Patrick Baudisch- is a professor in Computer Science at Hasso Plattner Institute in Berlin/Potsdam and chair of the Human Computer Interaction Lab. His research focuses on the miniaturization of mobile devices and touch input.
Summary:
  • Hypothesis: If a reasonably suitable design for a back-projected interactive floor can be constructed, then many technologies can build off of it, because a floor offers advantages over an interactive tabletop.
  • Methods: Since users will be actively walking on the interaction surface, several factors needed to be taken into account for this to be effective, including inadvertent interactions, consistent interfaces, which "parts" of the foot users want to interact with, and precision with the feet. The authors conducted a study in which they let users dictate what the interaction area of their feet should be, and another study concerning the "hot spot" of the foot (I sketched the per-user hotspot idea in code after the summary). They then tested their "customizable" hot spot by studying users interacting with "small", "medium", and "large" keyboards.
  • Results: Looking at the results from the study about what the contact area of the foot should be, the authors found that users' expectations did not match the FTIR contact model; rather, users predominantly preferred a "projective" contact. Results from the second study concerning "hot spots" showed that users' "free choice" varied widely, so a single, global hotspot on the foot would produce a large margin of error when pinpointing interaction. In the third study, about accuracy, the authors found that with the "small" keyboard, error rates were much higher and accurate interaction took about twice as long compared to the "large" keyboard. Users mostly preferred the medium keyboard because of its reachability compared to the large keyboard and its high degree of accuracy.
  • Content: The authors wished to create a technology similar to a tabletop interaction surface but capable of handling more tasks and objects. Through pressure recognition, FTIR, algorithms for detecting inadvertent input, balance, "tapping vs. walking", user studies, and appropriate local and consistent interfaces, the authors created "multitoe".
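The per-user "hot spot" idea from the second study is simple enough to sketch. This is only my illustration of the principle, not the Multitoe implementation: it assumes a calibration step where the user taps a crosshair, stores the offset between the crosshair and the centroid of the foot's contact area, and applies that offset to later contacts.

```python
import numpy as np

def contact_centroid(contact_pixels):
    """Centroid of the foot's contact area (a list of (x, y) floor pixels)."""
    return np.mean(np.array(contact_pixels, dtype=float), axis=0)

def calibrate_hotspot(contact_pixels, crosshair_xy):
    """Per-user hotspot: offset from the contact centroid to the point the
    user says they are aiming at (the calibration crosshair)."""
    return np.array(crosshair_xy, dtype=float) - contact_centroid(contact_pixels)

def cursor_position(contact_pixels, hotspot_offset):
    """Later interactions apply the stored per-user offset."""
    return contact_centroid(contact_pixels) + hotspot_offset

# Hypothetical numbers: the user aims at (105, 52) while their sole
# touches a patch of floor centred at (100, 60).
calib_contact = [(98, 58), (102, 58), (98, 62), (102, 62)]
offset = calibrate_hotspot(calib_contact, (105, 52))
later_contact = [(198, 158), (202, 158), (198, 162), (202, 162)]
print(cursor_position(later_contact, offset))   # -> [205. 152.]
```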
Discussion:
I think the idea for this is really cool. I think it might take some time to learn all the neat things that you can do with this surface because it takes so many things into account, like balance, pressure, and "intent". Once you learn the technology, however, I think this would be really fun to play with. I also like how the "foot database" allows the floor to recognize you and your custom hotspots, etc. As far as applications for this floor go, I believe that anything you can do on a tabletop surface can be easily moved to this multitoe surface. I believe the authors achieved their goals. They weren't looking to create some new device or a breakthrough invention- this was clearly an engineering experiment (is it possible? what is the best design? etc.).

Wednesday, September 21, 2011

Paper Reading #10- Sensing foot gestures from the pocket

Title: Sensing foot gestures from the pocket
Reference Info:
Jeremy Scott, David Dearman, Koji Yatani, and Khai Truong. "Sensing foot gestures from the pocket". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY, USA ©2010. ISBN: 978-1-4503-0271-5.
Author Bios:
Jeremy Scott- Dr. Scott received his B.Sc., M.Sc., and Ph.D. in Pharmacology & Toxicology from the University of Western Ontario.
David Dearman- David is a Ph.D student in the department of Computer Science at the University of Toronto.
Koji Yatani- Koji is a Ph.D. candidate working with Prof. Khai N. Truong at the Dynamic Graphics Project, University of Toronto. His research interests lie in Human-Computer Interaction (HCI) and ubiquitous computing with an emphasis on hardware and sensing technologies.
Khai Truong- Khai is an Associate Professor at the University of Toronto in the department of Computer Science.
Summary:
  • Hypothesis: If a mobile device is located in someone's pocket, then explicit foot movements can serve as eyes-free and hands-free input gestures for interacting with the device with a high degree of accuracy.
  • Methods: The authors conducted an initial study to test users' abilities to lift and rotate their legs to perform foot-based interactions. A second study tested whether a built-in accelerometer could recognize users' gestures using machine learning (I put a toy sketch of that kind of recognizer after the summary). They then conducted a study asking participants to make selections from menu items on the phone with foot-based gestures (four different kinds of rotations/movements). Computer software modeled users' feet and logged their movements.
  • Results: The initial study results showed that users can perform gestures more accurately while lifting their toes than while lifting their heels, and lifting the toes was also more comfortable. The second study achieved gesture recognition with 86% accuracy. Results from the trial in which participants made selections and performed gestures on the phone with their feet revealed that the less the foot had to move or rotate, the faster selections were made. It was noted that, across the board, heel rotation was by far the most comfortable way of using the foot as a gesture creator. However, according to the results, toe rotation was the most efficient way of making selections and performing gestures (the error was smallest in those trials).
  • Contents: The authors wanted to create a device that could be used without demanding the majority of a user's attention, focus, and concentration while multitasking. Therefore, the authors used a phone with a built-in accelerometer to recognize foot-based gestures that users can perform without even receiving visual feedback. This way, users can multitask effectively while still using features on their phone.
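Here is my own toy version of the recognition step, not the authors' actual recognizer (they used machine learning on the accelerometer data, but I don't know their exact features or classifier). The sketch summarizes each accelerometer window with a few simple statistics and classifies it with a nearest-centroid rule over made-up training data.

```python
import numpy as np

def features(window):
    """Summarize a window of 3-axis accelerometer samples (N x 3 array)
    with simple per-axis statistics: mean, standard deviation, energy."""
    w = np.asarray(window, dtype=float)
    return np.concatenate([w.mean(axis=0), w.std(axis=0), (w ** 2).mean(axis=0)])

def train_centroids(labelled_windows):
    """labelled_windows: list of (gesture_name, window). Returns one mean
    feature vector ('centroid') per gesture."""
    by_label = {}
    for label, window in labelled_windows:
        by_label.setdefault(label, []).append(features(window))
    return {label: np.mean(feats, axis=0) for label, feats in by_label.items()}

def classify(window, centroids):
    """Nearest-centroid classification of one accelerometer window."""
    f = features(window)
    return min(centroids, key=lambda label: np.linalg.norm(f - centroids[label]))

# Hypothetical toy data standing in for recorded foot gestures.
rng = np.random.default_rng(0)
toe_lift = [("toe lift", rng.normal([0, 0, 1], 0.1, (50, 3))) for _ in range(5)]
heel_rot = [("heel rotation", rng.normal([1, 0, 0], 0.1, (50, 3))) for _ in range(5)]
centroids = train_centroids(toe_lift + heel_rot)
print(classify(rng.normal([0, 0, 1], 0.1, (50, 3)), centroids))  # -> "toe lift"
```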
Discussion:
I honestly had a hard time reading this article. I think this idea is just too weird and will never become a mainstream feature of technology. There is no visual feedback, and users are always going to prefer touch or buttons to foot (of all things...) input. It is not natural and will still require users to devote a decent level of concentration, which makes their point about multitasking with this feature moot. I would never personally use this device or this technology. I do believe, however, that the authors achieved their goals. They were able to use a foot to draw gestures and interface with a phone with a high degree of accuracy. I'm sure that they are proud of what they did, even if not too many other people care that much.

Monday, September 19, 2011

Paper Reading #9- Jogging over a distance between Europe and Australia

Title: Jogging over a distance between Europe and Australia
Reference Information:
Florian Mueller, Frank Vetere, Martin Gibbs, Darren Edge, Stefan Agamanolis, and Jennifer Sheridan. "Jogging over a distance between Europe and Australia". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY, USA ©2010. ISBN: 978-1-4503-0271-5.
Author Bios:
Frank Vetere- Frank is a Senior Lecturer at the University of Melbourne. He has a Ph.D and is involved in the Interaction Design Group.
Martin Gibbs- Martin is a lecturer at the University of Melbourne. He is conducting Ph.D research in the social aspects of technology.
Darren Edge- Darren is a researcher in the Human-Computer Interaction Group at Microsoft Research Asia, based in Beijing, China. Darren spent seven years at the University of Cambridge, first obtaining his undergraduate degree in Computer Science and Management Studies, before completing his PhD dissertation in the Rainbow Group at the Computer Laboratory, under the supervision of Alan Blackwell.
Stefan Agamanolis- Stefan is currently Associate Director of the Rebecca D. Considine Research Institute at Akron Children's Hospital.
Jennifer Sheridan-Senior User Experience Consultant and Director of User Experience at BigDog Interactive with over 20 years experience in functional prototyping and design for global clients ranging from Nokia and the Technology Strategy Board to the Science Museum London. PhD in Computer Science, MSc in Human-Computer Interaction, and BA in Rhetoric and Professional Writing with a Computer Graphics specialization.
Summary:

  • Hypothesis: "Jogging over a distance" aims to spread out the exertion activity over a broader spectrum than just at the end (to have the activity be engaging between users DURING the activity versus just a post-activity interaction), integrating networks to allow spatially-separated users to exercise together, and to analyze the data for sociological comparison.
  • Methods: For the "Jogging over a distance" design, users wear a headset, a wireless heart rate monitor across the chest, and a small 'fanny pack' containing a minicomputer and a cell phone to store data and transmit audio. The target heart rate entered by each user is essential because the "sound transmission" is driven by how fast the user's heart is beating relative to that target. For example, if one user's heart was beating at 110 BPM and another user's at 150 BPM (made-up numbers), then the second user's "sound" would be heard by the first user as being "ahead" or "in front" of them (I put a toy version of this mapping in code after the summary). To test their system, the authors recruited volunteers for test runs to rate the experience rather than to gauge performance. There were 17 volunteers, all social joggers who had previous relationships with each other.
  • Results: The data supported a claim that a social experience can be made from this technology. As a quote from the paper directly states, "Participants remarked on how the system facilitated a social experience similar to that experienced in co-located jogging: 'It was great, because we jogged together', 'This was almost as good as jogging together', 'I felt like he was there with me…' ". These quotes speak for themselves in terms of the success of the social aspect of the device. Users also commented positively about how they liked using this device because they could continue running at their pace rather than having to slow down to wait for their partner and maintain a light conversation. The authors also found that the device's audio support encouraged users to keep running and not to give up. It also made some users run harder. When their partner's voice was "in front", it made them want to run faster to keep up or pass their partner. This device also maps interactions by effort, not pace. This was attractive to users because people who ran at different paces could now interact with each other throughout the exercise via this device. 
  • Contents: The authors set out to test and show off the system they created. They designed it so users could gauge their heart rate and the effort they put into the activity, and interact with a partner throughout the duration of the run. The authors recruited qualified volunteers to try the system during test runs, and the participants were interviewed afterwards. Along with these interviews, data was collected from the runs by the minicomputers stored in the 'fanny packs'.
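The "effort, not pace" mapping is the clever part, so I sketched a toy version of it. This is my own guess at the idea, not the authors' implementation: effort is the current heart rate as a fraction of that runner's personal target, and the partner's voice gets panned ahead or behind based on the difference. The gain constant is made up.

```python
def effort(current_bpm, target_bpm):
    """Effort as a fraction of the runner's own target heart rate, so runners
    with different fitness levels become comparable."""
    return current_bpm / target_bpm

def partner_audio_position(my_bpm, my_target, partner_bpm, partner_target,
                           gain=5.0):
    """Where the partner's voice should appear, on a -1..+1 scale
    (+1 = clearly ahead, 0 = alongside, -1 = clearly behind).
    'gain' is a made-up scaling constant for this sketch."""
    diff = effort(partner_bpm, partner_target) - effort(my_bpm, my_target)
    return max(-1.0, min(1.0, gain * diff))

# Hypothetical runners: I am at 110 of a 140 target, my partner is at
# 150 of a 160 target, so their voice is rendered ahead of me.
print(partner_audio_position(110, 140, 150, 160))   # ~= 0.76 (ahead)
```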
Discussion:
This device is really cool. I wholeheartedly support technologies like this. These types of devices allow users in very different places to interact in richer ways than online social media like Facebook. This device integrates networking, audio transmission, heart rate comparison, spatial relationships, and data computation all in a single package. I could definitely see this device being a springboard for future technologies. I could even see a commercial release of this type of device after a while (if this was v1.0, then I would imagine a '2.0' to be the release version). I believe the authors were very satisfied with their feedback and with the technological success of their device. In my opinion, this was definitely a success and I am sure the authors look to build on this technology for a future publication.

Monday, September 12, 2011

Paper Reading #8- Gesture search: a tool for fast mobile data access

Title: Gesture search: a tool for fast mobile data access
Reference Information:
Yang Li. "Gesture search: a tool for fast mobile data access." Proceeding UIST '10 Proceedings of the 23nd annual ACM symposium on User interface software and technology ACM. New York, NY ©2010. ISBN: 978-1-4503-0271-5.
Author Bio:
Yang Li- Yang is a Senior Research Scientist at Google. Before joining Google's research team, Yang was a Research Associate in Computer Science & Engineering at the University of Washington and helped found DUB (Design:Use:Build), a cross-campus HCI community. He earned a Ph.D. degree in Computer Science from the Chinese Academy of Sciences, and then did postdoctoral research in EECS at the University of California at Berkeley.
Summary:
  • Hypothesis: "Intuitively, GUI-oriented touch input should have less variation in its trajectory than gestures."
  • Methods: Users can make gestures on the screen of the phone to invoke certain actions. To erase a gesture, the user can swipe from right to left on the bottom of the screen, where a scaled-down version of the drawn gesture is displayed; to erase all current gestures, the user can swipe from left to right. Gesture Search also incorporates "gesture history" into its algorithm. One main issue with Gesture Search is separating casual touch input events from gestures. Yang distinguishes the two by first receiving touch input, considering the possible touch events as well as gesture interpretations, and then deciding what to do. If the movement is determined to be a gesture, the displayed input transitions from translucent yellow (meaning the action was pending while deciding whether the input was a gesture or a usual touch event) to a brighter yellow, overlaying the phone's screen and halting all "behind" actions and events. To learn how to tell touch events apart from gestures, Yang recruited volunteers with Android phones, studied how they interact with their phones, and incorporated the findings into his algorithm.
  • Results: After collecting results, Yang figured out a way to implement the algorithm. For gestures that require both "drawing" and "tapping" (i.e., writing out a "j" gesture), Yang would hold the "tap" input if it came first and allow a buffer time to see if the user was going to "draw" any more; if the user produced no more input during that time, the input was considered a touch event (I put a toy version of this decision logic after the summary). Yang discovered that touch events aimed at the GUI had narrower "areas of touch" and were "more square" than gesture inputs. After querying participants who tested Gesture Search, Yang received feedback on a 1-5 Likert scale (one being the worst, five being the best). Users predominantly rated Gesture Search a 4 on this scale (meaning generally happy and seeing use for the application). Participants also answered with comments such as "did not interfere with normal touch inputs", "I did not see Gesture Search on my home screen when I didn't need it", and "I got to what I wanted a lot faster."
  • Contents: Yang wrote this paper in order to publicize his invention of Gesture Search. It is a tool for users to quickly access mobile phone data, such as applications and contacts, by drawing gestures. Gesture Search seamlessly integrates gesture-based interaction and search for fast mobile data access. It demonstrates a novel way for coupling gestures with standard GUI interaction. Yang studied interactions with touchscreen phones by typical users as well as polled volunteers for feedback on the software. He found there is some room for improvement, but overall users were satisfied with the software.
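The buffer-time trick for multi-stroke characters like "j" is basically a tiny state machine, so here is my own sketch of it (not Yang's code, and the timeout value is made up): an ambiguous tap is held for a short window; if more drawing follows, everything becomes a gesture, otherwise the tap is released to the normal GUI.

```python
AMBIGUITY_TIMEOUT = 0.4   # seconds to wait before committing a lone tap (made up)

class TapOrGesture:
    """Decide whether touch input is a GUI tap or part of a gesture.

    Events are (timestamp, kind) pairs with kind in {"tap", "stroke"}.
    A lone tap is only committed as a GUI event after the timeout passes
    with no further drawing; a following stroke reclassifies it as part
    of a gesture (e.g., the dot plus hook of a handwritten 'j')."""

    def __init__(self):
        self.pending_tap_time = None

    def on_event(self, timestamp, kind):
        decisions = []
        # A pending tap that timed out was a plain GUI touch after all.
        if self.pending_tap_time is not None and \
                timestamp - self.pending_tap_time > AMBIGUITY_TIMEOUT:
            decisions.append("deliver tap to GUI")
            self.pending_tap_time = None
        if kind == "tap":
            self.pending_tap_time = timestamp   # hold it, could be a gesture dot
        elif kind == "stroke":
            if self.pending_tap_time is not None:
                decisions.append("tap + stroke -> gesture")
                self.pending_tap_time = None
            else:
                decisions.append("stroke -> gesture")
        return decisions

recognizer = TapOrGesture()
print(recognizer.on_event(0.00, "tap"))      # [] (held, still ambiguous)
print(recognizer.on_event(0.15, "stroke"))   # ['tap + stroke -> gesture']
print(recognizer.on_event(2.00, "tap"))      # [] (held again)
print(recognizer.on_event(3.00, "stroke"))   # ['deliver tap to GUI', 'stroke -> gesture']
```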
Discussion:
I thought this was a weird idea, actually. I don't think I would personally use something like this unless I had over 100 applications and needed to be able to access any one of them relatively quickly. The idea is clever though, and the algorithms and designs for implementing it were really creative and well thought-out. This kind of imagination is definitely a springboard for other researchers. The idea could also carry over into other fields and onto other devices such as PCs (if they ever got touch screen desktop PCs... just saying). For example, if you wanted to load up a certain .exe from a directory in your system and didn't have a shortcut on your desktop (or the opposite- you wanted to load up an .exe but had a TON of shortcuts on your desktop and didn't want to spend minutes looking for it), you could just gesture and the .exe would load up. I thought Yang definitely achieved his goals. He created what he set out to and got overwhelmingly positive feedback in return. Some users felt there was room for improvement, but those issues (as explained in the paper) could be easily fixed.

Paper Reading #7- Performance optimizations of virtual keyboards for stroke-based text entry on a touch-based tabletop

Title: Performance optimizations of virtual keyboards for stroke-based text entry on a touch-based tabletop.
Reference Information:
Jochen Rick. "Performance optimizations of virtual keyboards for stroke-based text entry on a touch-based tabletop." UIST '10 Proceedings of the 23nd annual ACM symposium on User interface software and technology ACM. New York, NY, USA. ©2010 ISBN: 978-1-4503-0271-5.
Author Bio:
Jochen Rick- Jochen ("Jeff") is a junior faculty member in the new Department of Educational Technology at Saarland University. His primary research interest is how new media can better support collaborative learning. From 2007-10, he was a research fellow at the Open University, working on the ShareIT project. His primary role on the project was as technologist, designing and implementing novel pervasive computing applications. In May 2007, he received a PhD in Computer Science from the Georgia Institute of Technology.
Summary:
  • Hypothesis: "While the stroke-based technique seems promising by itself, significant additional gains can be made by using a more-suitable keyboard layout."
  • Methods: In order to find a keyboard layout that is more efficient and also comfortable to use on a tabletop, Jochen investigated strokes, natural movement, comfort, and how to arrange the keys. Jochen recruited volunteers to participate in a study of natural stroking movements (I put a toy example of how a candidate layout can be scored after the summary).
  • Results: Jochen was able to determine natural arm movements and which movements or changes in stroke direction take the longest, and found that both right-handed and left-handed users could use the software with ease. But Jochen quickly discovered that finding the optimal key layout based on a "good" lexicon of "modern" language would take a very long time to compute, so he did not produce results from that.
  • Contents: In this paper, Jochen set out to compare traditional keyboard layouts and implementations with a newly-proposed stroke-style virtual keyboard for tabletops. After providing a brief history of the motivations and implementations of different kinds and styles of keyboards, Jochen is not satisfied with any proposed so far for an interactive, tabletop environment, for reasons including space conservation and user comfort.
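To see why finding an optimal layout takes so long, I wrote a toy scoring function. This is not Jochen's model (his is based on empirically measured stroke times); mine just weights the distance between consecutive keys by made-up bigram frequencies, and the key positions are made up too.

```python
import math

# Hypothetical key centres for a few letters of two toy layouts (arbitrary units).
QWERTY_LIKE = {"t": (4, 0), "h": (5, 1), "e": (2, 0), "a": (0, 1), "n": (5, 2)}
ALTERNATIVE = {"t": (2, 1), "h": (3, 1), "e": (2, 0), "a": (1, 1), "n": (3, 0)}

# Hypothetical bigram frequencies (a real evaluation would use a large corpus).
BIGRAMS = {("t", "h"): 0.30, ("h", "e"): 0.25, ("a", "n"): 0.20,
           ("e", "a"): 0.15, ("n", "t"): 0.10}

def expected_travel(layout, bigrams):
    """Frequency-weighted distance the finger travels between key pairs.
    Lower is better; a real model would convert distance into stroke time."""
    return sum(freq * math.dist(layout[a], layout[b])
               for (a, b), freq in bigrams.items())

print("qwerty-like layout:", round(expected_travel(QWERTY_LIKE, BIGRAMS), 3))
print("alternative layout:", round(expected_travel(ALTERNATIVE, BIGRAMS), 3))

# The search space is the real problem: 26 keys already admit 26! possible
# assignments, so exhaustively scoring every layout is hopeless.
print("possible layouts for 26 keys:", math.factorial(26))
```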
Discussion:
I believe that Jochen did an excellent job of highlighting the strengths and weaknesses of each kind of keyboard that has been proposed. Even Jochen's proposed virtual keyboard has limitations (see the results above). I never knew that there were that many different kinds of physical and virtual keyboard layouts in widespread use. I spent a decent amount of time looking at them, trying to figure out how I would get used to them and why each one of those designs was created originally; I felt that was interesting. I am not sure Jochen reached his goals, however. I think Jochen really wanted to come up with a good, efficient design for a keyboard on an interactive tabletop, but simply wasn't able to create the ultimate design he had envisioned. I feel like other researchers could definitely take away some really good ideas and points from his work, but I'm not sure if someone will try to implement a keyboard the way he envisioned.

Paper Reading #6- TurKit: human computation algorithms on mechanical turk

Title: TurKit: human computation algorithms on mechanical turk
Reference Information:
Greg Little, Lydia Chilton, Max Goldman, and Robert Miller. "TurKit: human computation algorithms on mechanical turk". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY, USA ©2010. ISBN: 978-1-4503-0271-5.
Author Bios:
Greg Little- Greg was a Ph.D. student in the User Interface Design Group at MIT CSAIL, advised by Rob Miller.
Lydia Chilton- Lydia is a CS graduate student at the University of Washington. For the 2010 academic year, she was an intern at Microsoft Research Asia in Beijing. From 2002 to 2009, she was at MIT, where she earned degrees in Economics ('06) and EECS ('07) and an EECS M.Eng ('09) advised by Rob Miller.
Max Goldman- Max is involved with the Massachusetts Institute of Technology (MIT), the Department of Electrical Engineering and Computer Science (EECS), the Computer Science and Artificial Intelligence Laboratory (CSAIL), and the User Interface Design (UID) group.
Robert Miller- Rob is an associate professor in the EECS department at MIT, and leader of the User Interface Design Group in the Computer Science and Artificial Intelligence Lab.
Summary:
  • Hypothesis: "As we explore the growing potential of human computation, new algorithms and workflows will need to be created and tested. TurKit makes it easy to prototype new human computation algorithms."
  • Methods: The authors considered a set of 20 TurKit experiments run over the past year, including iterative writing, blurry text recognition, website clustering, brainstorming, and photo sorting. For iterative writing, a turker writes one paragraph toward a specific goal. The process then shows this paragraph to someone else, asking them to improve upon it, and can also have people vote on which paragraph is better between iterations (i.e., whether the "improved" paragraph should be kept or not). For blurry text recognition, the process recognizes text even when it is nearly unreadable: one turker makes an initial guess at the words, and successive turkers improve on deciphering them based on context, unique recognition, and previous guesses. For the other three areas of application (brainstorming, website clustering, and image sorting), a voting algorithm is used to ask users which result is "best".
  • Results: The authors actually found TurKit to be more successful than they expected. By this, I mean they found several unintended side benefits, including "between-crash modification", implementation friendliness, and "retroactive print-line debugging". Some issues with TurKit that the authors and some sample users found, however, include knowing when to wrap a function in a "once" call, knowing which parts of the process could be modified, and knowing how to use TurKit's parallel features properly (I put a toy illustration of the "once" / crash-and-rerun idea after the summary). The authors found that the "crash and rerun" implementation sacrifices efficiency for a gain in programming usability. They also found that they can easily run out of space because every recorded step is re-executed each time the program is "crashed" and rerun. Overall, they found the first launch of TurKit to be successful.
  • Contents: In this paper, the authors sought to show off their creation (which they named TurKit), which enables new human computation possibilities at a reasonable cost and efficiency. They tested it in practical situations that may arise during common use of the toolkit, like blurry text recognition, voting, polling, and sorting. After each experiment was conducted, results were gathered. The whole point of this creation was to allow greater flexibility with human computation methods.
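The "crash and rerun" model with once() is the heart of TurKit, so it deserves a sketch. TurKit itself is JavaScript running against Mechanical Turk; the snippet below is only my Python illustration of the idea, with a stand-in function instead of a real HIT: each side-effecting step is wrapped in once(), its result is journaled to disk, and a rerun after a crash replays the journal instead of repeating (and re-paying for) the work.

```python
import json
import os

JOURNAL_PATH = "journal.json"   # hypothetical location for the recorded results

def _load_journal():
    if os.path.exists(JOURNAL_PATH):
        with open(JOURNAL_PATH) as f:
            return json.load(f)
    return []

_journal = _load_journal()
_step = 0

def once(fn, *args):
    """Run fn(*args) exactly once across all reruns of the script.

    Steps are identified by their execution order; on a rerun, previously
    journaled results are replayed instead of calling fn again, so an
    expensive or paid step (like posting a HIT) is never repeated."""
    global _step
    if _step < len(_journal):
        result = _journal[_step]          # replay a step we already did
    else:
        result = fn(*args)                # first time: actually do the work
        _journal.append(result)
        with open(JOURNAL_PATH, "w") as f:
            json.dump(_journal, f)        # persist so a crash loses nothing
    _step += 1
    return result

def ask_worker_to_improve(paragraph):
    """Stand-in for posting a task to Mechanical Turk (hypothetical)."""
    print("posting task for:", paragraph[:30], "...")
    return paragraph + " [improved]"

# Iterative writing: each improvement happens once, even if the script
# "crashes" and is rerun from the top.
text = "A short paragraph describing the goal."
for _ in range(3):
    text = once(ask_worker_to_improve, text)
print(text)
```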
Discussion:
I thought this paper was really interesting. What interested me the most was their ability to poll humans and sort information based essentially on feedback, while being able to disregard how long it takes to receive that feedback because of the "crash and rerun" style of implementation. I believe the authors achieved the goals they initially set forth. The program seems to work wonderfully in the areas that were tested (namely blurry text, voting, polling, and iterative writing). This kind of software allows easy collaboration on files, writing, code, etc., between members of a team that might not all be in the same place at once. For example, if you wanted to submit an idea to a website for some product, you could send it through TurKit and have other members of the team vote on it, and once everyone finally agrees on an idea/wording/etc., it can be posted. This also allows great "general public" use; an example would be putting a poll on a site asking "which of these is your favorite?" There is definitely room to improve on this software, as the authors have noted. The springboard possibilities for this kind of algorithm look really good.

Wednesday, September 7, 2011

Paper Reading #5- A framework for robust and flexible handling of inputs with uncertainty

Title: A framework for robust and flexible handling of inputs with uncertainty
Reference Information:
Julia Schwarz, Scott Hudson, Jennifer Mankoff, and Andrew D. Wilson. "A framework for robust and flexible handling of inputs with uncertainty". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY ©2010. ISBN: 978-1-4503-0271-5.
Author Bios:
Julia Schwarz is a PhD student at Carnegie Mellon University studying Human-Computer Interaction. She and Andrew Wilson work at Microsoft.
Scott Hudson is a Professor in the Human-Computer Interaction Institute within the School of Computer Science at Carnegie Mellon University. He earned his Ph.D. in Computer Science at the University of Colorado in 1986.
Jennifer Mankoff  is an Associate Professor in the Human Computer Interaction Institute at Carnegie Mellon University, where she joined the faculty in 2004. She earned her B.A. at Oberlin College and her Ph.D. in Computer Science at the Georgia Institute of Technology advised by Gregory Abowd and Scott Hudson.
Andrew D. Wilson is a senior researcher at Microsoft Research. He obtained his BA at Cornell University, and MS and PhD at the MIT Media Laboratory.
Summary:
  • Hypothesis: The framework the authors present is a solution to the all-too-common problem of "uncertain input" being fed into different UIs: a systematic, extensible, and easily manipulated way to give users a pleasant experience even when input cannot be read with 100% accuracy all of the time.
  • Methods: There are many differences between conventional input and uncertain input. With uncertain input, a PMF (probability mass function) is fed into the algorithm to try to determine what the user was attempting to do (taking into account that where they touch the screen may overlap with some other touch-sensitive area, causing a clash of actions). The framework needs to handle four different stages: modeling, dispatch, interpretation, and action. For event modeling, they implement the PMF over event types as a collection of separate event objects, one for each type, each with an associated probability; the collection of alternative events and their probabilities then serves as the PMF over possible event types. For dispatching, each interactor's state and the event type are considered when passing an event to the dispatcher, and scores are returned for each interactor (which can be thought of like a PMF) indicating whether it is a possible candidate for that input (I sketched this dispatch-and-mediate step in code after the summary). For interpretation, the interactor must determine what action to take based on its internal state, the nature of the event, and the possible actions it can take. As far as the action is concerned, a mediator decides what to do based on priority, machine state, and the information in the input (if there is even enough to process to take an action).
  • Results: Using six areas of focus, the authors were able to handle uncertain input well enough to claim that the right event or action was taken for each way of feeding "uncertainty" into the system. For instance, they have a way to have text show up in a textbox even when the user did not have the textbox selected before typing (i.e., when you bring up Google and start typing before the page fully loads- the cursor is not in the textbox, but you might start typing anyway). The other areas in which the authors were able to excel include speech recognition, improved pointing for the motor impaired, buttons for touch input, smart window resizing, and sliders on the page.
  • Contents: In this paper, the authors sought to show off their work on handling input that might not be clear or easy to deal with. They tested it in practical situations that may arise during common use of devices like an iPad, speech recognition software, etc. After each experiment was conducted, results were gathered. The whole point of this creation was to improve the interpretation of input so that users would not become frustrated with their experiences using smart devices.
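The dispatch step is the easiest part to sketch. This is my own toy version, not the authors' framework: a touch is modeled as a small Gaussian PMF over candidate locations, each button scores the probability mass that lands inside its bounds, and a simple mediator only fires when one interpretation clearly dominates. The UI layout, sigma, and threshold are all made up.

```python
import numpy as np

def touch_pmf(cx, cy, sigma=8.0, radius=20, step=4):
    """Discrete PMF over candidate touch locations around the reported point
    (cx, cy), assuming Gaussian uncertainty (my assumption for this sketch)."""
    pts, probs = [], []
    for x in np.arange(cx - radius, cx + radius + 1, step):
        for y in np.arange(cy - radius, cy + radius + 1, step):
            pts.append((x, y))
            probs.append(np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2)))
    probs = np.array(probs)
    return pts, probs / probs.sum()

def score_interactors(pts, probs, buttons):
    """Probability mass each button receives from the uncertain touch."""
    scores = {}
    for name, (x0, y0, x1, y1) in buttons.items():
        inside = [p for (x, y), p in zip(pts, probs) if x0 <= x <= x1 and y0 <= y <= y1]
        scores[name] = float(sum(inside))
    return scores

def mediate(scores, threshold=0.6):
    """Fire only when one interpretation clearly dominates; otherwise defer."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

buttons = {"Save": (0, 0, 50, 30), "Delete": (55, 0, 105, 30)}   # hypothetical UI
pts, probs = touch_pmf(48, 15)        # a touch near the edge of both buttons
scores = score_interactors(pts, probs, buttons)
print(scores, "->", mediate(scores))  # likely ambiguous, so the mediator defers
```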
Discussion:
After reading through the results they got from each of the six fields of improvement, I am convinced that the authors achieved their goals. Their use of a PMF (or an abstract representation of one) was very well done, and I believe that this could be implemented in smart devices to improve users' experiences. For example, iPhone users' fingers are not all the same size. Some users might become frustrated when trying to type on the on-screen keyboard because they have bigger fingers. The error correction and interpreter in this framework would either allow the user to choose which letter they meant, or disregard the letter and let them try again (which would be better than inputting a typo and not realizing it until a while later in the text, and having to go back and fix it). There are definitely uses this could be incorporated into. This could even be a springboard for further advancements in handling "uncertain" inputs.

Paper Reading #4- Gestalt: integrated support for implementation and analysis in machine learning

Title: Gestalt: integrated support for implementation and analysis in machine learning.
Reference Information:
Kayur Patel, Naomi Bancroft, Steven M. Drucker, James Fogarty, Andrew J. Ko, and James Landay. "Gestalt: integrated support for implementation and analysis in machine learning". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY, USA ©2010. ISBN: 978-1-4503-0271-5.
Author Bios:
Kayur Patel is a Ph.D student in Computer Science at the University of Washington and part of the DuB group, advised by James Fogarty and James Landay. His work has been funded at points by either Microsoft or the government.
Naomi Bancroft is an undergraduate student at the University of Washington studying Computer Science. After she graduates she will go work for Google.
Steven M. Drucker is an affiliate professor at the University of Washington. He is also a Principal Researcher and manager of the VUE group in VIBE at Microsoft Research.
James Fogarty is an assistant professor in the Computer Science department at the University of Washington. He is also involved in the DuB group.
Andrew J. Ko is an assistant professor in the Information School at the University of Washington. He has traveled around the country giving special lectures on HCI topics. He is also a member of the DuB group.
James Landay is a professor in the Computer Science department at the University of Washington. He is a founder of the dub group. He was previously the director of Intel labs in Seattle.
Summary:
  • Hypothesis: A development environment for machine learning that is NOT domain-specific (aka a general purpose environment) is entirely feasible to create and implement.
  • Methods: To prove their hypothesis, the authors set out to achieve two main goals: implementing a classification pipeline and analyzing the data as it moves through that pipeline. They tested their implementation of these goals by applying it to two problems- sentiment analysis (categorizing text) and gesture recognition (I put a minimal example of such a pipeline after the summary). The authors saw that in order to provide a general-purpose supporting environment, they needed to explicitly expose many specific steps and structures to the user. Gestalt uses relational tables addressing the entire pipeline to manage all of its general-purpose data; because a single relational table is used, data does not need to be converted and users don't have to switch between tools for editing. To support many different kinds of data, Gestalt uses aggregated visualizations along the pipeline, so that the data is all connected. The authors recruited 8 volunteers who had taken at least one Python class and a machine learning course to test Gestalt against a "baseline" tool that served a similar purpose. The study measured whether the students could identify and fix injected bugs faster and more efficiently with Gestalt or with the baseline.
  • Results: Participants unanimously preferred Gestalt and were able to find and fix more bugs using Gestalt than using the baseline. After analyzing the study data, the authors found that users spent more time analyzing than implementing in Gestalt and vice versa in the baseline (which is preferable). The students were also able to analyze the data with many more views than with the baseline.
  • Contents: In this paper, the authors sought to show off their creation, Gestalt. They tested it in practical situations with users other than themselves by recruiting volunteers with relevant experience who were competent enough to use Gestalt, bringing them in to take part in an experiment measuring the success of the tool. After each experiment was conducted, results were gathered. The whole point of this creation was to support machine learning work in a general, domain-independent way.
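Since I kept getting lost in the pipeline talk, here is a minimal example of what a "classification pipeline" plus the analysis step looks like. This is scikit-learn in Python, not Gestalt itself (Gestalt has its own tables and visualizations); the tiny sentiment data set is made up, and the point is just that you inspect individual misclassified examples instead of only an aggregate score.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Tiny hypothetical sentiment data set (a real study would use movie reviews).
texts = ["great movie", "loved it", "wonderful acting", "terrible plot",
         "hated it", "awful pacing", "great fun", "terrible acting"]
labels = ["pos", "pos", "pos", "neg", "neg", "neg", "pos", "neg"]

# The "pipeline": featurize the text, then classify it.
pipeline = Pipeline([
    ("features", CountVectorizer()),
    ("classifier", LogisticRegression()),
])
pipeline.fit(texts, labels)

# The "analysis" half: look at individual examples, especially mistakes,
# instead of only an aggregate accuracy number.
test_texts = ["great plot", "awful movie", "loved the acting"]
test_labels = ["pos", "neg", "pos"]
for text, truth, guess in zip(test_texts, test_labels, pipeline.predict(test_texts)):
    marker = "OK  " if truth == guess else "MISS"
    print(f"{marker} {text!r}: predicted {guess}, actual {truth}")
```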
Discussion:
I'll be honest, this paper was sort of confusing. I'm not competent enough in machine learning or pipelining to be able to follow some of the arguments or descriptions that the authors were making. Their creation was definitely successful and the goals of the project were achieved; this is evident from the feedback from the users. I definitely admire these kinds of projects and these kinds of people and their intelligence. As far as the future goes for this sort of technology, I can see improvements being made on Gestalt to support an even greater degree of machine learning. I am convinced by their work not because I followed it 100%, but because of the outstanding satisfaction that the users had with Gestalt.

Monday, September 5, 2011

Paper Reading #3- Pen + touch = new tools

Title: Pen + touch = new tools
Reference Information:
Ken Hinckley, Koji Yatani, Michel Pahud, Nicole Coddington, Jenny Rodenhouse, Andy Wilson, Hrvoje Benko, and Bill Buxton. "Pen + touch = new tools". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY, USA ©2010. ISBN: 978-1-4503-0271-5.
Author Bios:
Dr. Ken Hinckley is a Principal Researcher at Microsoft. He has been a part of many publications of relevance in the past. His Ph.D work involved developing a props-based interface for neurosurgeons.
Koji Yatani is a Ph.D candidate at the University of Toronto. He will finish it up this year and join Microsoft in the Human-Computer Interaction Research Group in Beijing.
Michel Pahud got a Ph.D in parallel computing from the Swiss Federal Institute of Technology. He is now with Microsoft and has been involved in several groups and group projects.
Nicole Coddington is a Senior Interaction Designer with STC. She used to be with Microsoft. She graduated from Florida State University with a Bachelor's degree in Visual Communications.
Jenny Rodenhouse is currently at Microsoft. She has worked with the XBox inside of the Interactive Entertainment Division. She has attended both the University of Wisconsin and Syracuse University.
Andy Wilson is a Senior Researcher at Microsoft. He earned his Bachelor's at Cornell University, as well as his Master's and Ph.D from MIT.
Hrvoje Benko works in the Microsoft Research group. He received his Ph.D from Columbia University.
Bill Buxton is a Principal Researcher at Microsoft. He earned his Master's in Computer Science from the University of Toronto and is still a staff member there as well.
Summary:
  • Hypothesis: It is possible to efficiently and comfortably satisfy both pen and touch technologies in a single device.
  • Methods: One way the authors studied how users interact with a physical piece of paper using a pen and manipulation tools (i.e., scissors) was by recording their interactions with a physical notebook. The authors told the 8 volunteers (all right-handed) to create a hypothetical short film and to be creative. The sessions were taped and the artifacts physically recorded to find patterns of interaction, and these results guided the implementation of the device's interactions. After the initial design was carefully thought through, another group of volunteers was chosen to test the smoothness and naturalness of the device.
  • Results: The authors found that the test users began to form habits around patterns of the system. Users commented frequently on how natural using the device felt compared to using a physical notebook or pen and paper, which the authors attributed to the careful study and planning of the gestures before implementation. Only for the more complex features of the device did the users need guidance. By this, I (and the authors) mean that the combined pen+touch features were not obvious; it was not "natural" to know everything the device was capable of, so the users needed some instruction about those features.
  • Contents: In this paper, the authors and creators sought to show off their work. Before designing the device, they ran case studies of typical users with physical notebooks to note interactions and gestures. After designing the device, they tested it in practical situations with typical users to gather reactions and feedback from the users' trials with the product. The whole point of this experiment was to see whether pen-and-paper techniques and digital technologies and functions could be merged into a single device (since each technology has its own advantages and disadvantages).
Discussion:
I think that the idea of pen+touch yielding new tools is great. These guys (and girls) really nailed the "natural" ways of using a pen and of touch, combined into a single device. This device is really similar to the one described in Paper Reading #2. Both devices seem to work (relatively) flawlessly from what the experimental results showed. In this case, the users seemed very happy and satisfied with using the device, and the "learning curve" seemed to be short among the users. I believe that the authors definitely achieved their goals with this work. I believe that this device was superior to the "Hands-On Math" device in that here the authors took the users into account more so than the authors and creators of the "Hands-On Math" device did. This kind of device could definitely become mainstream soon; it could also serve as a springboard for other devices or technologies.

Paper Reading #2- Hands-on math: a page-based multi-touch and pen desktop for technical work and problem solving

Title: Hands-on math: a page-based multi-touch and pen desktop for technical work and problem solving
Reference Information:
Ferdi Adeputra, Andrew Bragdon, Hsu-Sheng Ko, and Robert Zeleznik. "Hands-on math: a page-based multi-touch and pen desktop for technical work and problem solving". UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology. ACM New York, NY, USA ©2010. ISBN: 978-1-4503-0271-5.
Author Bios:
Ferdi Adeputra currently is employed by Brown University within the Computer Science department. He is a teaching assistant for Professor John Hughes in a course entitled "Interactive Graphics".
Andrew Bragdon is currently a second-year Ph.D student in Computer Science at Brown University. Andrew has worked at Microsoft, traveled around giving special lectures, and published multiple works in the field of Human-Computer Interactions within Computer Science.
Hsu-Sheng Ko is also currently employed by Brown University within the Computer Science department as a teaching assistant.
Robert Zeleznik received his Master's Degree in Computer Science from Brown University. He is currently a Director for research in the computer graphics group at Brown University. Robert has done a number of significant things in his past, including co-founding a company where he served as CTO and software architect.
Summary:
  • Hypothesis: "...if CAS tools were driven by a direct, multi-touch manipulation and digital ink within a free-form note-taking environment, students and even scientists and engineers might learn and work more effectively."
  • Methods: The authors tested different techniques on their product to see which excelled and which did not. They selected 9 undergraduate students from Brown University to test the product in a laboratory environment, asking the volunteers to perform a series of actions with the "Hands-On Math" device and to rate the product at the conclusion of the experiment. These actions included creating and manipulating pages, performing calculations, solving complex math equations using multi-step derivations, graphing, using the software to change "modes" (say from ink to selection), web clipping, and manipulating the contents of a page with the implemented gestures.
  • Results: Without even being told to, the volunteers began writing and manipulating pages. It was often noted that the volunteers "played" with the device and its features for longer than necessary to perform individual tasks. Not to the authors' surprise, none of the volunteers discovered all of the capabilities (such as page creation or deletion) without instruction. Some of the features (for example, page clipping) were seen as unnatural and unnecessary by the volunteers. Users really enjoyed having the "step-by-step" process for math-related work rather than just inputting a question and getting an answer spat out at them; this way, users can see the steps taken to reach an answer.
  • Contents: In this paper, the four authors sought to show off their creation, which they called "Hands-On Math". They tested it in practical situations with typical users other than themselves by recruiting volunteers from the university to come in and take part in an experiment measuring the success of their creation. After each experiment was conducted, results were gathered. The whole point of this was to see whether the technologies of pen and paper and of a CAS (computer algebra system) could be merged and implemented as one (since each technology has its own advantages and disadvantages).
Discussion:
I believe that this creation definitely has a future, and this paper was somewhat interesting. I would tend to agree with a lot of the volunteers that some gestures (such as manipulating data on a page) do not need a bi-manual action. The implementation of this device could use some work, in my opinion, but the device itself is a great aid to mathematics and to the efficiency of working problems out. These guys seem to have thought of a smart and simple way to change from a "stylus" to a "selector", so to speak, and to also build error compensation into the way they implemented the system. One example of this is the page deletion technique: to delete a page you must drag the page off the screen, and then back on into the trash can, rather than just off the screen. After all, you might just want the page out of the way instead of deleted. I also like the palm rejection technique, because people like me rest a hand on the page we're working on, and I don't want the program to think I'm trying to do something that I'm not all of the time.
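Out of curiosity, here is my guess at what a minimal palm-rejection check might look like. The paper doesn't spell out its method, so this is purely an assumption on my part: reject any contact whose area is bigger than a finger-sized threshold (the threshold and the touch format are made up).

```python
# Minimal palm-rejection sketch (my assumption: reject by contact size).
# A real system would also look at shape, pressure, and pen proximity.
PALM_AREA_THRESHOLD = 600.0   # made-up threshold, in touch-sensor units

def classify_contact(contact):
    """contact: dict with the touch point's bounding box width/height."""
    area = contact["width"] * contact["height"]
    return "ignore (palm)" if area > PALM_AREA_THRESHOLD else "accept (finger)"

touches = [
    {"id": 1, "width": 12, "height": 14},    # fingertip-sized contact
    {"id": 2, "width": 55, "height": 80},    # resting palm
]
for t in touches:
    print(t["id"], classify_contact(t))
```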