This site is about: (1) my professional self, (2) my research into cognition and (3) musings about the intersection of cognition and design.
Jason H. Wong
Basic cognitive research is a necessary component of successful user-centered design. Only through scientific thinking can we make technology intuitive and productive. My goal is to integrate basic research with useful applications.
Incredible analysis of a display
The Xbox 360 dashboard is the user interface to the Xbox operating system. When the game console is booted up, the user is presented with a display that looks like:
The user can navigate to different sections of the display by using the Xbox 360 controller to switch between “Blades”: Games, Media, and other functions. It’s a fairly usable design, though it is difficult to make sense of initially (at least, based on my experience).
Over at the blog The Fanboys, there is a fantastic analysis of the 360 dashboard display. The dashboard is broken down into pixels and classified as being used for the user’s content, interactive items (buttons, menus, etc.), ad space, or blank space. The results are startling but also inform a smart redesign that minimizes dead space but does not lead to increased clutter. It’s a really impressive redesign.
The Fanboys: Dreaming of Dashboard 2.0
As someone who is starting to propose a display redesign for a submarine tactical system, this kind of analysis could be incredibly useful to implement. At the very least, it gets the mind thinking in a visual, yet quantitative, manner. Oftentimes, it is easy to be descriptive about changes that need to be made. But when you get sensible and realistic numbers, the case becomes far more convincing.
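The pixel-accounting idea is easy to sketch in code. The example below is a minimal sketch, not The Fanboys’ actual method: it assumes the screenshot has already been hand-labeled into a grid where each cell holds one of four category names, and it simply reports the share of the display each category occupies. The function name and the tiny demo grid are made up for illustration.

```python
from collections import Counter

# Category labels follow the four classes mentioned in the analysis:
# user content, interactive items, ad space, and blank space.
CATEGORIES = ("content", "interactive", "ad", "blank")


def category_shares(label_grid):
    """Return the fraction of cells (pixels) assigned to each category."""
    counts = Counter(label for row in label_grid for label in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("label_grid is empty")
    return {cat: counts.get(cat, 0) / total for cat in CATEGORIES}


# A made-up 4x5 "display" standing in for a labeled screenshot.
demo_grid = [
    ["blank", "blank", "interactive", "interactive", "blank"],
    ["blank", "content", "content", "ad", "blank"],
    ["blank", "content", "content", "ad", "blank"],
    ["blank", "blank", "interactive", "blank", "blank"],
]

for category, share in category_shares(demo_grid).items():
    print(f"{category:12s} {share:.0%}")
```

Running the same function on the labeled grid of a proposed redesign would give a before-and-after comparison in plain numbers, which is exactly what makes this kind of case convincing.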
Animations and the iPhone
Animation in computer interfaces has been used for as long as the technology has been able to support it. The infamous “Clippy” in Microsoft Office 97 through Office 2003 is an example of that. However, Clippy was almost universally despised because the animations tended to slow down the task at hand. Even worse, when the user was idle (possibly thinking about something), the on-screen character would do a little dance, providing a distraction. No wonder Clippy was hated even by Microsoft.

Animations can be useful, though, in the right context. On the iPhone/iPod Touch user interface, there are several examples of animations providing information about the state of the interface. The video below (created by me, which explains the awful production values) shows two instances of this. One is zooming and scrolling during navigation in Maps, and the other is scrolling in Safari, the web browser.
These animations naturally fit in the interface; they are not superfluous like Clippy. The zooming and scrolling in Maps provide information about location and space. As users progress through the fake turn-by-turn directions, Maps could simply display the next turn. Instead, Maps zooms out from the old location and zooms in to the new location. This provides the user with a sense of where they are in the global sense, but also where they’ve come from in the relative sense (“We’ve driven very far southeast”). This is useful in giving users a sense of situational awareness about the state of their trip.
The Safari animation is much more subtle. You scroll a page by touching the screen with your finger and dragging. When you reach the top or bottom of a page, trying to scroll further gives the sense of dragging the whole window, which visually implies that there is no more to see. This is incredibly smart for this interface. In a regular computer interface, scroll bars are used to give the user a sense of position in the document. Scroll bars would not fit well in the iPod Touch interface, however, because many users would feel they had to use this bar to scroll, and the finger is not precise enough to grab a narrow area like that. Instead, the iPod Touch uses the entire screen as an effective scroll bar.
The downside to this is that there is no indication of document position, which is especially crucial at the top or bottom of the document. If the user is at the bottom but thinks there is more to see, the user may try to scroll. If the animation were not present and the interface did nothing, it would look like the scroll command performed by the finger did not register. This would prompt repeated actions by the user, all met by silence from the interface. Instead, this natural “rubbery” action by the interface signals that there is no more document to see. It’s natural, informative, and unobtrusive, which makes for an excellent use of animation.
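To make the “rubbery” edge behavior concrete, here is a minimal sketch of one way an interface could damp scrolling past the end of a document. This is not Apple’s implementation; the function names, the `give` parameter, and the damping formula are all assumptions chosen for illustration.

```python
def rubber_band_offset(requested, content_height, view_height, give=0.5):
    """Map a requested scroll offset to the offset actually displayed.

    Inside the document the view follows the finger exactly. Past either
    edge, the overshoot is damped so the page lags behind the finger,
    which is what signals that there is nothing more to see.
    `give` (assumed to be between 0 and 1) controls how far the page
    stretches for a given drag.
    """
    max_offset = max(content_height - view_height, 0)

    def damp(overshoot):
        # Diminishing returns: the displayed overshoot grows more slowly
        # the further the finger drags, and stays below view_height.
        return (overshoot * view_height * give) / (view_height + overshoot * give)

    if requested < 0:                      # dragged past the top
        return -damp(-requested)
    if requested > max_offset:             # dragged past the bottom
        return max_offset + damp(requested - max_offset)
    return requested                       # normal scrolling


def release(displayed, content_height, view_height):
    """When the finger lifts, snap back to the nearest valid offset."""
    max_offset = max(content_height - view_height, 0)
    return min(max(displayed, 0), max_offset)
```

With a 1000-pixel document in a 400-pixel view, dragging 100 pixels past the bottom moves the page by less than half that distance, and `release` returns it to the true bottom. The key design point is that the interface always responds to the gesture, so the user never mistakes the end of the page for an unregistered touch.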
Map of Science
Tomorrow I leave for my 10-week summer internship in Newport, RI at the Naval Undersea Warfare Center. I’ll be working on Human-Systems Integration, which seems to be a military-specific term for human factors engineering. However, Human-Systems Integration is more than just human factors. It takes into account selection of personnel, training, and both physical and psychological factors of users of systems. It brings together a lot of research and reminds me of how interconnected science is.
A little over a year ago, the Information Esthetics group published what is effectively a map of science, titled Relationships Among Scientific Paradigms. It’s continually fascinating to see the links between all kinds of different fields. Click the image for a full-size version.
A description from Seed Magazine:
This map was constructed by sorting roughly 800,000 published papers into 776 different scientific paradigms (shown as pale circular nodes) based on how often the papers were cited together by authors of other papers. Links (curved black lines) were made between the paradigms that shared papers, then treated as rubber bands, holding similar paradigms nearer one another when a physical simulation forced every paradigm to repel every other; thus the layout derives directly from the data. Larger paradigms have more papers; node proximity and darker links indicate how many papers are shared between two paradigms. Flowing labels list common words unique to each paradigm, large labels general areas of scientific inquiry.
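The layout process described in that quote, with links acting as rubber bands while every node repels every other, is a force-directed layout. The sketch below is a toy version under assumed force laws (a linear spring for links and inverse-square repulsion), not the method used to build the actual map. The function name, the constants, and the three-node example are all hypothetical.

```python
import math


def force_layout(positions, links, steps=500, dt=0.05,
                 spring=1.0, rest_length=1.0, repulsion=1.0):
    """Iteratively move nodes under spring attraction and pairwise repulsion.

    positions: dict of node -> (x, y) starting coordinates
    links: iterable of (node_a, node_b) pairs that share papers
    """
    pos = {n: list(p) for n, p in positions.items()}
    nodes = list(pos)
    linked = {frozenset(link) for link in links}

    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in nodes}
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[b][0] - pos[a][0]
                dy = pos[b][1] - pos[a][1]
                dist = max(math.hypot(dx, dy), 1e-3)  # avoid divide-by-zero
                # Every pair repels (negative magnitude pushes a and b apart).
                magnitude = -repulsion / dist ** 2
                # Linked pairs are also pulled together like a rubber band.
                if frozenset((a, b)) in linked:
                    magnitude += spring * (dist - rest_length)
                fx, fy = magnitude * dx / dist, magnitude * dy / dist
                force[a][0] += fx
                force[a][1] += fy
                force[b][0] -= fx
                force[b][1] -= fy
        for n in nodes:
            pos[n][0] += dt * force[n][0]
            pos[n][1] += dt * force[n][1]
    return {n: tuple(p) for n, p in pos.items()}


# Hypothetical example: A and B share papers, C is linked to neither.
start = {"A": (0.0, 0.0), "B": (0.5, 0.0), "C": (0.25, 0.4)}
layout = force_layout(start, links=[("A", "B")])
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

In this toy run the linked pair settles at a short, stable distance while the unlinked node drifts away from both, which is the sense in which the layout “derives directly from the data.”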
User interfaces in Iron Man
First off: possible spoilers ahead!
Iron Man was an excellent movie. However, since this blog is not dedicated to movie reviews, I thought I’d discuss some of the user interface elements involved instead.
This video highlights two UI elements that were well thought-out. There are more I’d like to discuss, but I couldn’t find them. There are two clips: one is of a holographic prototyping interface and the other is the Iron Man suit user interface and flight interface:
Click to download (6.5 MB MP4 video)
Holographic Prototyping and Direct Manipulation on the Cheap:
The first clip ends at around 26 seconds and shows off Tony Stark’s (aka Iron Man) holographic prototype interface. The hologram is a sci-fi cliche, but its usefulness is immediately evident. Direct manipulation has been discussed before, and this typically requires something physical to manipulate. These physical prototypes are expensive to fabricate, especially if multiple revisions are needed. In the clip, Stark has already built out the specs for this piece of his suit, and he’s able to add and subtract parts and accurately visualize the effects of the modifications without fabricating lots of physical prototypes. The coup de grace is when he is able to stick his arm inside of the hologram and test it out. It’s direct manipulation of a prototype without the expense. Because this kind of manipulation is so natural, very little cognitive effort is needed to use this interface.
Flight Suit Interface and Transfer of Training:
The second clip starts at 27 seconds and demonstrates part of the suit’s user interface and the flight interface as well. The flight interface is very similar to that of a fighter jet, which will make transfer-of-training easy from a fighter jet to an Iron Man suit. This will cut down on the need to train users of the Iron Man suit - if they can fly a fighter jet, they can fly this suit.
Main Suit User Interface Voice Commands:
What was most interesting in the clip was how the general UI was controlled. The heads-up display is directly in front of the user’s face, but the user cannot touch the display. Therefore, direct manipulation is out of the question. The suit does take voice commands, as shown in the video. This is an obvious choice, but it is slow to use. Imagine flying at some insane speed under high stress: do you want to have to yell out a command that takes several seconds to issue, then wait for a reply from the suit? Probably not a good idea. The closest thing to this kind of voice interaction in the real world is the Microsoft Sync system. This system integrates Bluetooth phones and MP3 players into Ford cars and is entirely voice controlled. When it works, the eyes stay on the road for longer and less attention is required to make a phone call or play music. But when it doesn’t work, the error correction is simply a mess. It takes a huge amount of effort and is bad for driving or flying.
This is a great review of the Sync system (see 1:45 for an example of a Sync error and how the user simply gives up):
Main Suit User Interface and Eyetracking:
Besides voice commands, the other control option is eyetracking. If the eyes are focused on something in the environment, a command can be issued to zoom in, take a picture, etc. Issuing that command, however, would have to be done using a voice command or a button press. The eyes are needed to focus, so something other than the eyes must issue a command. This is not the ideal situation because it requires coordination of multiple systems: the eyes must remain focused on the target while another body part confirms the command. Overall, though, this is not too much of a problem. It is similar to tracking the cursor on a computer screen and clicking the mouse with your finger. Nonetheless, it is a less elegant solution, especially during flight or in combat, when the hands are required for another task.
Dueling Monitors
A study out of the University of Utah and written up in the Wall Street Journal’s Business Technology Blog showed that bigger monitors led to faster completion of document editing and spreadsheet tasks. There were three screens used: an 18-inch monitor, a 24-inch monitor, and two 20-inch monitors. Versus the 18-inch monitor, people were 52% faster with the 24-inch monitor and 44% faster with the two 20-inch monitors.
Now this is expected - bigger is better. But what I’m interested in is the 6% improvement moving from two 20-inch monitors to a 24-inch monitor. Two 20-inch monitors provide much more screen space, but it’s not just size that matters.
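Where does 6% come from? Subtracting the two reported figures gives 8 percentage points, so the 6% presumably comes from comparing the two setups to each other rather than to the 18-inch baseline. The quick calculation below is a sketch under one assumption that the write-up does not spell out: that “X% faster” means X% more work completed per unit of time.

```python
# Reported speedups relative to the 18-inch monitor.
single_24 = 0.52   # 24-inch monitor: 52% faster
dual_20 = 0.44     # two 20-inch monitors: 44% faster

# Assumption: "X% faster" means a work rate of (1 + X) times the baseline.
rate_24 = 1 + single_24
rate_dual = 1 + dual_20

point_gap = (single_24 - dual_20) * 100          # gap in percentage points
relative_gain = (rate_24 / rate_dual - 1) * 100  # 24-inch vs. dual 20-inch

print(f"Gap in percentage points: {point_gap:.0f}")
print(f"Relative improvement: {relative_gain:.1f}%")
```

Under that reading, the single 24-inch monitor is about 5.6% faster than the dual 20-inch setup, which rounds to the 6% figure. If “faster” instead meant less time per task, the relative difference would come out larger, so the exact number depends on how the study defined its measure.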
Egly, Driver & Rafal (1994) were the first researchers to show the existence of object-based attention. That is, attention does not just form a spotlight (or zoom lens) that illuminates a particular portion of the visual field. Instead, attention can also mold itself to encompass a specific object, and there is a cost in switching between objects. The methodology they used was particularly ingenious.
The task was simply to detect a block that would appear in one corner of either rectangle (see below). That was it: press a button when you see the block (rightmost panel). Before the block, though, other things happened. In the second panel, you see that one corner of one object was also cued: it suddenly turned red. Participants did NOT have to respond to the cue, only to the block. So participants started a trial, received a red cue, waited, and then a block would flash. Reaction time was the primary measure: how long it took participants to press a button after the block flashed.
The red cue served to prime attention to a certain location. In the example above, the red cue and the block target were in the same location, and reaction time was fastest. However, sometimes, the block could appear elsewhere. There are two critical conditions:
- The block was at the other end of the same object as the cue.
- The block was at the same end of the other object, opposite the cue.
What is critical to note is that, in these two conditions, the block and cue are the exact same distance apart. If attention were purely spatial and did not care about objects, reaction times in both conditions should be the same. This was not the case, though. Instead, participants were faster at detecting the block when it was located at the other end of the same object as the cue. The cue brought attention to that location, then attention spread to the entire object. Therefore, when the target appeared on the same object, reaction time was faster. When the target appeared on the other object, attention had to be switched, and this led to slower reaction times.
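The equal-distance point can be checked with a little geometry. The sketch below assumes, purely for illustration, two vertical rectangles whose four ends sit on the corners of a square; the coordinates are made up and are not the stimulus dimensions from the paper. It labels each possible target location relative to a cue and prints how far it is from the cue.

```python
import math

# Hypothetical layout: two vertical rectangles, "left" and "right".
# Each possible cue/target location is an end of a rectangle, and the
# four ends are assumed to form the corners of a unit square.
LOCATIONS = {
    ("left", "top"): (0.0, 1.0),
    ("left", "bottom"): (0.0, 0.0),
    ("right", "top"): (1.0, 1.0),
    ("right", "bottom"): (1.0, 0.0),
}


def classify_trial(cue, target):
    """Label a trial by how the target relates to the cued location."""
    cue_obj, cue_end = cue
    tgt_obj, tgt_end = target
    if cue == target:
        return "cued location"
    if cue_obj == tgt_obj:
        return "same object, other end"
    if cue_end == tgt_end:
        return "other object, same end"
    return "other object, other end"


def cue_target_distance(cue, target):
    """Straight-line distance between the cue and target locations."""
    return math.dist(LOCATIONS[cue], LOCATIONS[target])


cue = ("left", "top")
for target in LOCATIONS:
    print(f"{classify_trial(cue, target):24s} "
          f"distance = {cue_target_distance(cue, target):.2f}")
```

In this layout the two critical conditions are the same distance from the cue, so a purely spatial account of attention predicts identical reaction times for them. Any difference between them has to come from the object boundary.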
So what does this all mean for the research at hand? Two 20-inch monitors are two separate objects. Even if they are both placed perfectly in your field of view, you will have to make eye movements and shift attention between the two monitors. This is going to slow you down more than if you had a single object (a single 24-inch monitor) in front of you. In this case, you don’t have to switch your attention between objects.
This does raise the question, however: what exactly constitutes an object? Two separate physical monitors are certainly separate objects. But if you have two spreadsheets open and are copying data from one to another, does that count as switching between objects? Would it be better to copy and paste inside one spreadsheet and then make one large copy and paste to the new spreadsheet right at the end? I don’t know the answers to these questions, but they are certainly worthy of research.




