This site is about: (1) my professional self, (2) my research into cognition and (3) musings about the intersection of cognition and design.
Jason H. Wong
Basic cognitive research is a necessary component of successful user-centered design. Only through scientific thinking can we make technology intuitive and productive. My goal is to integrate basic research with useful applications.
The DoD needs cognitive psychologists!
SBIR call for “A psychologically inspired object recognition system”
The DoD has put out a call for proposals for the development of an object recognition system for computers that obeys psychological principles. Object recognition is obviously important for humans, and as more robots are being used in place of humans, they should also be able to identify objects to aid in mission success. The project description sounds like a short review of the object recognition literature, in fact (emphasis added by me):
Recognizing and identifying an object from a video input turns out to be a very difficult problem. The problem stems from the fact that a single object can be viewed from an infinite number of ways. By rotating, obscuring, or scaling a single object, one can create multiple representations of an object - which makes the problem of matching the object to a database of objects very difficult. The problem expands exponentially when objects that need to be identified have never been viewed before. Combine these limitations with the wide variety of objects which might be identified, and the problem becomes intractable. One solution is to study and understand how human beings recognize objects in the real world and duplicate that functionality in a series of algorithms. Recent research (Tarr and Bulthoff, 1995) has indicated that humans use not one algorithm, but multiple algorithms for the task of object recognition - depending on the object being recognized and the situation at hand. Specifically, research has shown that people use template based algorithms (i.e. similar to the database matching algorithms described earlier) in addition to Geon based (Beiderman, 1995) algorithms and feature based algorithms.
First of all, 1995 counts as recent research? Sounds like some DoD scientists need to attend the Vision Sciences conference. Secondly, it is satisfying to see that the DoD believes that understanding how the human mind works is a big step in implementing human-like cognition in artificial systems.
This is similar to the field of biorobotics, where an understanding of how natural organisms work (say, a dolphin) can be applied to machines (say, a submarine). Biological organisms are highly evolved - nature has already done the work of selecting what works best. By studying those solutions, we can apply the same principles in designing our own machines. It seems like a new principle in engineering, but it makes a lot of sense.
ClearType: Font smoothing and usability
What a difference ClearType makes! Windows does not turn this feature on by default, and I am not used to displays without this feature. While I am not a typographer, the difference is quite evident to me. Without ClearType:
With ClearType:
These two links explain a lot about why ClearType works and why it eases eye strain.
Anti-aliasing on OS X
ClearType on Windows
This is a case of physical ergonomics, but not about understanding arm length or knee flexibility. Instead, it’s about understanding how our eyes work and adapting systems to that fact.
Animations and the iPhone
Animation in computer interfaces has been used for as long as the technology has been able to support it. The infamous “Clippy” in Microsoft Office 97 through 2003 is an example of that. However, Clippy was almost universally despised because the animations tended to slow down the task at hand. Even worse, when the user was idle (possibly thinking about something), the on-screen character would do a little dance, providing a distraction. No wonder Clippy was hated even by Microsoft.

Animations can be useful, though, in the right context. On the iPhone/iPod Touch user interface, there are several examples of animations providing information about the state of the interface. The video below (created by me, which explains the awful production values) shows two instances of this. One is zooming and scrolling during navigation in Maps, and the other is scrolling in Safari, the web browser.
These animations fit naturally in the interface; they are not superfluous like Clippy. The zooming and scrolling in Maps provides information about location and space. As users progress through the fake turn-by-turn directions, Maps could simply display the next turn. Instead, Maps zooms out from the old location and zooms in to the new location. This gives the user a sense of where they are in the global sense, but also where they’ve come from in the relative sense (“We’ve driven very far southeast”). This is useful in giving users situational awareness about the state of their trip.
The Safari animation is much more subtle – you scroll a page by tapping on the screen with your finger and dragging. When you reach the top or bottom of a page, trying to scroll more gives the user the sense of dragging the whole window, which visually implies to the user that there is no more to see. This is incredibly smart for this interface. In a regular computer interface, scroll bars are used to give the user a sense of position in the document. Scroll bars would not fit well in the iPod Touch interface, however, because many users would feel they had to use this bar to scroll, and the finger is not precise enough to grab a narrow area like that. Instead, the iPod Touch uses the entire screen as an effective scroll bar.
The downside to this is that there is no indication of document position, which is especially crucial at the top or bottom of the document. If the user is at the bottom but thinks there is more to see, the user may try to scroll. If the animation were not present and the interface did nothing, it would look like the scroll command performed by the finger did not register. This would prompt repeated actions by the user, all met by silence from the interface. Instead, this natural “rubbery” action by the interface signals that there is no more document to see. It’s natural, informative, and unobtrusive, which makes for an excellent use of animation.
Navigating in Three Dimensions
Recently, the Human Factors and Applied Cognition program had a guest speaker, Dr. Charles Oman from MIT. He spoke on spatial cognition in astronauts, because zero gravity is a unique environment for navigation. There is no natural down - when you can orient yourself in any direction, it becomes much more difficult to anchor your perception of space to a single point. This makes navigation, and even basic perception, a difficult task.
Here’s an example: you are an astronaut on the space shuttle. Before settling into your sleeping bunk for your six hours of sleep, you look out the window, see the Earth below you, and shut the blinds (there really are blinds). You wake up, not knowing the shuttle has rotated 180 degrees while you slept. When you pull open the blinds, you expect Earth to be below, but because the shuttle has rotated, it is above you. Your spatial sense is instantly destroyed, your feet and head seem to be in the wrong place, and - apparently - you vomit instantly. What you expect is not what you perceive or feel, and this leads to a massive body-environment disconnect.
While my line of work with the Navy should hopefully never lead to instant vomiting, this did get me thinking about navigation in a 3-D space. Normally, humans are flatlanders. However, in planes and on submarines (of direct interest to me), you have to think three-dimensionally, which we’re not so good at. How do submarine navigators learn to navigate in 3-D space? Does this improve their spatial skills? How good would they be at Tetris?
I am excited to begin learning about submariners, their training in navigation, and how systems need to be designed to take this extra dimension into account. I’ll have an expert group of participants for my experiments, which leads to all kinds of excellent ideas.
NASCAR: The necessity of top-notch vision.
So you’re driving down a racetrack at 200 mph. Things are flying past you at phenomenal speed, and you need to make sense of it all. Do you need spectacular vision? Sure. But it’s not just acuity (how sharp your vision is at the point where your eyes are focused) that matters. What also matters is how well you can process things in the periphery. From an article about NASCAR driver Tony Stewart in the New York Times:
For starters, Stewart has superb eyesight — 20/13 in one eye, 20/15 in the other — but it’s not visual acuity that matters so much as a driver’s ability to process everything that drifts into his periphery while he travels at 200 m.p.h. “A driver has to know what’s unfolding in front of him at a rate of a football field a second,” says Dr. Stephen Olvey, a founding fellow of the F.I.A. Institute for Motor Sport Safety.
When so much optic flow (a technical term meaning “stuff passing you as you are in motion”) is occurring, it makes sense that you have to be able to deal with something that suddenly appears in your peripheral vision, like another car, debris, or the wall. Not only do you have to detect that event, you also have to react to it. You need to make a saccadic eye movement to bring the event from the periphery into the fovea - the small portion of central vision where acuity is the best.
A normal person can make a saccade within 250 ms (a rough estimate). That’s one-quarter of a second. 200 mph = 293 feet per second. Therefore, in the time it takes to make a single eye movement, you’ve traveled 73 feet at 200 mph. Add to that the fact that you’re effectively blind during a saccade, and suddenly 73 feet has passed before you know it.
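For anyone who wants to check the arithmetic above or try other speeds, here is a minimal sketch in Python. The function names are my own, and the 250 ms default is just the rough saccade estimate from the paragraph above, not a measured value.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def feet_per_second(mph: float) -> float:
    """Convert a speed in miles per hour to feet per second."""
    return mph * FEET_PER_MILE / SECONDS_PER_HOUR


def distance_during_saccade(mph: float, saccade_ms: float = 250.0) -> float:
    """Feet traveled while one eye movement is made.

    saccade_ms defaults to the rough 250 ms estimate used in the post.
    """
    return feet_per_second(mph) * saccade_ms / 1000.0


if __name__ == "__main__":
    speed_mph = 200.0
    print(f"{speed_mph:.0f} mph = {feet_per_second(speed_mph):.0f} feet per second")
    print(f"Distance covered in one saccade: {distance_during_saccade(speed_mph):.0f} feet")
```

At 200 mph this gives roughly 293 feet per second and 73 feet per saccade, matching the figures above and the "football field a second" in the quote.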
Want to know how good your peripheral vision is? The New York Times article mentioned above has an excellent demonstration that tests your peripheral vision and how quickly you can move your eyes. And if your performance is not as good as you’d like, you can train like a NASCAR champ:
Greg Zipadelli, Stewart’s crew chief, says his driver hones his talent with a popular training tool: PlayStation.
Thanks to John Fedota for the link to the original New York Times article!

