This site is about: (1) my professional self, (2) my research into cognition and (3) musings about the intersection of cognition and design.
Jason H. Wong
Basic cognitive research is a necessary component of successful user-centered design. Only through scientific thinking can we make technology intuitive and productive. My goal is to integrate basic research with useful applications.
From the January 2008 issue of Wired
FOUND: Artifacts of the Future
Or, as I would subtitle it: “A Human Factors Nightmare.”
All kidding and craziness aside, heads-up displays are coming into cars very quickly. Right now, they only display speed. The goal is to put information right in the driver’s field of view to minimize eye movements away from the road, similar to the Honda Civic dashboard. This makes sense from a cognitive perspective, except for one major flaw: people can only pay attention to one depth at a time. Therefore, attention must be shifted from the road (relatively far away) to the windshield (much closer) in order to glean necessary information. That still leads to performance problems. What some systems are trying to do is to project the information “into the world” so that the speedometer is still on the windshield but appears on the road, so you don’t have to shift attention in depth. Smart.
Of course, while it’s easy to argue that the dashboard is way too cluttered to make driving safe, complex GPS systems are already part of high-end cars. They can be complicated to use and certainly provide a distraction if they make a mistake. Presumably, you’re using one because you’re driving in unfamiliar territory, so when your map breaks, you’re forced to fiddle with it at the worst possible time.
There is a point of balance, where more information helps the driver drive better or multitask. But this makes human factors research all the more critical to ensure safety at all costs while not making driving an unbearable task.
Those automated voice menu systems
I had a heck of a time with a voice menu system earlier today. My flight from Chicago O’Hare (ORD) to Washington National (DCA) was canceled. I had to call United first to make sure that it was, and that required me to say my flight number, “616.” The system only heard “16” and I had to sit through the entire flight information before I could restart and say the correct number. Then, once I confirmed that the flight was canceled, I had to listen to 5 menu options to try to rebook a flight. In my opinion, rebooking is too complicated to be done through a computer - I needed an agent. So I just gave up and started saying “Representative!” Then I had to listen to the system ask for confirmation, then ask me if I wanted to do a customer satisfaction survey. Then it transferred me, then I got a busy signal and was hung up on. And then I had to do it again. The entire process took me about 30 minutes. Which gave me plenty of time to consider the content of this post.
First, some background:
The great thing about vision is that information can be presented in parallel: a single screen can display a ridiculous amount of information to absorb. Yes, attention is often a serial process, but the temporal dynamics of attention are under our control (as in, we choose how long we focus on something).
Audition, however, is a serial process. A spoken sentence is presented one word at a time, so you need to focus and remember every word. Also, the speed/loudness/etc. of the spoken information is not under our control.
There are two design choices I noted in United’s system that are necessary evils.
1.) The recorded voice speaks very slowly. This enables everyone who calls to understand the speaker. It also makes the already slow process of audition even slower, annoying people who want the speaker to go faster. But providing an option to make the voice speak faster would be so complicated that it likely isn’t practical. The slow voice is only an annoyance to some of the population, which is better than leaving a portion of the population completely unable to use the system.
2.) The system allows voice responses along with touchpad responses. This is useful for people because, otherwise, people would have to move the phone away from their face to press buttons. Being able to speak back when spoken to reinforces the idea that you’re having a conversation. However, voice recognition systems are not that great. You have to speak slowly and clearly. If I can already anticipate the question, I will speak my answer. Sometimes, a voice system is not ready to pick that up immediately. Numbers on a phone make a limited number of tones, making recognition much easier. This is likely more of a technological than human factors problem.
These design choices are coping mechanisms for the big problem with phone menu systems that is virtually unavoidable. You are trying to present multiple options to the user, and the user has to choose only one of them. This cries out for a parallel presentation of each possible option. However, since the phone is auditory, a serial presentation must be used, requiring working memory for which option is #1, which option is #2, and so on. And because users are impatient and don’t want to wait to see if option #7 is best for them, they may end up navigating to the wrong place. I have never come across a phone menu system that gets around this. The best workaround I’ve seen is to keep pressing zero, saying “Representative”, or sounding angry. Some systems will automatically forward you to a person if you sound angry. That is pretty cool.
I personally haven’t encountered any good solutions to this problem. Likely, some good solutions that have not been widely implemented are:
1.) Keep instructions short for people who can catch on quickly, but allow for the person to say “Help” in order to get a more thorough explanation.
2.) Really limit the number of options people have to remember. Don’t have a menu that has 8 options. Keep the number of options to fewer than 4, if possible. Users who are patient enough to listen to all of them may have a tough time deciding which option to choose, and impatient users will choose something just to get the system to stop talking.
3.) Improve voice recognition systems so that, instead of requiring users to speak a predefined option, they allow the user to speak a few key phrases. Then the system could interpret those keywords and direct the user to the right place. It would be akin to searching the Internet.
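To make idea 3 a bit more concrete, here is a minimal sketch of keyword-based routing. It assumes the speech has already been transcribed to text, and the department names and keyword lists are made up for illustration; they are not taken from United’s system or any real one.

```python
import re

# Hypothetical keyword-to-department table; every name here is illustrative.
ROUTES = {
    "rebooking": {"rebook", "canceled", "cancelled", "reschedule"},
    "baggage": {"bag", "baggage", "luggage", "lost"},
    "flight_status": {"status", "delayed", "arrival", "departure"},
}


def route_call(utterance: str, default: str = "agent") -> str:
    """Pick the department whose keywords best match the caller's words.

    Falls back to a human agent when nothing matches, so the caller is
    never forced to guess at a predefined menu option.
    """
    # Lowercase and split into plain words, dropping punctuation.
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    best, best_hits = default, 0
    for department, keywords in ROUTES.items():
        hits = len(words & keywords)  # how many keywords the caller said
        if hits > best_hits:
            best, best_hits = department, hits
    return best


print(route_call("My flight was canceled and I need to rebook"))
print(route_call("I just want to talk to someone"))
```

The point of the fallback is the same as in the rest of this post: when the system can’t tell what I want, handing me to a person beats making me listen to the menu again.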
Here’s hoping that these design and technological issues are being worked on and implemented quickly! Customer service may actually become bearable, then.
GUI Wars: Web Browser Find Functions: Safari vs. Firefox
This is a great example of using attention research in user interface design. Standard Find functions in programs like Microsoft Word pop up a dialog box. You type what you want to find, then it highlights the word. It’s hard to find that highlighted word a lot of the time.
Firefox improves the search process by making the search box a bar that is part of the main window. Research has shown that attention often is distributed across discrete objects, and switching between objects incurs a cost (Egly, Driver & Rafal, 1994). With this layout, you don’t need to shift your attention between objects (though the search bar is all the way at the bottom):
The highlighted word is not that hard to find, but depending on where the word is, it can be difficult to do. In this case, you don’t incur an attentional shift cost from the Find window to the main browser window, but you do have to engage in costly visual search for the highlighted word! Problematic.
The new version of Safari, however, fixes this incredibly well. It has the search bar right at the top, but it dims the entire page that’s not your search term and pops up and highlights in a bright yellow your search term. Luminance, motion, and color uniqueness. Talk about attention capture (Yantis & Jonides, 1984)!:
Making the Find tool part of the main window: excellent. Using animation to induce motion, brightness, and color uniqueness so that you can easily find what you were searching for? Genius.
Election Day!
Today is Election Day here in Virginia; we are voting for new State-level senators and delegates, along with local-level politicians. I voted today in the races where I felt educated. I did not vote for the new appointee to the soil and water board.
There was one big human factors issue from today: the voting machines. I haven’t seen too many problems while waiting in line, but there is one problem that I have always seen occur. Always. Once you’re done checking things off on the ballot, there is a big red button that says “VOTE” that you have to press. It looks like this:

You’re supposed to press the button, and then you get the confirmation screen:

The problem is that many people never realize that they have to press the big red button to actually cast their vote. Election officials often have to run after someone to have them press the button. Why?
If we adopt the mindset of the software developers, I believe the thinking goes like this: We want to give people one last chance to make changes, so there should be a big button at the very end that confirms once and for all that they will cast their ballot. It should be on its own screen to emphasize the finality of it. Let’s make it red and really big so people won’t miss it!
Back to the real world: this is valid thinking! People notice salient objects. Big bright things capture attention very nicely (Tsai & Peterson, 2006), and the fact that it says “VOTE” should indicate that people should press it to vote. So why don’t people press it? The thinking probably goes like this: I’ve checked off the boxes on who I want to vote for and the other ballot issues, so I’m done. I’m at a screen that says “VOTE” which must mean that it’s ready for the next person to vote. Great, all done!
Alternate ending: I see a screen that says “VOTE”… it looks like I’m done, I don’t think I have any buttons on this screen.
So what happened? Two things.
- The word “VOTE” on the big red button is not especially descriptive. It makes perfect sense that a new user would walk up to the screen and have to press the button to begin. Yes, there are instructions above the button, but who reads those? The button is big enough. Put the instructions on the button. Group them together on the salient object, and the instruction “PRESS HERE TO CAST YOUR BALLOT” should be informative.
- The button is really big. Like, unusually big. The rest of the interface doesn’t have buttons that big. It is possible that people can’t even tell that it’s a button that needs pressing. It’s not a user interface widget people are used to, so they don’t know what to do with it. Therefore, they don’t do anything.
Either way, these are both small issues that could be fixed. The clearer instructions would solve both issues. I’m sure these devices have been tested by users, but they’ve also been used for the past few years. Has there been no feedback from election officials on the Big Red Button issue? A simple change may save quite a bit of trouble.
Life with a digital speedometer
In my cognitive psychology class and on this blog, I have criticized the digital speedometer as not being as instantly useful as an analog speedometer, where you can quickly judge the angle of the needle and get a sense of how fast you are driving. However, buying a car is surprisingly emotional, and cognitive design principles can get thrown out of the window. In short: I bought a 2008 Honda Civic, and I’ve been driving it for about a week. How’s the digital speedometer?
It’s… usable. I don’t think it’s better than an analog speedometer. I can glance down and read the number, and interpreting my speed does not really take that much longer than an analog display. So I can get along just fine with it.
I do, however, take issue with why Honda made the change (source: Honda Press Release for the 2006 Civics, PDF version):
And they state the reason clearly again in the same press release:
The two-tier instrument panel positions priority gauges like the speedometer up high in the driver’s field of vision.
So Honda wanted to put the speedometer - critical information, to be sure - closer to the driver’s field of view. It’s a valid design choice, but why not move the analog speedometer upwards? My guess is that a traditional circular analog display would have been too large and would have interfered with driving. Therefore, as a compromise, Honda made the speedometer digital to save space. It doesn’t take up as much space as an analog version, and it has the benefit of looking pretty cool.
The problem with this is twofold. The first problem is that our peripheral vision is really poor. Our sharpest vision covers only about 1 degree of visual angle (the width of your thumb an arm’s length away is 1-2 degrees). Everything outside of that area is not very sharp. We can get some information out of it - maybe the angle of a line (analog speedometer) - but reading (digital speedometer) gets fuzzy. So Honda put the display closer to the driver’s field of vision, but made it more difficult to read by going digital. To make matters worse, there’s a concept called the Useful Field of View (Ball et al., 1988). Essentially, the more focused we have to be on a central task, the less we can pay attention to peripheral objects and events. Driving is an attention-heavy task, so we have even less attention to take in information in the periphery.
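If you want to check the thumb rule of thumb yourself, visual angle follows from basic trigonometry: an object of a given size at a given distance subtends 2 * atan(size / (2 * distance)). Here is a quick sketch; the 2 cm thumb width and 65 cm arm length are my own rough assumptions, not measured values.

```python
import math


def visual_angle_deg(size: float, distance: float) -> float:
    """Visual angle (in degrees) subtended by an object.

    size and distance must be in the same units (e.g. centimeters).
    """
    return math.degrees(2 * math.atan(size / (2 * distance)))


# Assumed values: a thumb about 2 cm wide held about 65 cm from the eye.
thumb = visual_angle_deg(size=2.0, distance=65.0)
print(f"Thumb at arm's length: {thumb:.2f} degrees")
```

With those assumed numbers the thumb comes out a little under 2 degrees, which lines up with the 1-2 degree range above and shows just how small the region of sharp vision is.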
The end result? I still have to move my eyes away from the road and down in order to read my speed. Moving critical information closer to the driver’s visual field is a good idea, but the implementation failed because of a lack of knowledge about vision and cognition. By not accounting for peripheral vision or the effects of attention to a central task, the repositioned digital speedometer fails at its goal of being better. It’s not a bad design, per se, but it does not improve the driving experience.




