Alex Poole - Interaction design and research

Abstract

Bookmarks are a useful way for people to return to web pages they visited previously, but it can be difficult to find a specific link when the bookmark collection grows too large. Graphical aids do exist to make bookmarks 'stand out', such as icons and thumbnails, but these are not universally applied. This study attempts to find the optimum way of writing bookmarks so that they can be recognised more effectively in a visual search of the bookmark menu when no graphical aids are available.

30 post-graduate students were presented with a series of news websites followed each time by a menu of bookmarks. Their task was to find the bookmark they thought corresponded to the website they had just seen. The structure of the bookmark was manipulated (top-down or bottom-up information structures) as well as the number of informational cues (one, two or three). The time taken to find each bookmark was measured and eye movement data was gathered to provide a deeper understanding of the participants' visual search behaviour and related cognitive processing.

The number of cues on display in a bookmark was a significant factor in recognition time: two cues were found to be necessary for optimal recognition, one cue was highly sub-optimal, and a third cue added no recognition value at all. However, top-down and bottom-up bookmark structures were found to be equally salient.

Keywords

Bookmarks, eye-tracking, information salience, visual search, World Wide Web

1. Introduction

1.1) Keeping found things found

Since its inception 12 years ago, the World Wide Web has grown at a phenomenal rate, far beyond that of any comparable medium. Accurate estimates of the size of the web are hard to acquire. The largest search engine, Google, claims to hold 3 billion web pages in its database ('Benefits of Google'), but this can only be a small fraction of the total number of web pages currently in existence (Lawrence & Giles, 1999).

Although the Web serves as the primary information resource for many people, its massively increasing size and complexity have made 'information overload' one of the biggest and most obvious drawbacks of the technological age. Thankfully, in recent years finding resources on the web has been made easier with modern search engines such as Google, together with more refined search functions found within websites themselves. But managing to successfully find a web page invites a secondary problem - how do you keep it 'found'? (Jones, Bruce & Dumais, 2001).

Users have many different methods of 'keeping' resources found on the web. They save whole pages to their hard drives, print them out, send URLs to themselves in an email, write them down on a piece of paper or add them to the "bookmarks" list in their web browser (Jones, Bruce & Dumais, 2001; Cockburn & McKenzie, 2000; Tauscher & Greenberg, 1997). The last method, 'bookmarking', will be the focus of this study.

1.2) Bookmark basics

Bookmarks have been in existence since the creation of the first World Wide Web browser in 1991 (Cailliau, 2002), and have been adopted by most web browsers as a standard navigation and revisitation tool, although they are referred to by different names for marketing reasons. The term 'bookmark' is used in the Netscape Navigator browser, the equivalent term being 'favorites' in Internet Explorer, as shown in Figure 1 below. (The term 'bookmark' will be used throughout this paper and is synonymous with 'favorites' and 'links'.)

Figure 1
'bookmarks' and 'favorites' menus.

Netscape Navigator and Internet Explorer present bookmarks in slightly different ways but the essential functionality is identical

The text in a bookmark begins its life as the title of a web page, found in the <title> tag in the HTML code that is used to build the page (Figure 2).

Figure 2
The title text comes from the <title> tag as defined in the HTML code.

A fragment of HTML code showing the <title> tag
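
As an illustration of how title text can be pulled out of a page, the following is a minimal Python sketch using the standard library's html.parser module. The sample page is hypothetical (its wording is borrowed from the Orlando Sentinel example discussed later in this section), and the names TitleParser and extract_title are introduced here for illustration only. It is not a description of how any particular browser does the job.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> element is of interest.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace into single spaces.
        return " ".join("".join(self._parts).split())


def extract_title(html):
    """Return the <title> text of an HTML document, or '' if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


# Hypothetical page: the <title> text differs from the headline on the page.
SAMPLE_PAGE = """<html>
<head>
<title>Orlando Sentinel: Space</title>
</head>
<body>
<h1>Nasa knew risks, let safety slide</h1>
</body>
</html>"""

print(extract_title(SAMPLE_PAGE))
```

Note that the headline inside the body is ignored: only the text between the opening and closing title tags is returned.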

The content of this tag is extracted and used for various functions in Microsoft Windows. It is used firstly as the title of the page in the top bar of the web browser and for the icon representing the browser on the Start bar when the browser is minimised (Figure 3). If the user decides to save the web page to their hard drive, the title is used as the filename. The <title> text also appears in the 'History' list of the web browser and of course in the bookmark if the user decides to keep the page by that method. Finally, the text also appears in the tool tip that pops up when the mouse pointer is held over the bookmark in Internet Explorer (Figure 3).

Figure 3
The <title> tag text is extracted and used for various functions in Windows: it appears in the browser's top bar, as the 'Start' bar icon and in the history list. It is also used for the bookmark and the corresponding tool tip, and to name a saved file.

Examples of how the <title> tag text is used in various windows functions

It is important to note that the text in the <title> tag does not actually appear on the web page itself, and is not necessarily the same as the 'title' appearing within the web page, which has to be defined separately by the author (Figure 4).

Figure 4
An example of when the title text does not match the 'real' title of the page.

The browser's top bar reads "Orlando Sentinel: Space" but the headline on the page is "Nasa knew risks, let safety slide"

1.3) Good housekeeping

There are a few basic things a web author should do in order to write an acceptable bookmark, based on the complaints of web users (Kassten, Greenberg, & Edwards, 2002; Cockburn, Greenberg, Jones, McKenzie & Moyle, 2003).

First, they must remember to actually define the <title> tag. If the <title> tag is empty or even missing from the HTML code, the filename and directory path of the page will be shown instead of a meaningful title, for example http://www.hppmusicindex.com/out.asp or http://thezfiles.co.uk/seek_ae5663dc.htm.

If the author is using web publishing software such as Macromedia Dreamweaver, the programme's default text will be displayed if they don't define the <title>. This can be recognised frequently on the Web by pages marked "Untitled".

The author must ensure that the <title> tag and the 'title' within the page actually match. Differences between the two have been cited by users as a major annoyance when trying to locate a bookmark (Kassten, et al., 2002). Also, authors should ensure that each page on their website has a unique title to aid multiple bookmarking of pages from the same site.

Lastly, the author has to make the title fit within the bookmark character length limit. In Windows, the maximum length for a bookmark is 255 characters (including spaces), but only the first 65 characters on average will be visible in the 'favorites' menu in Internet Explorer, although all 255 characters should appear in the tool tip (see Figure 5 below). Only an average can be given because the number of characters visible depends on the width of the letters used (if Windows used a monospaced font for menus, the limit would be the same each time).

Figure 5
On the 'favorites' menu in Internet Explorer, the tool tip displays 255 characters while the bookmark only displays 65 characters on average.
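
The housekeeping rules in this section can be summarised in a short Python sketch. The two limits are the figures quoted above; everything else is an assumption made for illustration: the function names are hypothetical, the visible width is treated as a fixed character count even though it really depends on letter widths, and the trailing "..." is simply one plausible way of marking truncated text.

```python
MAX_STORED_CHARS = 255   # maximum bookmark length in Windows, as quoted above
AVG_VISIBLE_CHARS = 65   # average number of characters visible in the menu


def bookmark_text(title, url):
    """Return the text a bookmark would carry for a page.

    If the title is empty or missing, fall back to the page address,
    mirroring the behaviour described for pages with no <title> text.
    """
    cleaned = " ".join((title or "").split())
    if not cleaned:
        cleaned = url
    return cleaned[:MAX_STORED_CHARS]


def menu_preview(text, visible=AVG_VISIBLE_CHARS):
    """Approximate what fits in the menu (assumption: fixed-width letters)."""
    if len(text) <= visible:
        return text
    return text[: visible - 3] + "..."


if __name__ == "__main__":
    print(bookmark_text("", "http://thezfiles.co.uk/seek_ae5663dc.htm"))
    long_title = "BBC NEWS - Middle East - Top Saddam official surrenders " * 3
    print(menu_preview(bookmark_text(long_title, "http://example.invalid/")))
```

The sketch makes the practical consequence visible: whatever identifies the page should appear within roughly the first 65 characters, since anything later is only seen in the tool tip.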


2. Purpose of the Experiment

2.1) Rationale for studying text-only bookmarks

Bookmarks are a convenient way to revisit web pages, until your bookmark list grows so large you can no longer find the bookmark you need. This task becomes even more difficult when returning to the list after a long time, most likely with a fragmented memory of what the bookmark text actually was.

To address these problems, there has been a lot of productive research in making bookmarks easier to find and organise (Cockburn & Greenberg, 1999; Cockburn et al., 2003; Kassten, et al., 2002; Abrams, Baecker & Chignell, 1998; Tauscher & Greenberg, 1997). Custom icons can make the bookmark reference stand out, as shown in Figure 6.

Figure 6
Custom icons can make bookmarks easier to spot.

Some custom icons on a 'favorites' menu

An extension of this idea is the use of thumbnail images of the websites themselves next to the text bookmark, as in Figure 7. This has been shown to be a successful approach (Cockburn, et al., 2003), but has yet to be adopted by Internet Explorer as a standard revisitation mechanism.

Figure 7
Thumbnail images of web sites next to their text bookmarks can aid recognition (figure copyright Cockburn & Greenberg, 1999).

A prototype thumbnail bookmark menu developed by the University of Calgary (Figure copyright Cockburn & Greenberg, 1999)

The research states that there are severe limitations to using the <title> tag text for bookmarks, as mentioned in the previous section (Kassten, et al., 2002; Cockburn, et al. 2003). However, these 'limitations' are not inherent in the <title> tag system, but stem from Web producers' own habits. Icons are subject to the same limitations, since they are also created by Web producers. Some of these icons can be rather obscure and may not in fact aid recognition of the bookmark.

Furthermore, the advantages of icons may be short lived. If the use of custom icons becomes widespread, and every bookmark has one attached, their 'pop out' value will be greatly reduced and recognition time is likely to be just as slow as it can be for text.

Thumbnails also have their own recognisability problems. Text-based pages are hard to recognise at any resolution and pages from web sites that are consistently designed are hard to differentiate (Cockburn & Greenberg, 1999). Thumbnails also consume a high proportion of screen real-estate. Each bookmark on the favorites menu in Internet Explorer occupies 20 pixels of vertical space; however, to achieve just a 60% chance of recognising a particular web page, a thumbnail 144 pixels high is required (Kassten, et al., 2002).

Accessibility and usability may also be problematic for visual recognition aids. Icons and thumbnails are of little to no benefit for visually impaired users, but plain text can always be interpreted by voice web browsers. Similarly, other systems such as file organisers, search engines and databases may not be able to interpret graphical representations. For example, it may be difficult to implement automatic and meaningful bookmark sorting based on graphical properties.

In terms of usability, it is not clear if icons and thumbnails will transpose well to PDAs and mobile phones. These devices have extremely limited screen real-estate, and thumbnails in particular may have to fill most of the screen in order to be recognised.

The research has shown that these visual aids can make bookmarks stand out, but this same research does not propose how to make web pages easier to recognise when they are represented by standard text-only bookmarks.

It is clear that text-based referencing is still a major force on the Web and warrants continued research and improvement. This study is intended as an initial step towards this aim by finding the factors that most affect the recognition of text bookmarks.

2.2) Types of bookmark: Top-down & Bottom-up structures

Most web producers use their common sense when writing bookmark text. They ensure that the text satisfies some basic criteria, then choose some appropriate information to identify the page, such as the name of the website or the subject of the page, etc. (Table 1).

Table 1
Potential appropriate contents for a bookmark.
Identifying information
  • The content or subject of the page
  • The name of the website that hosts the page
  • A description of the host website
  • The name of the author
  • The date
  • The name of the section of the website that the page comes from (could be useful in very large websites where the same page title appears in more than one context)
  • Reference numbers

They then have to choose how they will put this information together.

Many web producers who manage large websites with several levels of information choose to model the <title> tag text on how this information is organised on the site. This can help users while they navigate, because their navigation trail is built up in a logical way, giving them feedback on where they are, and how they got there, as shown in Figure 8 (Preece, Rogers & Sharp, 2002).

Figure 8
The navigation trail through a website can be shown in the browser's top bar

The browser's top bar shows the top information level: "BBC", then the next level when we navigate deeper: "BBC | News front page"

Two common ways of describing these information structures are 'top-down' and 'bottom-up' (Rosenfeld & Morville, 2002). A top-down structure may list the name of the site, followed by one or more sections and finally the title of the page (Table 2).

Table 2
Bookmarks with a 'top-down' information structure.
Model: site name - section name - page title
Examples: BBC NEWS - Middle East - Top Saddam official surrenders
Lancaster University - Postgraduate students - Alex Poole

Conversely, a bottom-up structure starts with the title of the page, and ends with the name of the site (Table 3).

Table 3
Bookmarks with a 'bottom-up' information structure.
Model: page title - section name - site name
Examples: Top Saddam official surrenders - Middle East - BBC NEWS
Alex Poole - Postgraduate students - Lancaster University

Both structures could reasonably identify a page, but how do we know which one will be most recognisable to users when they are searching in a large list, with imperfect memory?

2.3) Saliency and recognition

The first assumption for the present study is that the information at the start of the bookmark is somewhat dominant. The leading information defines the structure of the bookmark - if it starts with the title of the page, it must be bottom-up, and if it leads with the name of the site, it must be top-down.

Also, the leading information has a higher 'profile' as users tend to 'scan' down the left-hand side of a menu (Altonen, Hyrskykari & Räihä, 1998), sometimes only reading the first word or two of each list item.

Bearing this in mind, if there is a difference in saliency between the title of the page and the name of the site, this should affect the salience of the bookmark as a whole.

The second assumption for the present study is that the page title is likely to be more salient than the site name, meaning that bottom-up structures may be more salient than top-down structures. There are several reasons why this might be so.

Firstly, users' actions are driven by goals and tasks (Preece, et al., 2002). Visually searching the bookmark menu is an example of goal-driven behaviour - the user is searching the menu to find a particular bookmark, for a particular reason. Anything that is tailored to the user's task will improve the usability of the system (Nielsen, 1992). Since the title of the page describes what the user was specifically reading, while the site name may be completely unconnected to the subject matter, the page title is likely to fit the user's task better than the name of the site, improving relevance and potentially improving recognition.

Secondly, the fuller descriptions afforded by page titles may be more likely to evoke stronger mental imagery, which is known to aid memory and recognition (Clark & Paivio, 1987). Likewise, they may fit better into our existing knowledge structures, aiding subsequent recognition (Alba & Hasher, 1983; Bartlett, 1932).

The final assumption is that the number of components in the bookmark is likely to affect recognition. The more pieces of information that are displayed, the more we will be able to infer the meaning or identity of the whole bookmark. The possible interpretations are constrained by the context brought by the extra information (Rumelhart & Norman, 1985).

Measurement

In the present study, recognition will be measured by the time taken to find a target bookmark embedded within a set of distractor bookmarks. Faster times will be taken as indicating superior recognition.

Eye movements will also be used as a measure of information salience: In particular, the number of fixations on a bookmark component, together with fixation duration, will be taken as an index of relative salience. A detailed justification for using eye movements is provided in section 3. Suffice it to say that the use of eye movements in the present study is based on the assumption that they provide an on-line measure of the processing demands associated with items of information, such that more processing would reflect decreased salience and less processing would reflect improved salience.
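
To make the two eye movement measures concrete, the following Python sketch counts the fixations that land on a screen region and summarises their durations. It is an illustration only: the Fixation and Region types, the pixel coordinates and the durations are all hypothetical, and the analysis software actually used in the study is not described in this section.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float            # horizontal gaze position in pixels
    y: float            # vertical gaze position in pixels
    duration_ms: float  # how long the eyes stayed there


@dataclass
class Region:
    """A rectangular screen area, e.g. one component of a bookmark."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fix):
        return (self.left <= fix.x <= self.right
                and self.top <= fix.y <= self.bottom)


def salience_measures(fixations, region):
    """Fixation count and durations for the fixations inside a region."""
    hits = [f for f in fixations if region.contains(f)]
    count = len(hits)
    total = sum(f.duration_ms for f in hits)
    mean = total / count if count else 0.0
    return {
        "fixation_count": count,
        "total_duration_ms": total,
        "mean_duration_ms": mean,
    }


if __name__ == "__main__":
    # Hypothetical data: two fixations on the target bookmark, one elsewhere.
    fixations = [
        Fixation(100, 210, 200),
        Fixation(120, 215, 300),
        Fixation(400, 500, 250),
    ]
    target = Region(left=50, top=200, right=300, bottom=220)
    print(salience_measures(fixations, target))
```

Under the assumption stated above, fewer and shorter fixations on a region would be read as less processing and therefore greater salience.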

3. Eye Movements

3.1) Eye movement basics

When reading, it feels as though our eyes are moving along the page in one smooth sweep, but this is an illusion. Our eyes actually move along a line of text in a series of quick jerks called 'saccades', each typically lasting 20 to 35 milliseconds (ms). After each saccade, the eyes stay relatively still while taking in, or 'encoding', information. These moments are called 'fixations', and they last for 218ms on average, although the range is 66 to 416ms. Sometimes the eyes move back in the direction of text that has already been read - these regressive saccades are known as 'regressions' (Rayner & Pollatsek, 1989). Scan paths are recurring patterns of saccades and fixations - in this study, it is assumed that people adopt a scan path which runs over the far left of the bookmark menu (Altonen, Hyrskykari & Räihä, 1998).

The information available in a fixation is defined by the total perceptual span - it is the region in which letters can be recognised as well as the spaces between words (Figure 9). Due to the anatomy of the eye, visual acuity drops off towards the edges of the perceptual span, providing only lower grade information at the extremes (Rayner & Pollatsek, 1989; Lansdale & Ormerod, 1994).

Information on word length can be picked up from up to 12-15 characters to the right of the fixation centre and 3-4 characters to the left, although specific letter information can only be detected up to 10 characters to the right of centre. The area in which reliable and accurate word identification takes place is actually 7-8 characters to the right and 3-4 characters to the left of the centre of the fixation. This is known as the 'word identification span' (Ojanpää, Näsänen & Kojo, 2002).

Figure 9
Perceptual span with decreasing acuity at the extremes.

Accurate word identification ranges from 3-4 characters to the left and 7-8 characters to the right of the centre of the eye fixation
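
The word identification span can be pictured as a window of characters around the fixation centre. The Python sketch below is a deliberate simplification: it takes the upper figures quoted above (4 characters to the left, 8 to the right) as hard limits and ignores the gradual loss of acuity towards the edges. The function name and the choice of fixation point are hypothetical.

```python
def identification_window(text, centre, left=4, right=8):
    """Return the characters that fall inside the word identification span.

    centre is the index of the fixated character; left and right are the
    number of characters assumed to be identifiable on each side of it.
    """
    start = max(0, centre - left)
    end = min(len(text), centre + right + 1)
    return text[start:end]


if __name__ == "__main__":
    bookmark = "BBC NEWS - Middle East - Top Saddam official surrenders"
    # A fixation centred on the 'N' of 'NEWS' (index 4).
    print(identification_window(bookmark, 4))
```

With a fixation near the left edge of a top-down bookmark, the window covers the site name and little else, which is why the leading information carries so much weight when a menu is scanned down its left-hand side.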

The form of the perceptual span is not set at birth, but learnt according to which language system the person grows up with. For readers of languages that are written from right to left such as Arabic and Hebrew, the perceptual span is reversed, with 3-4 characters to the right and 12-15 characters to the left of the fixation centre (Rayner & Pollatsek, 1989).

3.2) Interpreting eye movements

Eye movements are in many ways a purer measure of recognition (and therefore saliency) than simple response time. Measuring the eye movements on one particular bookmark eliminates the time spent searching for it, as well as the time spent selecting it (Zelinsky & Sheinberg, 1995).

Although some researchers argue that fixations and saccades are programmed automatically regardless of the words being processed (O'Regan, 1992), the assumption for this paper is that these eye movements, especially fixation durations, are an index of the amount of cognitive processing being applied to the item being fixated (Goldberg & Kotval, 1999; Just & Carpenter, 1976).

Fixations

Fixation frequency and duration are the main measures of cognitive activity in the present study. Fixations can be interpreted quite differently depending on the situation. In an encoding task (browsing a web page for example), higher fixation frequency on a particular area can be indicative of greater interest in the target, such as a photograph in a news report, or it can be a sign that the target is complex in some way and is more difficult to encode (Just & Carpenter, 1976; Jacob & Karn, 2003).

However, these same interpretations are reversed in a search task - a higher number of single fixations, or clusters of fixations, is an index of greater uncertainty in recognising the target (Jacob & Karn, 2003).

The duration of a fixation is also linked to the processing time applied to the object being fixated (Just & Carpenter, 1976). It is widely accepted that "representations that require long fixations are not as meaningful to the user as those with shorter fixation durations" (Goldberg & Kotval, 1999).

Saccades

No encoding takes place during saccades, so they cannot tell us anything about the complexity or saliency of the target phrase. However, regressions can act as a measure of processing difficulty during encoding (Rayner & Pollatsek, 1989). Although most regressions are very small, only skipping back two or three letters, much larger phrase-length regressions can represent confusion in higher-level processing of the text (Rayner & Pollatsek, 1989). Regressions could equally be used as a measure of recognition value - there should be an inverse relationship between the number of regressions and the saliency of the phrase. However, software that could identify regressions was not available, so this type of saccade was left out of the analysis.

A further reason for not analysing saccade data in fine detail is that the eye tracker used in this study was not in fact optimised for measuring saccades - the method it uses to record eye movements defines it as a "fixation picker" (Karn, Goldberg, McConkie, Rojna, Salvucci, Senders, Vertegaal & Wooding, 2000). If the study is replicated using a different type of eye tracker optimised for measuring saccades (a "saccade picker"), very different data may be produced.

Other eye movements

Pupil dilation is a potentially interesting measurement of cognitive workload (Marshall, 2000; Steinhauer & Hakerem, 1992). A lower cognitive workload may be used as an indication of increased saliency due to lower processing demands (Marshall, 2000). Unfortunately the eye tracking system that was used did not have sufficient resolution or the necessary software to make accurate measurements of pupil dilation.

4. The Experiment

4.1) Participants

30 mainly post-graduate students took part in the experiment, with an average age of 30-35, and an age range of 15 - 65. 12 were female and 18 were male.

All participants had normal or corrected-to-normal vision and were regular users of the World Wide Web, with an average of 7 years' experience. All but one reported that Internet Explorer was their main web browser. Some had taken part in other eye tracking experiments but none were aware of the research hypotheses in the present study. All were paid 3 for the 30-minute duration of the experiment.

In addition to the 30 participants whose data were analysed, twelve others were excluded: six could not be calibrated with the eye tracker, two did not follow the instructions, and a further four were eliminated to ensure that the remaining sample of 30 contained only native English speakers.

Most of the participants reported that they had never seen the websites used in the test, although six participants stated that they were familiar with one or two of the websites, but didn't use them regularly.

4.2) Materials and design

A 2x3 within subjects design was used, the first factor being the bookmark structure (either top-down or bottom-up), the second factor being the number of information components, or 'cues', available in the bookmark (one, two or three cues) (Table 4).

Table 4
Experimental conditions in a 2x3 within subjects design (conditions are labelled a - f).
Bookmark structure | 1 cue | 2 cues | 3 cues
Top-down | a) Site name | c) Site name - Article title | e) Site name - Section name - Article title
Bottom-up | b) Article title | d) Article title - Site name | f) Article title - Section name - Site name
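
The six conditions in Table 4 can be expressed as a small Python sketch that assembles the bookmark text from its components. The function name build_bookmark, the CONDITIONS mapping and the " - " separator are illustrative choices (the separator follows the examples in Tables 2 and 3); the bookmarks used in the experiment were prepared as screenshots rather than generated by code.

```python
# Condition label -> (structure, number of cues), following Table 4.
CONDITIONS = {
    "a": ("top-down", 1),
    "b": ("bottom-up", 1),
    "c": ("top-down", 2),
    "d": ("bottom-up", 2),
    "e": ("top-down", 3),
    "f": ("bottom-up", 3),
}


def build_bookmark(site, section, title, structure, cues):
    """Assemble bookmark text for one structure and number of cues."""
    if structure not in ("top-down", "bottom-up"):
        raise ValueError("structure must be 'top-down' or 'bottom-up'")
    if cues not in (1, 2, 3):
        raise ValueError("cues must be 1, 2 or 3")

    # Start from the top-down ordering.
    if cues == 1:
        parts = [site]
    elif cues == 2:
        parts = [site, title]
    else:
        parts = [site, section, title]

    if structure == "bottom-up":
        # With a single cue the article title replaces the site name;
        # otherwise the top-down order is simply reversed.
        parts = [title] if cues == 1 else parts[::-1]

    return " - ".join(parts)


if __name__ == "__main__":
    for label, (structure, cues) in CONDITIONS.items():
        text = build_bookmark("BBC NEWS", "Middle East",
                              "Top Saddam official surrenders",
                              structure, cues)
        print(label, text)
```

Note that the section name only ever appears in the three-cue conditions, so the two-cue bookmarks pair the site name directly with the article title.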

The format of the experiment was straightforward - participants were asked to view a series of websites and, after each one, to find its corresponding bookmark in the menu that followed.

A set of 24 web pages containing articles on international news and current affairs was collected from news websites and saved as screenshots (Figure 10; see Appendix A & Appendix B for a full listing). The chosen web pages all had a clear site name, article title and section name, ensuring equal opportunity of encoding for later recognition.

As they were static screenshots, the body text of the article was often not fully in view, and participants were not able to 'scroll down' to read the rest of the news story. The original title bar text was deleted from each website screenshot to allow the full manipulation of the bookmark text (this is necessary as the title bar and bookmark text are the same, as mentioned in section 1.2).

Figure 10
One of the news websites used in the test: Note that the title tag text has been removed from the browser's top bar to prevent it from clashing with the manipulated bookmark text in the search task.

One of the news websites used in the test: the title tag text has been removed from the browser's top bar for the purposes of the test

For each website, a corresponding set of screenshots was created of Internet Explorer 6 with the 'favorites' menu displayed. The bookmark corresponding to the web page was located somewhere on the menu (Figure 11).

Lastly, a questionnaire was prepared to collect demographic data (Appendix D).

The experimental conditions were distributed so that websites 1-4 were followed by bookmarks of condition type 'a' (site name alone), websites 5-8 were followed by bookmarks of condition type 'b' (article title alone), websites 9-12 were followed by bookmarks of condition type 'c' (site name and article title), and so on (see first row, Table 5).

Figure 11
A bookmark menu screenshot used in the test.

A bookmark menu screenshot used in the test (29 bookmarks in total, extending the menu almost to the bottom of the screen)

The presentation order of the websites was counterbalanced so that all conditions occurred equally, and further randomised to eliminate fatigue and practice effects. Lastly, participants were randomly assigned to one of six groups (Table 5), and completed 24 trials, one for each website/bookmark menu combination.

Table 5
Counterbalancing of experimental conditions a - f.
Group | Websites 1-4 | Websites 5-8 | Websites 9-12 | Websites 13-16 | Websites 17-20 | Websites 21-24
1 | a | b | c | d | e | f
2 | b | a | d | c | f | e
3 | c | d | e | f | a | b
4 | d | c | f | e | b | a
5 | e | f | a | b | c | d
6 | f | e | b | a | d | c
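
The pattern in Table 5 can be reproduced with a short Python sketch. The rule used here (rotate the condition list by two places for every second group, and swap neighbouring conditions for even-numbered groups) is inferred from the table itself and is offered only as one way of generating the same layout; the function names are hypothetical.

```python
CONDITIONS = ["a", "b", "c", "d", "e", "f"]


def counterbalance(conditions=CONDITIONS):
    """Return one row of conditions per participant group, as in Table 5."""
    n = len(conditions)
    rows = []
    for group in range(n):
        # Groups 1-2 start at 'a', groups 3-4 at 'c', groups 5-6 at 'e'.
        shift = 2 * (group // 2)
        row = conditions[shift:] + conditions[:shift]
        if group % 2 == 1:
            # Even-numbered groups swap the two conditions in each pair.
            row = [row[i ^ 1] for i in range(n)]
        rows.append(row)
    return rows


def condition_for(group, website, block_size=4):
    """Condition seen by a group (1-6) for a given website (1-24)."""
    block = (website - 1) // block_size
    return counterbalance()[group - 1][block]


if __name__ == "__main__":
    for number, row in enumerate(counterbalance(), start=1):
        print(number, " ".join(row))
```

Each condition appears exactly once in every row and every column, so across the six groups every block of four websites is seen under all six conditions.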

The bookmark conditions for each trial were manipulated by taking six screenshots (one for each condition) of the bookmark menu in Internet Explorer. Across all six menu screenshots, the distractor bookmarks were identical and the target bookmark was in the same location, but its format differed according to the experimental condition (Table 4).

The order of the distractor bookmarks in each menu was changed haphazardly for each trial, and one or two bookmarks were substituted for new ones each time (see Appendix C for a typical set of distractor bookmarks). The target bookmarks appeared once in each position in the menu from number 4 to 27, with 29 bookmarks in the menu each time.

4.3) Apparatus

The website screenshots were presented on a 15 inch flat screen monitor, with a screen resolution of 1024 x 768 pixels.

Eye movements were recorded with an LC Technologies Eyegaze development system. The Eyegaze eye tracker consists of a standard desktop computer running Windows NT/2000, an infrared camera mounted beneath the monitor (Figure 12) and software to process the eye movement data ('Eyegaze development system').

An additional smaller monitor was used to ensure that the eye was in the centre of the camera's field of view. The Eyegaze system determines the eye's gaze direction by the pupil-center/corneal-reflection method. A small LED at the center of the camera lens directs infrared light into the eye, causing a reflection in the cornea and increasing the brightness of the pupil to make it more easily identifiable (Figure 13).

Figure 12
The eye tracker used in the present study: a desktop computer with an infrared camera mounted beneath the monitor.

The eye tracker used in the present study: a desktop computer with a small, unobtrusive infrared camera mounted beneath the flat-screen monitor

Figure 13
The 'bright pupil' effect in the eye, as shown on the infrared camera monitor.

Infrared light directed into the eye increases the brightness of the pupil, making it easier to track

Image processing software is then used to identify and locate the centres of the pupil and corneal reflections. Finally, the gaze point (the coordinates of where the person is looking on the monitor) is found by computing the angle between the corneal reflection and the centre of the pupil.

The eye tracker is accurate to within 0.45 degrees of visual angle, which at 51cm from the screen covers approximately 3.8mm. This corresponds to 12.8 pixels on the monitor used in the test, which had a dot pitch of 0.297mm. Eye movements were sampled 60 times a second, with tracking errors not exceeding 6.3mm.
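
The conversions between visual angle, distance on the screen and pixels can be checked with a few lines of Python. This is a sketch using standard trigonometry, not the manufacturer's own calculation, and the function names are hypothetical. The simple geometric formula gives roughly 4.0mm for 0.45 degrees at 51cm, slightly more than the 3.8mm quoted above; the difference may reflect rounding or a slightly different assumed viewing distance. The pixel figure follows directly from the quoted 3.8mm and the dot pitch.

```python
import math


def angle_to_mm(angle_deg, distance_mm):
    """Size on the screen subtended by a visual angle at a viewing distance."""
    return 2 * distance_mm * math.tan(math.radians(angle_deg) / 2)


def mm_to_pixels(size_mm, dot_pitch_mm):
    """Convert a distance on the screen to pixels using the dot pitch."""
    return size_mm / dot_pitch_mm


if __name__ == "__main__":
    # Geometric estimate for 0.45 degrees at 51cm (510mm).
    print(round(angle_to_mm(0.45, 510), 2), "mm")
    # The quoted accuracy of 3.8mm on a monitor with 0.297mm dot pitch.
    print(round(mm_to_pixels(3.8, 0.297), 1), "pixels")
    # The quoted maximum tracking error of 6.3mm.
    print(round(mm_to_pixels(6.3, 0.297), 1), "pixels")
```

For comparison, each bookmark on the menu is 20 pixels high, so an accuracy of around 13 pixels is of the same order as the height of a single menu item.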

Although the eye tracker can tolerate head motion of around 3cm in all directions, participants were asked to use a chin rest (Figure 14) to minimise loss of eye movement data. A small wad of tissue was placed in the chin rest to improve comfort, which is certainly necessary for sessions lasting longer than a few minutes.

Figure 14
A chinrest is essential in keeping head movements to a minimum in order to maintain tracking of eye movements.

The chinrest used in the test: A small, flexible, rubber cup mounted on a 30cm high pole with a stable metal base

Fixations were detected at 100ms or above, an appropriate cut-off point for tracking the movement of the eyes in reading tasks (Hyönä, Niemi & Underwood, 1989; Inhoff & Radach, 1998).
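
At a sampling rate of 60 times a second, a 100ms cut-off corresponds to six consecutive gaze samples. The Python sketch below shows one common way of picking fixations out of raw samples, a dispersion-based approach. It is offered as an illustration under stated assumptions, not as the algorithm used by the Eyegaze system, which is not described here: the 25 pixel dispersion limit is an arbitrary illustrative value, and the names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 60      # sampling rate quoted above
MIN_FIXATION_MS = 100    # cut-off quoted above
MAX_DISPERSION_PX = 25   # assumption: illustrative value only


def _dispersion(points):
    """Horizontal spread plus vertical spread of a group of gaze samples."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples):
    """Group (x, y) gaze samples into fixations.

    Returns a list of (centre_x, centre_y, duration_ms) tuples.
    """
    min_samples = math.ceil(MIN_FIXATION_MS * SAMPLE_RATE_HZ / 1000)
    fixations = []
    start = 0
    while start + min_samples <= len(samples):
        end = start + min_samples
        if _dispersion(samples[start:end]) <= MAX_DISPERSION_PX:
            # Grow the window while the samples stay close together.
            while (end < len(samples)
                   and _dispersion(samples[start:end + 1]) <= MAX_DISPERSION_PX):
                end += 1
            window = samples[start:end]
            cx = sum(p[0] for p in window) / len(window)
            cy = sum(p[1] for p in window) / len(window)
            duration_ms = len(window) * 1000 / SAMPLE_RATE_HZ
            fixations.append((cx, cy, duration_ms))
            start = end
        else:
            # No fixation starts here; move on by one sample.
            start += 1
    return fixations


if __name__ == "__main__":
    # Hypothetical data: 12 stable samples, a brief 3-sample stop, 6 stable samples.
    samples = [(100, 100)] * 12 + [(400, 300)] * 3 + [(600, 200)] * 6
    for fixation in detect_fixations(samples):
        print(fixation)
```

In the example the three-sample stop lasts only 50ms and is therefore not reported as a fixation, which is the practical effect of the 100ms cut-off.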

Finally, a monitoring console similar to those used in lab-based usability evaluations was used to observe the participants during the main part of the experiment (Figure 15).

Figure 15
Monitoring console equipped with three CCTV cameras, used to observe participants.

The keyboard, screen, and participant's face were monitored on CCTV to maintain experimental controls

4.4) Procedure

On arrival, participants were shown the monitoring console and told that it would be used by the experimenter to monitor the progress of the test while keeping a distance from the participant, so as not to distract them or make them feel self-conscious during the test. It was further explained that none of the video feedback would be recorded (none of the participants objected to this arrangement).

Next, the participants were shown the eye tracker and given a brief explanation of how it worked and why it was necessary to use the chin rest. Participants were then helped to get comfortable for the duration of the test by making appropriate adjustments to the chinrest and the monitor to accommodate individual variations in seated head position. At all times, approximately the same viewing angle between the face and the screen was maintained. Participants were seated on average 51cm from the screen.

Once the participants were comfortable in the chin rest, the camera was adjusted vertically and the participant was asked to move slightly to the left or right so that one of their eyes was in the centre of the camera's field of vision. Lastly, once the camera's focus and aperture were set, the participant was calibrated with the eye-tracker.

The calibration procedure lasts 15 seconds and consists of the participant following a series of 9 dots around the screen with their eyes, starting in various locations. Through this, the system can accurately plot the person's gaze point. Once this profile of the person's eye has been captured, there is no need for them to be calibrated again, even across different test sessions.

In the present study, six participants could not be calibrated due to low contrast between the eye and pupil, large pupils being partially obscured by the upper eyelids, eye reflections being distorted by super-compressed lenses, and partially obscured pupils caused by 'lazy eye'.

Next, custom software was launched which presented participants with on-screen instructions and the experiment itself. After reading the instructions (Figure 16), the participants completed 4 practice trials while the experimenter sat beside them to answer any queries.

Figure 16
The main instruction screen.

The main instruction screen for participants. Designed to be friendly, easy to understand and scannable

Extra care was taken to check that the participants understood what they had to do before proceeding to the main session. The experimenter went to the far side of the room behind the monitoring console and left the participant to complete the main session without distraction.

Each website appeared on screen for 18 seconds, with each bookmark screen appearing for a maximum of 30 seconds. Participants pressed the space bar on the keyboard to indicate that they had found the target bookmark. If they could not find the target within 30 seconds, the trial ended and the next trial began. Once the participant had reached the end of the main session (lasting around 15 minutes) they were given the questionnaire to complete (Appendix D).

Many significant experimental design issues were solved through a thorough multi-iteration piloting phase.


5. Data Processing

5.1) Essential tools for processing eye movement data

Recording eye movements generates huge amounts of data which have to go through several levels of processing before they can be understood and analysed (Jacob & Karn, 2003).

The raw data are quite dense (see Appendix F) and are best reviewed using a graphical gaze point viewer, which should be supplied with most eye trackers. This type of software can 'play back' eye movements, superimposed over the image that the person was originally viewing, as shown in Figure 17.

Figure 17
Eye movements can be 'played back' using custom software.

Saccades and fixations combine to form a continuous 'trace' of the participant's gaze. The trace can be 'played back' over the original viewed image

Here, fixations are identified by blue crosses at their centre with blue circles indicating the duration of the fixation. Saccades are represented by the red lines connecting the fixations. The complete superimposed eye movement data is called a 'trace'.

5.2) Error correction

Eye trackers can be notoriously sensitive and subject to error, so it is prudent to check the accuracy of all data. In the present study, a reliable pattern of error was detected in the x,y coordinates of the gaze point location.

Firstly, a 'drift' was found in the absolute gaze point of many of the participants. This was most likely due to errors in re-acquiring the image of the eye after it moved in and out of camera range. Absolute drift is easy to spot since all gaze points on the screen are shifted in the same direction by the same amount, with fixation patterns obviously not matching the objects on screen, as shown in Figure 18.

Absolute drift was corrected in the present study by 'dragging' the eye trace back so that the pattern of fixations and saccades matched the layout of the objects on screen, as shown in Figure 19.

Figure 18
An example of 'absolute drift' in the eye movement record: every gaze point has drifted in the same direction by the same amount.


Figure 19
The same eye movement data from Figure 17, after the 'absolute drift' has been corrected. Note how the pattern of fixations and saccades now matches the layout of the objects on screen.

Figure 17 again, after the 'absolute drift' has been corrected. The trace pattern now matches the layout of the objects on screen

Errors in relative gaze point location, however, are more difficult to correct. With relative warp, the eye trace is distorted so that some fixation clusters appear to be on target, while others 'fall short' of their apparent target or 'overshoot' it. Figure 20 shows an example of relative warp: here, we can see a cluster of fixations which matches the address bar but doesn't quite 'reach' it, despite the fact that the fixations on the article title are perfectly centred.

Figure 20
An example of 'relative warp' in the eye movement record: most gaze points match the objects on screen, while others miss their apparent target.


When attempting to correct for relative warp, not all the fixation clusters can be re-aligned with their apparent intended targets. In this study, the elements of interest were the name of the site, the title of the article and the section name, so priority was given to adjusting fixation clusters over these regions to maximise data accuracy where it mattered most. Each time the eye trace was corrected, the raw data file was re-written with the new gaze points, ready for the next stage of processing.

In principle the absolute offset should be consistent for each person, meaning that the offset can be corrected once and applied to the rest of the trials for that participant. However, the relative warp was fairly unpredictable, so the offsets were corrected manually, trial by trial, to achieve the best possible accuracy of raw data for the subsequent stages of analysis.

At all times, the eye trace was re-aligned with textual elements so that the first fixation in a phrase fell 3-4 characters to the right of the start of the first word, in accordance with the structure of the perceptual span. In total, 1440 trials were corrected by hand.
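The principle of a single, constant offset per participant can be sketched in code. The Python fragment below is only an illustration and not the procedure used in the present study, where traces were dragged into place by hand. The function names and the example coordinates are hypothetical, and the sketch handles absolute drift only; it makes no attempt to model relative warp.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) gaze position in screen pixels


def estimate_offset(fixations: List[Point], targets: List[Point]) -> Point:
    """Mean (dx, dy) needed to move each fixation onto its intended target."""
    if not fixations or len(fixations) != len(targets):
        raise ValueError("need equal, non-empty lists of fixations and targets")
    n = len(fixations)
    dx = sum(t[0] - f[0] for f, t in zip(fixations, targets)) / n
    dy = sum(t[1] - f[1] for f, t in zip(fixations, targets)) / n
    return (dx, dy)


def apply_offset(gaze_points: List[Point], offset: Point) -> List[Point]:
    """Shift every gaze point by the same amount (absolute drift only)."""
    dx, dy = offset
    return [(x + dx, y + dy) for x, y in gaze_points]


# Hypothetical example: every fixation lies 12 px to the right of
# and 30 px below the screen element it was apparently aimed at.
drifted = [(112.0, 230.0), (312.0, 230.0), (112.0, 430.0)]
intended = [(100.0, 200.0), (300.0, 200.0), (100.0, 400.0)]

offset = estimate_offset(drifted, intended)
corrected = apply_offset(drifted, offset)
```

If the drift really is uniform, the estimated offset is identical for every fixation and the corrected trace lands exactly on the intended targets. Where the residual errors differ from one cluster to the next, that is a sign of relative warp, which a single offset cannot remove.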

5.3) Filtering and analysis

Once the eye movements have been measured, logged and error corrected, the data have to be filtered to enable examination of participants' processing of specific regions of the screen, such as control elements, navigation links and images etc. In the present study the main areas of interest were the site name, section name and article title on both the website and bookmark screens.

The first step in defining the areas of interest was to load the stimuli screenshots into a graphics editing program, then draw a rectangle around the target element and log the screen coordinates of this rectangle in a data file. The coordinates in this data file were then used by parsing software to count eye movements occurring only in these areas. The x,y screen coordinates in pixels of the top left corner and bottom right corner of the rectangle were recorded, as well as its width and height. Examples of the areas of interest are shown in Figures 21 and 22 below (the yellow borders serve to highlight the location of the areas of interest, and did not appear on the original screenshots).

Figure 21
Areas of interest defined on a web page screenshot (highlighted here by yellow borders).

Areas of interest defined around the site name, section name and article title on a particular web page used in the test

Figure 22
Areas of interest defined on a bookmark screenshot (highlighted here by yellow borders).

Areas of interest defined around the site name, section name and article title on the corresponding bookmark text

Once these areas had been defined, parsing software was used to extract the corresponding eye movement data and format it so that it could be analysed with standard statistical software such as Excel or SPSS.
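The filtering step can be illustrated with a short Python sketch. The parsing software used in the study is not described in detail, so everything here is an assumption: the class and function names, the rule that a fixation belongs to the first rectangle containing it, and the pixel coordinates in the example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Fixation = Tuple[float, float, float]  # (x, y, duration in ms)


@dataclass(frozen=True)
class AreaOfInterest:
    name: str
    left: int    # x of top left corner, in pixels
    top: int     # y of top left corner, in pixels
    right: int   # x of bottom right corner, in pixels
    bottom: int  # y of bottom right corner, in pixels

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def summarise(fixations: List[Fixation],
              areas: List[AreaOfInterest]) -> Dict[str, Tuple[int, float]]:
    """Return {area name: (fixation count, mean fixation duration in ms)}."""
    totals = {area.name: [0, 0.0] for area in areas}
    for x, y, duration in fixations:
        for area in areas:
            if area.contains(x, y):
                totals[area.name][0] += 1
                totals[area.name][1] += duration
                break  # assumes the areas do not overlap
    return {name: (count, total / count if count else 0.0)
            for name, (count, total) in totals.items()}


# Hypothetical rectangles and fixations, for illustration only.
areas = [
    AreaOfInterest("site name", left=20, top=10, right=220, bottom=60),
    AreaOfInterest("article title", left=20, top=120, right=620, bottom=160),
]
fixations = [(50, 30, 240), (150, 40, 260), (300, 140, 220), (700, 500, 300)]
summary = summarise(fixations, areas)
```

In this example two fixations fall on the site name and one on the article title; the fourth fixation lies outside both rectangles and is ignored, which is the sense in which the data are 'filtered'.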


6. Results

6.1) Response times

Mean response times per condition were derived for each participant and reflected the time taken between the appearance of a bookmark menu and the participant registering that the target bookmark had been detected.

Faster response times were taken as indicating superior recognition (Table 6).

To retain as much data as possible, response times were scored even if participants failed to find the target bookmark. This does not invalidate the results as the response times were not a measure of failure or success in recognising the bookmarks, but rather a measure of relative differences in recognition. If a bookmark was not found, a maximum response time of 30 seconds was scored.
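The scoring rule described above can be sketched in Python as follows. This is a minimal illustration rather than the software used in the study: the function names and the trial records are hypothetical, and only the 30 second ceiling is taken from the text.

```python
from collections import defaultdict
from statistics import mean

MAX_RT_S = 30.0  # ceiling applied when the target bookmark was not found


def score_trial(found: bool, elapsed_s: float) -> float:
    """Response time for one trial, capped at the 30 second limit."""
    if not found:
        return MAX_RT_S
    return min(elapsed_s, MAX_RT_S)


def mean_rt_by_condition(trials):
    """Mean scored response time per (structure, number of cues) condition."""
    by_condition = defaultdict(list)
    for condition, found, elapsed_s in trials:
        by_condition[condition].append(score_trial(found, elapsed_s))
    return {condition: mean(times) for condition, times in by_condition.items()}


# Hypothetical trials for one participant: (condition, found?, elapsed seconds)
trials = [
    (("top-down", 1), True, 12.0),
    (("top-down", 1), False, 30.0),   # timed out, scored as 30 s
    (("bottom-up", 2), True, 8.0),
    (("bottom-up", 2), True, 10.0),
]
means = mean_rt_by_condition(trials)
```

Because a missed target contributes the ceiling value instead of being discarded, every participant contributes a mean to every condition, which a repeated measures analysis requires.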

Table 6
Mean times taken to locate target bookmark (seconds)

Bookmark structure   Number of cues
                     1               2               3               Mean
Site name            13.698 (6.37)   9.901 (5.06)    10.578 (4.29)   11.392
Article title        12.801 (6.05)   11.103 (4.99)   11.487 (5.19)   11.797
Mean                 13.250          10.502          11.033

Note. Values enclosed in parentheses represent standard deviation.

Correlations

No significant correlations were found between response time on any condition and the years using the Web or years using computers, as reported by participants in the questionnaire. Also, no significant correlations were detected between the size of the site name logo and the response times for the three conditions of bookmarks with a top-down structure.

ANOVA

A two-way repeated measures ANOVA was used to analyse these data and revealed that there was no main effect of Bookmark structure (top-down vs. bottom-up) (F(1, 29) = 0.155, p = 0.697) but there was a main effect of number of cues (one, two or three) (F(2, 58) = 8.443, p = 0.001).

Employing the Bonferroni post-hoc test, significant differences were found between the one cue and two cue conditions (p = 0.004) and the one and three cue conditions (p = 0.011). However, no significant differences were found between the two and three cue conditions.

6.2) Adjusting eye movement data for phrase length

As the eye movement data were analysed per area of interest, a raw count of fixations would give misleading results, as it does not take into account the length of the text phrases contained within the areas. To adjust for this, the mean number of fixations on each element was divided by the mean number of words in the phrase (2.79 for site name, 6.83 for article title and 1.79 for section name). In this way, we are able to separate higher fixation frequency due to the simple fact that there were more words to read from higher fixation frequency because an item is actually harder to recognise.

Mean fixation duration is not contingent on the number of words in the phrase, so this measure was not adjusted. It remains a measure of mean fixation duration on the whole area of interest. Finally, eye movements were analysed for 24 of the 30 participants.
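The phrase-length adjustment amounts to a single division, sketched below in Python. The mean word counts are those reported above; the raw fixation counts are hypothetical values chosen purely to show the effect, and the function name is an assumption.

```python
# Mean number of words per phrase, as reported in the text.
MEAN_WORDS = {
    "site name": 2.79,
    "article title": 6.83,
    "section name": 1.79,
}


def adjust_for_phrase_length(mean_fixations: float, element: str) -> float:
    """Mean fixations per word for the given area of interest."""
    return mean_fixations / MEAN_WORDS[element]


# Hypothetical raw means: the article title attracts far more fixations
# simply because it contains more words.
raw = {"site name": 5.58, "article title": 13.66}
adjusted = {element: adjust_for_phrase_length(count, element)
            for element, count in raw.items()}
```

With these illustrative figures the raw counts differ by a factor of more than two, yet both elements work out at two fixations per word, so the apparent difference is entirely an artefact of phrase length.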

6.3) Eye movements during the encoding task

A one-way repeated measures ANOVA was used to analyse mean number of fixations per element (Table 7) and revealed that there was a main effect of the type of element being viewed (adjusted for phrase length) (F(2, 46) = 68.962, p < 0.001).

Table 7
Mean number of fixations per element (adjusted) while browsing the websites.

Area of interest   Mean number of fixations (adj.)   Standard deviation
Site name          2.41                              0.88
Article title      2.09                              0.63
Section name       1.08                              0.39

Bonferroni post-hoc tests revealed that the element most frequently fixated was the name of the site. It was fixated on average 2.41 times, slightly but significantly (p = 0.12) more often than the 2.09 fixations that fell on the title of the article, and more than double the 1.08 fixations that fell on the section name (p < 0.001). The title of the article was also fixated almost twice as frequently as the section name (p < 0.001).

A one-way repeated measures ANOVA was also used to analyse mean fixation duration (Table 8), and a main effect was found according to type of element being viewed (F(2, 46) = 8.948, p = 0.001).

Table 8
Mean fixation duration per element while browsing the websites.

Area of interest   Mean fixation duration (ms)   Standard deviation (s)
Site name          241                           .024
Article title      225                           .020
Section name       227                           .022

Bonferroni post-hoc tests revealed that the mean fixation duration on the name of the site was slightly longer at 241ms than on the title of the article at 225ms (p = 0.001) and longer than on the name of the section at 227ms (p = 0.021). The mean fixation durations on the article title and the section name were not significantly different.

6.4) Eye movements during the visual search task

In the search task, participants consistently scanned down the left-hand side of the bookmark menu, as has been found in similar studies of menu search (Aaltonen, Hyrskykari & Räihä, 1998). Fixations were largely concentrated in the second eighth of the bookmark menu, which corresponds to the first four letters of the first word of each entry (Table 9a). Saccades were also concentrated towards the left of the menu (Table 9b).

Table 9a
Fixations on the bookmark menu per area of interest.

Area of interest   No. of fixations   Total fixation time (s)   Mean fixation time (ms)
1st                1530               492.4                     322
2nd                13368              4180.0                    313
3rd                5191               1199.1                    231
4th                2906               648.7                     223
5th                1820               404.1                     222
6th                1136               248.5                     219
7th                611                134.7                     221
8th                117                23.6                      202

Table 9b
Saccades occurring in the bookmark menu per area of interest.

Area of interest   No. of saccades   Total saccade time (s)   Mean saccade time (ms)
1st                297               7.8                      26
2nd                8823              147.2                    17
3rd                7040              155.8                    22
4th                4089              101.5                    25
5th                2638              74.7                     28
6th                1702              50.5                     30
7th                1021              29.8                     29
8th                0                 0                        0

This result provides confirmation that the lead cue in the bookmark does in fact lie in a dominant position. For the purpose of analysis in the present study, we measure eye movements only on the lead cue, and assume that it is a fair proxy for the bookmark structure as a whole.
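The degree of concentration can be checked directly from the figures in Table 9a. The short Python sketch below computes each area's share of all fixations and recomputes the mean fixation time from the totals. It assumes that total fixation time is expressed in seconds, which is what the reported means imply; the variable names are illustrative only.

```python
# (number of fixations, total fixation time in s, reported mean in ms)
# for the 1st to 8th areas of interest, copied from Table 9a.
TABLE_9A = [
    (1530, 492.4, 322),
    (13368, 4180.0, 313),
    (5191, 1199.1, 231),
    (2906, 648.7, 223),
    (1820, 404.1, 222),
    (1136, 248.5, 219),
    (611, 134.7, 221),
    (117, 23.6, 202),
]

total_fixations = sum(count for count, _, _ in TABLE_9A)

# Share of all fixations falling in each area of interest.
shares = [count / total_fixations for count, _, _ in TABLE_9A]

# Mean fixation time in ms, recomputed as total time divided by count.
recomputed_ms = [1000 * total_s / count for count, total_s, _ in TABLE_9A]

# Largest disagreement between the recomputed and reported means.
max_difference_ms = max(abs(recomputed - reported)
                        for recomputed, (_, _, reported)
                        in zip(recomputed_ms, TABLE_9A))
```

On these figures the second area of interest alone accounts for about half of all fixations on the menu, and the recomputed means agree with the reported ones to within a millisecond, the small discrepancies being attributable to rounding of the totals.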

Number of fixations

A two-way repeated measures ANOVA was used to analyse the mean number of fixations on the leading cues in the bookmark (Table 10). In the present study, a higher number of fixations is taken as an index of greater uncertainty in recognising the target. The analysis revealed that there was a main effect of the bookmark structure (top-down vs. bottom-up) (F(1, 23) = 73.962, p < 0.001).

Table 10
Mean number of fixations (adjusted) on the leading cues of the bookmark.

Bookmark structure   Number of cues
                     1            2            3            Mean
Top-down             1.34 (.44)   0.87 (.35)   1.00 (.46)   1.07
Bottom-up            0.75 (.22)   0.61 (.19)   0.67 (.27)   0.67
Mean                 1.05         0.74         0.84

Note. Values enclosed in parentheses represent standard deviation.

There was also a main effect of the number of cues on the number of fixations (adjusted for phrase length) on the leading cues (F(2, 46) = 12.259, p < 0.001).

Lastly, there was a significant interaction (F(2, 46) = 4.620, p = 0.015) between the bookmark structure and the number of cues. The number of cues affected fixations differently depending on whether site name or article title was the leading cue.

Post-hoc tests were used to explore the interaction effect further. A stringent alpha level of p < .005 was set to accommodate the fact that multiple comparisons were being made.

When the name of the site was presented alone, it received 1.34 fixations, much higher than the 0.75 fixations that fell on article title when it was presented alone (F(1, 23) = 44.362, p < 0.001). Adding another cue after site name reduced the number of fixations to 0.87 (F(1, 23) = 20.828, p < 0.001), a much larger reduction than adding a cue to article title, which reduced slightly to 0.61 fixations (F(1, 23) = 20.828, p < 0.001). Increasing the number of cues to 3 actually increased the number of fixations on both leading cues, though not significantly.

As indicated previously, bookmarks with a top-down structure received a significantly higher number of fixations at all levels than bookmarks with a bottom-up structure.

Mean fixation duration

In the present study, information which requires longer fixations is taken to be less meaningful to the person than information with shorter fixations. A two-way repeated measures ANOVA was used to analyse the mean fixation duration on the leading cues in the bookmark (Table 11). A main effect of the bookmark structure (top-down vs. bottom-up) was revealed (F(1, 23) = 10.437, p = 0.004), as well as a main effect of the number of cues (F(2, 46) = 5.742, p = 0.006).

Table 11
Mean fixation duration (ms) on the leading cues of the bookmark.

Bookmark structure   Number of cues
                     1          2          3          Mean
Top-down             335 (74)   272 (75)   292 (50)   300
Bottom-up            274 (30)   277 (34)   266 (54)   272
Mean                 305        275        279

Note. Values enclosed in parentheses represent standard deviation.

Lastly, there was a significant interaction between the bookmark structure and the number of cues (F(2, 46) = 5.948, p = 0.005). The number of cues affected fixation duration differently depending on whether site name or article title was the leading cue.

Post-hoc tests were used to explore the interaction effect further. As before, a stringent alpha level of p < .005 was set to accommodate the fact that multiple comparisons were being made.

When the name of the site was presented alone, it received an average fixation duration of 335ms, 61ms higher than the 274ms that were spent fixating the article title when it was presented alone (F(1, 23) = 44.362, p < 0.001). However, adding an extra cue brought fixation duration on site name down by 63ms to around the same level as for article title, but adding the second and final extra cue increased the gap again, so that fixation duration on site name was 20ms longer than on the article title alone, although not significantly (F(1, 23) = 8.150, p < 0.01). While fixation duration on the site name continued to be affected by the addition of extra cues, fixation duration on article title was not significantly affected at all.

6.5) Questionnaire results

Basic demographic information was collected through a questionnaire (Appendix D). Responses are described in section 4.1.


7. Discussion

7.1) Was there a difference in saliency?

Response times

Faster response times were taken as indicating superior recognition when participants were searching for a target bookmark within a set of distractor bookmarks. There was no significant difference in the time it took to find either type of bookmark (top-down or bottom-up), meaning that in a broad sense, they were both equally salient.

However, the number of cues on display was a significant factor in recognition. Two cues were found to be the optimal length for a bookmark, while one cue was clearly inadequate. Adding a third cue did not bring any significant benefit in terms of recognition. This is most likely due to the 65 character limit on the bookmark menu: only some or none of the third cue may actually have been visible, negating its usefulness.

This said, there are subtle hints in the response times that top-down bookmarks were much more sensitive to the existence of extra cues than were bottom-up bookmarks. The top-down bookmark with one cue had the slowest response time of all the conditions, but this dropped sharply to the fastest response time overall when a second cue was added (Table 6). This does indicate that the site name may be relatively less salient than the article title: the site name appears to 'need' extra information to spark the same level of recognition that the title of the article can attract on its own. Fortunately, the eye movement data permit us to explore this subtle effect in more detail.

Number of fixations

The assumption in the present study was that higher fixation frequency in a visual search task indicates uncertainty in recognising targets (Jacob & Karn, 2003). This certainly appears to be the case for bookmarks with top-down structures, which were fixated more frequently than bottom-up bookmarks, regardless of the number of cues. This clearly shows that bottom-up is more salient than top-down as a bookmark structure.

As was hinted at by the recognition times, the site name was far more sensitive to the existence of extra cues than was the article title, and as before, these extra cues had a diminishing marginal benefit.

Fixation duration

Longer fixation times on particular cues indicate that they are less meaningful (Goldberg & Kotval, 1999). The pattern of fixation durations on the leading cues strongly suggests, as with fixation frequency, that the name of the site is less salient than the title of the article, and by extension, that top-down is less salient than bottom-up as a bookmark structure.

When viewed in isolation, the name of the site was less salient in absolute terms, as it was fixated for far longer than was the article title.

Fixation durations on the title of the article were unaffected by extra information, indicating that it was 'salient enough' with or without extra information. The addition of a second cue to site name reduced the fixation time by quite a large margin, indicating that the site name was much more meaningful when processed in the context of the extra information.

Encoding

When we consider the encoding phase of the test, in which participants read through the website in order to identify its bookmark later, we see that site name actually received slightly greater attention than the article title. It was fixated more frequently and for longer on average, serving as further evidence that site name is less salient. Despite being subject to potentially more encoding, the name of the site was still not as salient as the article title.

7.2) What factors lead to greater saliency and recognition?

Semantic value

Information that can be characterised in terms of existing knowledge structures, or 'schemata', is easier to remember (Alba & Hasher, 1983; Bartlett, 1932). Article titles tell stories of international events, which should have greater potential for being remembered in terms of what people already know about the world (e.g. a terrorist attack may always be linked to certain countries in a schema) than an abstract site name. By extension, this should be true of most well-formed page titles that refer to rich content.

Meaning is essential if we are to remember something effectively (Rumelhart & Norman, 1985). The article title tells a story, a scenario: it has a strong intrinsic meaning. Site names, however, at least for news websites, probably have a lower capacity for rich meaning as they are simple yet abstract names unconnected to the news stories they provide.

Imagery evoking potential

Imaginable and concrete items can be easier to remember as they are represented more richly in memory (Paivio, Yuille & Madigan, 1968). Article titles tend to have more imaginable, concrete words than the site name, which can often be rather abstract, so an advantage in recognition value may arise from this difference. Highly imaginable words are encoded in both verbal and visual channels, and can be recognised more effectively due to the inferential connections that are established between them. Words that are less imaginable may only be encoded verbally, and so are unable to take advantage of this inferential power in recognition (Clark & Paivio, 1987).

At a broader level, the whole article title tells a story, so for this simple reason it may evoke more imagery than the name of a mundane news website, committing it to memory more effectively, as in a mnemonic process (Clark & Paivio, 1987).

7.3) A critique of the experimental method

Errors

The 'absolute drift' and 'relative warp' that existed in the eye movement data were a great concern and do raise the question of whether the results are valid. However, exhaustive measures were taken to ensure the accuracy of the data. The error factors were well known and every piece of data was checked by hand, potentially ensuring a greater level of accuracy than would have been achievable with a fully automatic analysis.

One solution to reduce the error rate would have been to recalibrate the participants at several times during the test, as some researchers advise (Stampe, 1993). This is indeed an excellent way to ensure data accuracy but it is likely to introduce more artificiality into the test situation. Participants would have been made even more aware of their eye movements, potentially exaggerating trends in the data.

Word-level effects

Readers fixate longer on low frequency words than on high frequency words (Hyönä & Olson, 1995; Inhoff, 1984; Rayner & Pollatsek, 1989), and this may have accounted for some of the longer fixation duration on the site name than the article title.

As the websites in the test were all authentic examples, it is of course true that the bookmark elements were not balanced for word frequency (nor for syntactic difficulty or word length). An informal analysis of the wording used in the target bookmarks (Appendix A) showed no particular extremes in vocabulary.

Furthermore, since the average length of the article