Master's Thesis - Issues of Saliency and Recognition in the Search for Web Page Bookmarks
3. Eye Movements
3.1) Eye movement basics
When reading, it feels as though our eyes sweep smoothly along the page, but this is an illusion. Our eyes actually move along a line of text in a series of quick jerks called 'saccades', each typically lasting 20 to 35 milliseconds (ms). After each saccade, the eyes stay relatively still while taking in, or 'encoding', information. These moments are called 'fixations', and they last for 218ms on average, with a range of 66 to 416ms. Sometimes the eyes move back towards text that has already been read - these backward saccades are known as 'regressions' (Rayner & Pollatsek, 1989). Scan paths are recurring patterns of saccades and fixations - in this study, it is assumed that people adopt a scan path which runs down the far left of the bookmark menu (Altonen, Hyrskykari & Räihä, 1998).
The information available during a fixation is defined by the total perceptual span: the region in which both letters and the spaces between words can be recognised (Figure 9). Due to the anatomy of the eye, visual acuity drops off towards the edges of the perceptual span, providing only lower-grade information at the extremes (Rayner & Pollatsek, 1989; Lansdale & Ormerod, 1994).
Information on word length can be picked up as far as 12-15 characters to the right of the fixation centre and 3-4 characters to the left, although specific letter information can only be detected up to 10 characters to the right of centre. The area in which reliable and accurate word identification takes place is narrower still: 7-8 characters to the right and 3-4 characters to the left of the centre of the fixation. This is known as the 'word identification span' (Ojanpää, Näsänen & Kojo, 2002).
Figure 9
Perceptual span with decreasing acuity at the extremes.
The form of the perceptual span is not set at birth, but learnt according to which language system the person grows up with. For readers of languages that are written from right to left such as Arabic and Hebrew, the perceptual span is reversed, with 3-4 characters to the right and 12-15 characters to the left of the fixation centre (Rayner & Pollatsek, 1989).
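The asymmetry of the span can be illustrated with a minimal Python sketch. It uses the upper bounds quoted above (4 characters behind the fixation centre, and 15 or 8 characters ahead of it) and mirrors them for right-to-left scripts. The names `span_slice`, `WORD_LENGTH_SPAN` and `WORD_IDENTIFICATION_SPAN` are hypothetical, the string is treated as a single line laid out visually from left to right, and mirroring the word identification span for right-to-left scripts is an assumption, since the sources cited here report the reversal only for the total perceptual span.

```python
# Upper bounds from the figures quoted in the text, in characters:
# (behind the fixation centre, ahead of it in the reading direction).
WORD_LENGTH_SPAN = (4, 15)
WORD_IDENTIFICATION_SPAN = (4, 8)


def span_slice(text, fixation_index, span, right_to_left=False):
    """Return the characters of `text` that fall inside `span`.

    `text` is treated as one visual line, indexed from its left edge.
    `fixation_index` is the index of the character at the fixation centre.
    """
    behind, ahead = span
    if right_to_left:
        # Mirror the span: the larger side now extends to the left.
        left, right = ahead, behind
    else:
        left, right = behind, ahead
    start = max(0, fixation_index - left)
    end = min(len(text), fixation_index + right + 1)
    return text[start:end]


if __name__ == "__main__":
    line = "abcdefghijklmnopqrstuvwxyz"
    print(span_slice(line, 10, WORD_IDENTIFICATION_SPAN))
    print(span_slice(line, 10, WORD_LENGTH_SPAN))
    print(span_slice(line, 10, WORD_LENGTH_SPAN, right_to_left=True))
```

Clipping at the ends of the string reflects the fact that a fixation near the start or end of a line cannot take in characters that are not there.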
3.2) Interpreting eye movements
Eye movements are in many ways a purer measure of recognition (and therefore saliency) than simple response time. Measuring the eye movements on one particular bookmark eliminates the time spent searching for it, as well as the time spent selecting it (Zelinsky & Sheinberg, 1995).
Although some researchers argue that fixations and saccades are programmed automatically regardless of the words being processed (O'Regan, 1992), the assumption in this study is that these eye movements, especially fixation durations, are an index of the amount of cognitive processing being applied to the item being fixated (Goldberg & Kotval, 1999; Just & Carpenter, 1976).
Fixations
Fixation frequency and duration are the main measures of cognitive activity in the present study. Fixations can be interpreted quite differently depending on the situation. In an encoding task (browsing a web page, for example), a higher fixation frequency on a particular area can indicate greater interest in the target, such as a photograph in a news report, or it can be a sign that the target is complex in some way and more difficult to encode (Just & Carpenter, 1976; Jacob & Karn, 2003).
However, these interpretations are reversed in a search task, where a higher number of single fixations or clusters of fixations is an index of greater uncertainty in recognising the target (Jacob & Karn, 2003).
The duration of a fixation is also linked to the processing time applied to the object being fixated (Just & Carpenter, 1976). It is widely accepted that "representations that require long fixations are not as meaningful to the user as those with shorter fixation durations" (Goldberg & Kotval, 1999).
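As an illustration of how these two measures might be derived, the following Python sketch summarises fixation frequency and duration for each fixated item. It is a sketch under stated assumptions rather than the analysis software used in this study: the function name `fixation_metrics`, the input format (one `(label, duration_ms)` pair per fixation) and the bookmark labels in the example are all hypothetical.

```python
from collections import defaultdict


def fixation_metrics(fixations):
    """Summarise fixation frequency and duration for each fixated item.

    `fixations` is an iterable of (label, duration_ms) pairs, one per
    fixation, where `label` names the item (e.g. a bookmark) fixated.
    """
    totals = defaultdict(lambda: [0, 0.0])  # label -> [count, total duration]
    for label, duration_ms in fixations:
        totals[label][0] += 1
        totals[label][1] += duration_ms
    return {
        label: {"count": count, "total_ms": total, "mean_ms": total / count}
        for label, (count, total) in totals.items()
    }


if __name__ == "__main__":
    # Invented example data, not results from the study.
    example = [("News", 200), ("Mail", 300), ("News", 240)]
    for label, metrics in fixation_metrics(example).items():
        print(label, metrics)
```

In a search task, under the interpretation adopted here, an item with a higher count or a longer mean duration would be read as harder to recognise.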
Saccades
No encoding takes place during saccades, so they cannot tell us anything about the complexity or saliency of the target phrase. However, regressions can act as a measure of processing difficulty during encoding (Rayner & Pollatsek, 1989). Although most regressions are very small, skipping back only two or three letters, much larger phrase-length regressions can represent confusion in higher-level processing of the text (Rayner & Pollatsek, 1989). Regressions could equally be used as a measure of recognition value: there should be an inverse relationship between the number of regressions and the saliency of the phrase. However, software that could identify regressions was not available, so this type of saccade was left out of the analysis.
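In principle, regressions can be identified from the order of fixation positions alone. The Python sketch below is a hypothetical illustration of that idea and was not part of the analysis reported here. It assumes that each fixation has already been mapped to a character position within a single line of text, so it does not handle the return sweep to a new line; the function name `find_regressions` is an assumption.

```python
def find_regressions(positions, right_to_left=False):
    """Return the size, in characters, of every backward saccade.

    `positions` holds the character position of each fixation on a single
    line, in the order the fixations occurred.
    """
    sign = -1 if right_to_left else 1
    sizes = []
    for previous, current in zip(positions, positions[1:]):
        step = (current - previous) * sign
        if step < 0:
            # The eye moved against the reading direction.
            sizes.append(-step)
    return sizes


if __name__ == "__main__":
    # Invented fixation positions: one small and one phrase-length regression.
    print(find_regressions([0, 8, 15, 13, 20, 5, 12]))
```

The returned sizes would allow small regressions of two or three letters to be separated from the larger phrase-length ones, although any cut-off between the two would have to be chosen by the analyst.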
A further reason for not analysing saccade data in fine detail is that the eye tracker used in this study was not optimised for measuring saccades - the method it uses to record eye movements defines it as a "fixation picker" (Karn, Goldberg, McConkie, Rojna, Salvucci, Senders, Vertegaal & Wooding, 2000). If the study were replicated using an eye tracker optimised for measuring saccades (a "saccade picker"), very different data might be produced.
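The method used by the eye tracker in this study is not described here, so the following Python sketch should be read only as an example of what a fixation-picking approach can look like. It uses a simple dispersion-threshold rule: consecutive gaze samples are grouped into a fixation for as long as they stay within a small spatial window, and everything in between is discarded. The function names, the `(x, y)` sample format and the threshold values in the example are all assumptions.

```python
import math


def dispersion(window):
    """Spread of a group of (x, y) gaze samples: x range plus y range."""
    xs = [point[0] for point in window]
    ys = [point[1] for point in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def pick_fixations(samples, sample_ms, max_dispersion, min_duration_ms):
    """Group gaze samples into fixations using a dispersion threshold.

    `samples` is a list of (x, y) gaze points recorded every `sample_ms`
    milliseconds. Returns (start_index, end_index, duration_ms) tuples,
    where `end_index` is exclusive.
    """
    min_len = max(1, math.ceil(min_duration_ms / sample_ms))
    fixations = []
    i = 0
    while i + min_len <= len(samples):
        j = i + min_len
        if dispersion(samples[i:j]) <= max_dispersion:
            # Grow the window while the samples stay tightly clustered.
            while j < len(samples) and dispersion(samples[i:j + 1]) <= max_dispersion:
                j += 1
            fixations.append((i, j, (j - i) * sample_ms))
            i = j
        else:
            # No fixation starts here; slide the window on by one sample.
            i += 1
    return fixations


if __name__ == "__main__":
    # Invented gaze samples: two clusters separated by a rapid movement.
    gaze = [(0, 0), (1, 0), (0, 1), (1, 1), (50, 50),
            (100, 100), (100, 101), (101, 100), (100, 100)]
    print(pick_fixations(gaze, sample_ms=20, max_dispersion=3, min_duration_ms=60))
```

Because samples recorded during the movement between clusters are dropped rather than measured, an approach of this kind reports fixations well but says little about the saccades themselves.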
Other eye movements
Pupil dilation is a potentially interesting measurement of cognitive workload (Marshall, 2000; Steinhauer & Hakerem, 1992). A lower cognitive workload may be taken as an indication of increased saliency, since a more salient item places lower demands on processing (Marshall, 2000). Unfortunately, the eye tracking system that was used did not have sufficient resolution or the necessary software to make accurate measurements of pupil dilation.
