On the clarity of the music stave

Arnold J Wilkins and Isobel Kiff

Department of Psychology

University of Essex


The staves of 63 scores from 50 music publishers were measured. The staves had similar height but with lines that varied in thickness by a factor of 4, from 0.1 to 0.4mm. Evidence from visual psychophysics suggests that when the stave has thick lines, perceptual distortions are likely to affect the clarity of the score adversely. Music students were asked to sight read scores comprising random notes (12 tone) or random notes in the key of G. The scores had staves with lines that were 0.1 or 0.4mm thick (the extremes of current typographic practice). Twice as many errors were made when the staves had thick lines, even though these scores were read more slowly. Scores in the key of G were read more accurately than the 12 tone scores, but scores in the key of G with thick lines were read with as many errors as 12 tone scores with thin lines. There was a tendency for individuals with high pattern glare scores to read the scores with thick lines relatively slowly. The findings suggest that perceptual distortions can impair sight reading of music manuscript because of the pattern from the lines of the stave; using thinner lines increases sight reading accuracy and speed.

Keywords: visual stress, illusions, stave, clarity, typography, sight reading

Within the Western tradition of music notation the stave began as a single line and then had companion lines added to it. These defined the pitch of the notes, and eventually became standardised for plainsong as a stave of four lines, sometimes with different colours. Standardisation of the five-line stave did not occur until the seventeenth century. Until then, different composers contemporary with one another used a wide variety of staves. In the organ music of Frescobaldi (1583-1643), for example, the stave for the right hand had six lines with eight lines for the left; Sweelinck (1562-1621) used two staves of six lines and Scheidt (1587-1654) used four staves of five lines each. By the middle of the seventeenth century the five line stave had become nearly universal except for plainsong, which to this day continues to use a four line stave; for review, see Apel (1942).

In this paper we show that although the five line stave is nearly universal it is not necessarily optimal for sight reading because of the perceptual distortions that the pattern of lines evokes.

In 1832 Sir David Brewster observed:

If the eye looks at (parallel) black lines drawn upon white paper steadily and continuously, the black lines soon lose their straightness and parallelism, and inclose luminous spaces somewhat like the links of a number of parallel chains. When this change takes place, the eye which sees it experiences a good deal of uneasiness... When this dazzling effect takes place the luminous spaces between the lines become coloured, some with yellow and others with green and blue light (Brewster, 1832, p. 170).

There are large differences between people as regards their susceptibility to the above perceptual distortions, which have been termed visual stress (Wilkins, 1995). Anecdotally, some accomplished musicians known to the authors are unable to sight read because of the distortions they see when trying to read music. The following is a written description of the distortions by a chorister:

The bars either side of the one I am looking at appear a mess – the notes oscillate slowly up and down and I cannot say how many notes there are. I cannot see the lines, so I’m not aware of them moving. The lines become a single haze. (Personal communication, 26 October 2013)

It has been proposed that the distortions arise in the visual cortex of the brain (Wilkins et al., 1984). Individuals with migraine are particularly susceptible, perhaps because migraine is associated with a cortical hyperexcitability (Wilkins, 1995; Huang et al., 2011). Only a few individuals will be aware of distortions in a page of music, but if the lines are thicker, and there are more of them, most (but not all) individuals will experience the distortions that Brewster described (Wilkins, 1995). The reader may not be aware of instability in Figure 1A with few thin lines or even 1C with few thick lines, but most observers find Figure 1D with many thick lines uncomfortable and unstable to view (Wilkins et al., 1984).

Insert Figure 1 about here

The distortions interfere with perception quite generally, so that it becomes difficult to see faint targets superimposed upon a pattern of lines. Chronicle and Wilkins (1996) measured the interference. They optically superimposed faint letters or shapes on patterns of stripes and measured the contrast at which these targets could just be seen, i.e. the threshold contrast, comparing the threshold with different patterns of stripes. They showed that it was more difficult to discern the targets when the stripes had characteristics such that they induced perceptual distortions.

Pattern parameters

In the experiments by Wilkins et al. (1984) and Chronicle and Wilkins (1996) the perceptual distortions and interference with vision were both affected by three important parameters of the pattern: (1) its spatial frequency, that is, the number of pairs of black-white stripes (i.e. cycles of the pattern) occupying one degree subtended at the eye, see Figure 1E; (2) its duty cycle, i.e. the separation of the lines expressed as a proportion of one cycle of the pattern, see Figure 1F; (3) its size, the angle the radius of the pattern subtended at the eye, see Figure 1G.

Spatial frequency. The spatial frequency at which the illusions and interference were maximal was about 4 cycles per degree, that is, when four pairs of black and white stripes of the pattern subtended one degree at the eye, or alternatively, when each pair of black and white stripes subtended 15 minutes of arc at the eye, see Figure 2A and 2B.

Insert Figure 2 about here

Duty cycle. The duty cycle at which illusions and interference were maximal was 50% (when the black lines and the white spaces between them had equal width). The interference decreased progressively as the duty cycle departed from this value, i.e. as the black lines became thinner or thicker than the white spaces, see Figure 2C and 2D.

Pattern size. The size of the pattern was critical: the larger the pattern, the greater the illusions and interference, but the increase was determined by the area of the visual cortex to which the pattern projected. In Figures 2E and 2F, the horizontal axis shows the proportion of the visual cortex stimulated, and beneath it the size of the pattern expressed as the number of degrees the radius of the pattern subtended at the eye of the observer.

The question arises as to whether music scores provide a pattern with parameters appropriate for the induction of perceptual distortions. We therefore surveyed typographic practice in contemporary music publishing to determine the parameters of pattern size, spatial frequency and duty cycle that are typical for the lines of the stave. We surveyed popular songs, but have since found that the measurements apply quite generally.

Survey of contemporary publishing practice

Sheet keyboard music for popular songs published between 1957 and 2008 was sampled in 63 publications: 50 different publishers from the USA and Europe were represented in the sample, and are listed in the Supplementary Materials. A measuring magnifier was used to obtain measurements to the nearest 0.05mm.

Two measurements were taken from each sample: the height of the staves and the thickness of the lines of the staves. The average height of the staves was 6.9mm (SD 0.4mm). Figure 3A shows a histogram of the heights of the staves: as can be seen, they ranged in height from 5.6 to 7.6mm and the most common height was 7.0mm. More than 75% of the sample had staves measuring between 6.6 and 7.2mm in height (a range of only about 10%). There was therefore considerable consistency from one publisher to another with respect to the height of the staves in published music scores.

Insert Figure 3 about here

By way of contrast, there was little consistency among publishers with respect to the thickness of the lines of the staves. The average thickness of the lines was 0.27mm (SD 0.09mm). The thickness varied from a minimum of 0.10mm to a maximum of 0.45mm, a more than fourfold range, see Figure 3B. Any tendency for thick lines to be more widely spaced was very weak: the Pearson correlation between thickness and separation was 0.04 and not significant.

The survey of music scores covered the years between 1957 and 2008. Despite the changes in printing technology over this period, there was little tendency for the thickness of the stave lines to have changed. The Pearson correlation coefficient between line thickness and date of publication was -0.17 and not significant.

The distance from which music is viewed varies somewhat as a result of the constraints provided by the instrument being played. For the present purposes we can estimate the viewing distance as ranging from the near point of (clear) vision (about 0.4m) to the maximum distance at which the score can be read, about 0.6m. The average height of the staves was 6.85mm, which equates to 0.98 degrees at 0.4m and 0.65 degrees at 0.6m. Each stave had 5 cycles so the spatial frequency of the pattern ranged from about 5 to about 8 cycles per degree. This is close to the spatial frequency for which perceptual distortions are maximally likely, see Figure 2A, and for which interference with perception is maximal, see Figure 2B.
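The arithmetic behind these estimates can be sketched as follows. This is a minimal illustration rather than the procedure used in the survey: the function names are hypothetical, and the count of 5 cycles per stave is taken directly from the text.

```python
import math


def visual_angle_deg(size_mm, distance_mm):
    """Angle in degrees subtended at the eye by an object of the given size."""
    return math.degrees(2 * math.atan(size_mm / (2 * distance_mm)))


def stave_spatial_frequency(stave_height_mm, distance_mm, cycles=5):
    """Cycles per degree, assuming the stave contributes `cycles` cycles (as in the text)."""
    return cycles / visual_angle_deg(stave_height_mm, distance_mm)


MEAN_STAVE_HEIGHT_MM = 6.85  # average stave height from the survey

angle_near = visual_angle_deg(MEAN_STAVE_HEIGHT_MM, 400)  # viewed from 0.4m
angle_far = visual_angle_deg(MEAN_STAVE_HEIGHT_MM, 600)   # viewed from 0.6m
sf_near = stave_spatial_frequency(MEAN_STAVE_HEIGHT_MM, 400)
sf_far = stave_spatial_frequency(MEAN_STAVE_HEIGHT_MM, 600)

print(f"0.4m: {angle_near:.2f} deg, {sf_near:.1f} cycles/deg")
print(f"0.6m: {angle_far:.2f} deg, {sf_far:.1f} cycles/deg")
```

At these two distances the sketch gives angles of 0.98 and 0.65 degrees and spatial frequencies of roughly 5.1 and 7.6 cycles per degree, in line with the range of about 5 to about 8 quoted above.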

By convention, the duty cycle of a pattern is expressed in terms of the extent of the bright part of the cycle as a proportion of the entire cycle, see Figure 1F. In this context the duty cycle is therefore the distance from the top of one line to the bottom of the line above divided by the distance from the top of one line to the top of the line above. The mean duty cycle was 83.7% and ranged from 73.7% to 94.2%. The lower part of this range is within the range of duty cycles that evoke perceptual distortions, see Figure 2C and 2D. Over this range, the contrast necessary to see a target superimposed on the pattern varied by a factor of 2, see Figure 2D.
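As an illustration, the duty cycle defined above can be approximated from the two survey measurements. The sketch below rests on an assumption that is not stated in the text, namely that stave height was measured between the outer edges of the top and bottom lines; the function name is hypothetical. The survey values were computed per sample, so figures derived from the mean height will only approximate them.

```python
def duty_cycle(line_thickness_mm, stave_height_mm, n_lines=5):
    """Bright fraction of one cycle of the stave pattern.

    Assumption: stave_height_mm spans the outer edges of the top and
    bottom lines, so the line pitch (top of one line to top of the next)
    is (height - thickness) / (n_lines - 1).
    """
    pitch = (stave_height_mm - line_thickness_mm) / (n_lines - 1)
    gap = pitch - line_thickness_mm  # white space between adjacent lines
    return gap / pitch


MEAN_STAVE_HEIGHT_MM = 6.85

for label, thickness in [("thinnest", 0.10), ("mean", 0.27), ("thickest", 0.45)]:
    dc = duty_cycle(thickness, MEAN_STAVE_HEIGHT_MM)
    print(f"{label} line ({thickness:.2f}mm): duty cycle {100 * dc:.1f}%")
```

Under this assumption the mean line thickness gives a duty cycle of about 83.6%, and the thinnest and thickest lines give about 94% and 72% respectively, close to the surveyed mean and range.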

The staves on a page of music typically measure about 200mm wide by 300mm high. The radius of the pattern when centrally fixated (10 degrees) is therefore quite sufficient to induce distortions, see Figure 2A and 2B. Of course, the pattern of periodic lines is broken by the spacing between the staves, which reduces the effect of the pattern. On the other hand, each stave on its own subtends about 0.7 degrees, and patterns of this size can in some individuals be sufficient to induce distortions (Wilkins, 1995).
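The order of magnitude of the pattern radius can be checked with a short calculation. The sketch assumes that the radius is taken as half the page width (100mm) and that the page is fixated at its centre; both are illustrative assumptions, as is the function name.

```python
import math


def radius_deg(radius_mm, distance_mm):
    """Angle in degrees that a radius measured from the fixation point subtends at the eye."""
    return math.degrees(math.atan(radius_mm / distance_mm))


HALF_WIDTH_MM = 100  # assumed radius: half of the 200mm page width

for distance_mm in (400, 500, 600):
    print(f"{distance_mm / 1000:.1f}m: radius {radius_deg(HALF_WIDTH_MM, distance_mm):.1f} deg")
```

Across viewing distances of 0.4 to 0.6m the radius comes out between roughly 9 and 14 degrees, consistent with the figure of about 10 degrees given above.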

Of the three parameters considered thus far, spatial frequency, duty cycle and size, duty cycle is the one that varies most from publisher to publisher. The typical values range from those that are unlikely to induce distortions to those that are likely to do so. The effects of any striped pattern can be greatly reduced by increasing its duty cycle. By increasing the duty cycle of the lines of the staves, making the spaces larger and the lines correspondingly thinner, it may be possible to improve the speed and accuracy of sight reading, even in individuals who are unaware of any distortions. The following experiment put this idea to the test. Students were asked to sight read music printed with stave lines 0.1mm thick and with lines four times as thick (0.4mm). (Both these line thicknesses are within the range of conventional typographic practice, as demonstrated earlier.) The speed and accuracy of the performance were measured.

Sight reading from staves with thin and thick lines


Participants. An opportunity sample of 28 students (6 male, 22 female) took part. They were not all music students but all had certification from one of the three UK registered exam boards: Royal Schools of Music, London College of Music and Victoria College of Music. The mean grade in piano was 3.0 and the range 1-7.

Materials. A random sequence of notes in the key of G was generated by an algorithm that randomly sampled the notes on a major scale of G. The mean of the distribution was C4. The registral range was from C3 to B5. A similar algorithm generated a random sequence from a chromatic scale. The “black keys” were randomly designated as sharp or (the equivalent) flat. The mean of the distribution was F4. The registral range was from C3 to Bb4.
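A generator of this general kind might look like the sketch below. It is not the algorithm used in the study: the sampling distribution is not specified beyond its mean and range, so uniform sampling is assumed here purely for illustration and will not reproduce the stated means. All names are hypothetical, pitches are represented as MIDI note numbers with C4 = 60, and the sequence length of 240 notes corresponds to 60 bars of 4 crotchets.

```python
import random

# Pitch classes of the G major scale: G A B C D E F#
G_MAJOR_PITCH_CLASSES = {7, 9, 11, 0, 2, 4, 6}
CHROMATIC_PITCH_CLASSES = set(range(12))

SHARP_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
FLAT_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def random_sequence(n_notes, pitch_classes, low, high, seed=None):
    """Sample n_notes uniformly (an assumption) from the allowed pitches in [low, high]."""
    rng = random.Random(seed)
    pool = [m for m in range(low, high + 1) if m % 12 in pitch_classes]
    return [rng.choice(pool) for _ in range(n_notes)]


def spell(midi_note, rng):
    """Name a note, spelling black keys at random as sharp or the equivalent flat."""
    names = rng.choice((SHARP_NAMES, FLAT_NAMES))
    return f"{names[midi_note % 12]}{midi_note // 12 - 1}"


N_NOTES = 60 * 4  # 60 bars of 4 crotchets

g_major_notes = random_sequence(N_NOTES, G_MAJOR_PITCH_CLASSES, 48, 83, seed=1)    # C3 to B5
chromatic_notes = random_sequence(N_NOTES, CHROMATIC_PITCH_CLASSES, 48, 70, seed=2)  # C3 to Bb4

spelling_rng = random.Random(3)
chromatic_names = [spell(note, spelling_rng) for note in chromatic_notes]
print(chromatic_names[:8])
```

Fixing the seed makes each sequence reproducible, so the same notes can be set on staves with thin and thick lines.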

Two random sequences of notes were generated for the key of G and two for the chromatic sequence. A music processing package (Sibelius(R) www.avid.com) was used to generate a score based on the selected notes. Each score was prepared in music having staves with lines that when printed were 0.1mm wide (“thin”) and in music having staves with lines that when printed were 0.45mm wide (“thick”). Sixty bars, each with 4 crotchets per bar, were laser printed on a single sheet of white A4 paper. Figure 4 shows the first line of some representative scores. The random sequence of notes ensured that participants had to see the notes in order to read them; they could not predict the sequence from a melody.

Insert Figure 4 about here

Procedure. The two scores (scale of G and chromatic) and the two versions of each (thick and thin staves) were counterbalanced across participants. Each participant experienced the experimental conditions in one of four possible sequences, 7 participants per sequence. The test order minimised the effects of practice at the task, while balancing the differences in scores across participants.

Participants were seated at a keyboard (Yamaha PSR-420) at a distance of about 0.5m from the score. Each participant was allowed 30 seconds to review the score prior to playing it. They were asked to play at a pace that was comfortable, and that allowed them to play accurately. Participants then proceeded to play each score and were recorded whilst doing so. The time taken to play each piece and the number of incorrectly played notes were recorded.

