Producing a pitch-time graph
What we call a “melody” is a series of sounds of definite pitch, succeeding one another in what we perceive as a continuous (or at least a connected) “stream” that moves up and down in pitch as it moves through time. Musicians often refer to such a stream as a “line” (as in “melody line” or “bass line”), and in global notation it is depicted as an actual line or series of lines extending through a two-dimensional space in which the vertical dimension represents pitch and the horizontal, time (see Relative pitch; Scales and melody). If we know what all the pitches and timings are, we can draw these lines by hand; but if we don’t, and we have a recording of the music we want to notate, we can enlist the help of sound analysis software. When produced by computer software, the line is called a “pitch-time graph.”
With today’s user-friendly software, the process of producing global notation based on a pitch-time graph is not difficult, but it does involve a number of steps which all need to be explained on one page for the whole process to make sense. As a result, this page is quite long. If you don’t feel ready for that, you might care to fortify yourself with your preferred stimulating beverage.
Information about pitch can be useful for determining onset timing too, since onsets are sometimes marked only by a change of pitch, without any other articulation (see Articulation and melisma). But information about pitch is also useful for its own sake, since in most music it matters exactly what intervals are used, and a small difference in relative pitch can make a big difference to the effect (for instance, between sounding “in tune” and “out of tune”). Distinguishing the exact intervals can be difficult even for trained musicians, especially when the music uses a different tuning system than they are used to. But computers can measure pitch very accurately (at least when there is only one pitch sounding at a time), and from these pitch measurements, the intervals can be calculated.
In this way, software can help determine what scale is being used (that is, what the intervals between the scale degrees are) and also what extra-scalar pitches occur, if any. And besides measuring these distinct pitches, it can reveal exactly how the pitch is “bent,” for instance by vibrato and pitch slides (see Pitch bending).
To explore these capabilities, we’ll use a phrase of Quranic recitation. As ethnomusicologists are fond of pointing out, the recitation of texts from the Quran is not categorized as “music” in Islamic thinking, yet its sounds are structured in a way that seems closer to music than it does to speech. Computer-aided notation can help investigate exactly what that means.
In this case, the waveform graph we get from Audacity shows the extended “blocks” of sound characteristic of “sustainable” sound sources such as the human voice, in contrast to “impulsive” ones like drums (see Unspecified duration). As each “block” tends to begin where there is a new syllable in the text, the waveform graph might help when it comes to specifying articulation, but it doesn’t tell us anything about pitch.
To generate a pitch-time graph, the easiest program to use is probably Tony. If you open our Quranic recitation sound file in Tony, you’ll see something like this.
This time the waveform appears in grey towards the bottom of the screen. Above it is a wavy black line enclosed in a series of blue boxes. The black line represents the pitch of the reciter’s voice at any given moment. The blue boxes are meant to identify “notes”—that is, distinct chunks of sound each beginning with what we would call an “onset.”
In vocal music, an onset is usually marked either by a new syllable in the lyrics or by a change of pitch (or both). Tony detects both types of onset, as you can see by comparing the blue boxes with the waveform. The beginning (i.e. left-hand edge) of a blue box often coincides with the beginning of a block of sound in the waveform: this is where a new syllable is articulated. When a new box starts in the middle of a waveform “block,” it corresponds to a significant change of pitch, as indicated by the wavy black line.
The problem is that music, and especially vocal music, doesn’t necessarily consist of “notes” at all. The combination of sustainable sound and pitch-bending capability will often produce a melody line that is better represented by an actual, curving line than it is by a series of discrete “notes.” For instance, a narrow and rapid fluctuation of pitch would be interpreted by Tony (as it would in staff notation) as a single note with “vibrato”; but at what point does a vibrato become slow and wide enough to be interpreted as an alternation between notes of different pitch? The distinction may be imposed by the notation system rather than actually existing in the music. Again, at what point does a slide from one pitch to another result in a new note? In our example, sometimes a new note seems to be indicated simply because the reciter’s sliding pitch has crossed a threshold in Tony’s built-in pitch system, not because a distinct onset is articulated.
The blue boxes might be useful as a step towards notating the music in a system that breaks sound down into “notes,” such as staff notation or the MIDI “piano roll” system, but for the purposes of global notation they are not usually helpful. Fortunately they can be turned off by clicking the “Show Notes” toggle button at the bottom of the screen.
You’ll now see the wavy black line without the blue boxes around it. But with the default settings, you can only see about half of our Quranic recitation phrase at a time. If you want to see the whole thing on one screen, simply press the down arrow key on the keyboard to reduce the horizontal time scale until it fits. Then, if the melody line seems to go up and down too much, or not enough, you can adjust the vertical pitch scale by clicking on “View” and then “Set Displayed Frequency Range.” The exact frequency figures here are not important: just remember that the narrower the range of frequencies displayed, the more the melody line will appear to go up and down. How much it should go up and down is quite subjective, but the goal is to finish up with just enough vertical separation to distinguish both the scale degrees and the different sizes of interval between them. After playing around with these settings for a while, you might have a pitch-time graph that looks something like this.
Now make sure the cursor is out of the way (by clicking at the very beginning of the graph) and take a screenshot of this, including the waveform and time scale at the bottom. Paste the screenshot into whatever graphics program you are using to produce your global notation.
You can now edit the image to get it looking the way you want it, for instance by stretching or squashing it vertically or horizontally if you are still not happy with the curvature of the line. For more complete customization, you can “trace” the image, either manually or automatically, to extract the line from the screenshot as a “vector image.” This enables you to edit the line in various ways, such as changing its thickness or applying “smoothing” to get rid of unwanted detail and bring out the essential contour. That’s a bit beyond the scope of this page, but the Wikipedia article on image tracing is a good place to learn about it.
For now, let’s work with Tony’s pitch-time graph just as it is. The next thing to do is to indicate where each new syllable begins. We’ll do this by adding short vertical lines crossing the wavy line that represents the reciter’s pitch, and about equal to it in thickness. This will in effect convert the graph into a series of rotated T symbols with curved and wavy stems. The positioning of our vertical lines will be guided both by the waveform graph (where the beginning of a “block” usually indicates the start of a new syllable) and by careful listening. While we’re at it, we’ll add a bracket for the time scale, determining the horizontal length of one second from the scale of seconds at the bottom of the screenshot. Once we have done this, the waveform and time scale from Tony have served their purpose, so we might as well crop them out of the graph. Here is what’s left.
One thing that’s apparent from this is how the articulation of a new syllable often doesn’t coincide with a change of pitch. This seems a distinctive feature of the style that was not revealed by Tony’s blue boxes. By showing a new “note” (i.e. a new rotated T symbol) only where a new syllable begins, global notation captures the difference between the two kinds of onset in vocal music, while avoiding the need to decide in particular cases whether a change of pitch is distinct enough to constitute a new note rather than a pitch bend.
Another thing that’s apparent from the pitch-time graph is that, while there are slides and fluctuations of pitch, there are also substantially flat segments where the voice remains on (or at least centered on) a stable pitch. Moreover, some of these stable pitches recur several times in the course of the phrase. That, of course, is our definition of “scale degrees” (see Scales and melody), and to complete our score of this Quranic recitation phrase, we should indicate what scale it uses.
We’ll do this in the usual way, by drawing a horizontal pitch line for each scale degree. But the difference from our earlier examples is that this time we already have our melody line, and we’ll be deriving our scale degrees from that. The procedure is to draw a pitch line into the graph at each height where we see more than one horizontal segment in the melody line. In this case, there appear to be four such pitches, plus a fifth, higher one that occurs only once in stable form, though it does so quite prominently. This might well turn out to be a scale degree if we looked at the rest of the recitation, and we will treat it as such here.
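The procedure of looking for heights where the melody line has more than one horizontal segment can be roughed out in code. The following is only a sketch under stated assumptions: it supposes the pitch track is available as a list of evenly spaced values in cents, and the function names, the 30-cent tolerance, and the minimum run length are all illustrative choices, not anything Tony itself provides.

```python
def stable_segments(cents, tol=30.0, min_frames=5):
    """Return (start, end, mean) for runs whose total pitch range stays within tol cents.

    `cents` is a list of evenly spaced pitch values; `end` is exclusive.
    Both thresholds are illustrative and would need tuning by ear.
    """
    if not cents:
        return []
    segments = []
    start = 0
    lo = hi = cents[0]
    for i in range(1, len(cents) + 1):
        if i < len(cents):
            new_lo, new_hi = min(lo, cents[i]), max(hi, cents[i])
            if new_hi - new_lo <= tol:
                lo, hi = new_lo, new_hi
                continue  # frame i still belongs to the current run
        # The run covering frames start..i-1 has ended.
        if i - start >= min_frames:
            run = cents[start:i]
            segments.append((start, i, sum(run) / len(run)))
        if i < len(cents):
            start = i
            lo = hi = cents[i]
    return segments


def recurring_levels(segments, tol=30.0):
    """Average the segment means that lie close together; keep levels seen at least twice."""
    groups = []
    for _, _, mean in sorted(segments, key=lambda s: s[2]):
        if groups and mean - groups[-1][-1] <= tol:
            groups[-1].append(mean)
        else:
            groups.append([mean])
    return [sum(g) / len(g) for g in groups if len(g) >= 2]
```

A level that appears only once in stable form, like the fifth and highest pitch in our phrase, is deliberately left out by recurring_levels; whether to treat it as a scale degree remains a judgment for the transcriber.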
As it would be risky to make assumptions as to which pitch (if any) is the tonic, all the pitch lines are drawn with equal thickness. For the “reference pitch,” we will use the lowest scale degree (see Relative pitch). To determine the pitch of this, we’ll go back to Tony. If you click or hover over any part of the melody line (which Tony calls the “Pitch Track”), some text appears at the top right of the screen telling you the pitch both in hertz and as a note name with a deviation in cents. (When there is fluctuation of pitch, as here, what Tony does is average out the pitch of the “note” that it thinks you have clicked on.)
In this case, the pitch of the last horizontal segment in the phrase, which seems to be the lowest scale degree, comes out as “C3-18c” (18 cents below C3), so we write that in as our reference pitch. The pitches of the other scale degrees are measured in the same way, and their intervals above the reference pitch are calculated in cents.
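For readers who want to check the arithmetic, here is a minimal sketch of the two calculations involved: turning a frequency in hertz into a label of the kind Tony displays, and measuring an interval in cents. It assumes A4 = 440 Hz and the octave numbering in which middle C is C4; the function names are made up for illustration, and the two frequencies at the end are hypothetical readings, not measurements from the recording.

```python
import math

A4_HZ = 440.0  # assumption: A4 = 440 Hz as the concert-pitch reference
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def cents_between(low_hz, high_hz):
    """Interval from low_hz up to high_hz in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(high_hz / low_hz)


def note_label(freq_hz):
    """Nearest equal-tempered note name plus deviation in cents, e.g. 'C3-18c'."""
    midi = 69.0 + 12.0 * math.log2(freq_hz / A4_HZ)  # 69 is A4 in MIDI numbering
    nearest = round(midi)
    deviation = round((midi - nearest) * 100)
    name = NOTE_NAMES[nearest % 12] + str(nearest // 12 - 1)  # octave 4 holds middle C
    return name if deviation == 0 else f"{name}{deviation:+d}c"


# Hypothetical readings, not measurements from the recording.
reference_hz = 129.46
higher_hz = 141.2
print(note_label(reference_hz), round(cents_between(reference_hz, higher_hz)))
```

Because cents are a logarithmic measure, the interval depends only on the ratio of the two frequencies, which is why the same scale can be written down above any reference pitch.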
(Another way to determine the scale degrees is to use software that can identify which pitches occur most frequently, such as Tarsos.)
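The idea of finding the pitches that occur most frequently can be illustrated with a simple histogram. This is a sketch only, and not a description of how Tarsos works internally: it assumes the pitch track is a list of frequencies in hertz with unvoiced frames stored as 0, and the bin width, the minimum count, and the function names are illustrative.

```python
import math
from collections import Counter


def pitch_histogram(freqs_hz, ref_hz, bin_cents=25):
    """Count frames per pitch bin, with bins measured in cents above ref_hz."""
    counts = Counter()
    for f in freqs_hz:
        if f <= 0:  # assumption: unvoiced frames are stored as 0
            continue
        cents = 1200.0 * math.log2(f / ref_hz)
        counts[round(cents / bin_cents) * bin_cents] += 1
    return counts


def histogram_peaks(counts, bin_cents=25, min_count=3):
    """Return bins that hold at least min_count frames and are local maxima."""
    peaks = []
    for b in sorted(counts):
        n = counts[b]
        left = counts.get(b - bin_cents, 0)
        right = counts.get(b + bin_cents, 0)
        # Strict on the left, non-strict on the right: a flat plateau yields one peak.
        if n >= min_count and n > left and n >= right:
            peaks.append(b)
    return peaks
```

Peaks found this way are candidates for scale degrees; slides and vibrato spread their frames thinly over many bins, so they tend to fall below the minimum count.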
Our complete “score” of this phrase reveals something of the reasons why Quranic recitation is said to be organized like music even though it’s not categorized as “music” in the Islamic tradition. The intervals between each scale degree and the reference pitch, rounded to the nearest quarter of a semitone, would be 150, 300, 500, and 700 cents, the same as the first four intervals in the scale of our earlier Middle Eastern example (see Scales and melody), which was purely instrumental and would be categorized as “music” by Muslims. In other words, Quranic recitation resembles the “music” of its own culture in a quite specific way: by sharing some of the same scales.
This is the kind of insight that sound analysis software can help us discover and demonstrate. Global notation is able to incorporate computer-generated pitch-time graphs because, like them (but unlike staff notation), it is time-proportional and pitch-proportional: within a piece or section, a given spatial unit always represents the same amount of time (on the horizontal axis) or interval of pitch (on the vertical axis). A series of pitch-time graphs can, of course, be used to notate a longer excerpt or piece, forming multiple “systems” of the score in the same way as manually produced notation.
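What “time-proportional and pitch-proportional” means can be spelled out as a coordinate mapping. The sketch below is illustrative only: the function name and the two scale factors are assumptions, chosen arbitrarily. The point is that the horizontal position is linear in time, while the vertical position is linear in the interval above the reference pitch, which makes it logarithmic in frequency.

```python
import math


def to_canvas(time_s, freq_hz, ref_hz, px_per_second=100.0, px_per_semitone=10.0):
    """Map a (time, frequency) point to (x, y) drawing units.

    y is measured upward from the reference pitch line; a graphics program
    whose y axis grows downward would need the sign flipped.
    """
    x = time_s * px_per_second
    semitones_above_ref = 12.0 * math.log2(freq_hz / ref_hz)
    y = semitones_above_ref * px_per_semitone
    return x, y
```

With this mapping, a fifth from 100 Hz to 150 Hz occupies the same vertical distance as a fifth from 200 Hz to 300 Hz, which is what allows pitch lines drawn at fixed heights to stand for fixed intervals.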
It should be acknowledged, however, that sound analysis software has its limitations. It lacks the human ear’s ability to distinguish “layers” of sound, and works best with music that uses only one pitch at a time. Best of all is an unaccompanied solo voice, but even then, when the singer takes a breath, the software may interpret background noise as a sound in the music. With instrumental sounds, the more they differ from a singing voice, the more likely the software is to make mistakes, such as showing the right pitch class in the wrong octave. Recognizing that such mistakes are inevitable, programs like Tony do allow for manual correction and try to make it as easy as possible, but you’ll still have to use your ears.
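Octave mistakes of this kind can at least be flagged for attention before you start listening. The following is a rough sketch, not a feature of Tony: it assumes the pitch track is a list of frequencies in hertz with unvoiced frames stored as 0, and it marks any frame that lands roughly an octave away from the previous voiced frame. The function name and the 100-cent tolerance are illustrative.

```python
import math


def flag_octave_jumps(freqs_hz, tol_cents=100.0):
    """Return indices of voiced frames that jump about one octave from the previous voiced frame.

    Both the jump into a suspected error and the jump back out are flagged.
    """
    flagged = []
    prev = None
    for i, f in enumerate(freqs_hz):
        if f <= 0:  # assumption: unvoiced frames are stored as 0
            continue
        if prev is not None:
            jump = abs(1200.0 * math.log2(f / prev))
            if abs(jump - 1200.0) <= tol_cents:
                flagged.append(i)
        prev = f
    return flagged
```

A flagged frame is only a suspect: a performer may really leap an octave, so each case still has to be checked by ear.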
You may also want your computer-aided global notation scores to look different from a raw pitch-time graph, and you’ll probably need to use different software to get the graphs looking the way you want.
Source of audio:
Quran recitation, “Sourat Youssef” performed by al-Shaik ’Abd al-Bast ’Abd al-Samad, Club du Disque Arabe AAA 070, track 1, reproduced in Jonathan Stock, World Sound Matters, Schott CD ED 12572.
Source of software:
Tony software has been developed at Queen Mary, University of London by the authors of the following article:
M. Mauch, C. Cannam, R. Bittner, G. Fazekas, J. Salamon, J. Dai, J. Bello and S. Dixon, “Computer-aided Melody Note Transcription Using the Tony Software: Accuracy and Efficiency,” in Proceedings of the First International Conference on Technologies for Music Notation and Representation, 2015. https://code.soundsoftware.ac.uk/attachments/download/1423/tony-paper_preprint.pdf