Thursday, January 17, 2019

From Harmonic Structure to HCF to Sample Value, Part 1: Laying the Foundation

I've glossed over this subject previously, but here I'll go into it with greater care and in greater detail. The context for all that follows is computer software that generates sound (a sequence of sample values) on the fly, constrained both by the need to make those samples available within the time allowed and by the need to minimize latency, the delay the user experiences between an action and the audible response.

So, what is pitch?

My take is that it's very nearly interchangeable with frequency, if perhaps slightly more subjective.

So what is frequency?

Frequency is the rate at which some specific type of event occurs, the number of events over a given time span, or events per unit time. The beating of your heart, for example, would ordinarily be measured in beats per minute. The speed of your car would (in the U.S.) be measured in miles per hour, and the speed of its engine would be measured in revolutions (of the crankshaft) per minute.

Pitch, the frequency of a sound, is typically measured in cycles per second (hertz, abbreviated Hz). A cycle is a single unit in the repeating pattern of a sine wave, a construct from trigonometry that provides a decent approximation of the manner in which sound is transmitted by successive waves of higher and lower pressure passing through air.

Trigonometry is based around the idea of a unit circle, a circle with a radius of exactly 1. We commonly think of circles as being divided into 360°, but in trigonometry it is more common to express the magnitude of an angle, arc (a partial circle), or rotation in terms of radians. A radian is the angle spanned by an arc whose length equals the radius; picture the radius wrapped around the perimeter of the circle. A unit circle has a circumference of 2pi (2𝜋), or about 6.283185307179586, so one complete rotation is 2𝜋 radians. (Radians are also commonly used in standard functions that compute sine values.)
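
As a quick check of those numbers, here is a minimal Python sketch. The helper name degrees_to_radians is purely illustrative (Python's standard library already offers math.radians for the same job); the only library facts relied on are math.pi and math.sin, the latter of which takes its argument in radians.

```python
import math

# One full rotation of the unit circle, expressed in radians.
TWO_PI = 2.0 * math.pi

def degrees_to_radians(degrees):
    """Convert degrees to radians, using 360 degrees == 2*pi radians."""
    return degrees * TWO_PI / 360.0

print(TWO_PI)                               # 6.283185307179586
print(degrees_to_radians(180.0))            # half a rotation, i.e. pi radians
print(math.sin(degrees_to_radians(90.0)))   # sine of a quarter rotation
```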

One cycle of a sine wave is analogous to one complete rotation (2𝜋 radians) of a unit circle. In fact, if you were to roll a unit circle along a horizontal line and track the vertical displacement of some point on that circle by tracing the same vertical position along a vertical line through and moving with the center of the circle, the result would be a sine wave. If you can deal with the 3-dimensional projection, the following animated GIF is an even better visualization. (source: Wikipedia)

What is actually being traced in this animation is a cosine, but the shape is the same as a sine curve.

What all of the foregoing is leading up to is the point that, even in the context of sound, there are other valid measurements of frequency besides cycles per second. We might just as well express frequency in terms of radians per second, if there were any advantage to doing so, and there are even more options.

Suppose we divide one cycle of a sine wave into smaller segments, precompute sine values at evenly spaced intervals, and store them in an array for later access by index, avoiding the performance hit of computing sine values on the fly. The number of segments, which is also the size of the array, is almost completely arbitrary, at least above the threshold at which the division is fine-grained enough to produce acceptable fidelity. One might use 360 segments, or 44,100 segments, the same as the number of samples per second in CD-quality audio. Two other options, 256 and 65,536 (2^8 and 2^16 respectively, where '^' is an exponentiation operator), are suggested by potential performance advantages around particular integer math operations. The only downsides to using more, smaller segments are the amount of fast memory consumed by the array, since it must remain in memory whenever sound generation is running, and the time and computational effort (battery power) needed to recreate the array, if you choose to remove it from memory whenever sound generation ceases.
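
One example of the kind of integer operation that can benefit from a power-of-two table size is wrapping an index back into range: a bitwise AND with (size - 1) gives the same result as a modulo. The sketch below only demonstrates that equivalence; the names are hypothetical, and it is an assumption on my part that this is among the advantages alluded to above.

```python
TABLE_SIZE = 65536            # 2**16, one of the power-of-two sizes mentioned above
INDEX_MASK = TABLE_SIZE - 1   # 0xFFFF; this trick works only for power-of-two sizes

def wrap_with_modulo(index):
    """Wrap an index into the range 0..TABLE_SIZE-1 using the remainder."""
    return index % TABLE_SIZE

def wrap_with_mask(index):
    """Wrap an index into the same range using a bitwise AND."""
    return index & INDEX_MASK

# Both approaches agree for indices inside and beyond the table.
for i in (0, 65535, 65536, 70000, 200000):
    assert wrap_with_modulo(i) == wrap_with_mask(i)
    print(i, wrap_with_mask(i))
```

In Python itself the speed difference is negligible; any gain would show up in lower-level code, where a mask is typically cheaper than an integer division.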

From here on I will begin using 'array' and 'table' more or less interchangeably. Conceptually, the collection of sine values constitutes a table, but it is necessarily implemented as an array. In this context, both refer to a sequence of index-accessible sine values beginning with sin(0.0) and progressing at even intervals up to but not including sin(2𝜋).
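
A table of this kind might be built as follows. This is a minimal sketch, assuming a plain Python list is an acceptable stand-in for the array; the function name build_sine_table and the size of 256 are illustrative choices only.

```python
import math

def build_sine_table(size):
    """Return `size` evenly spaced sine values covering one cycle.

    Entry 0 is sin(0.0); the last entry falls one step short of sin(2*pi),
    so the table can repeat seamlessly when the index wraps around.
    """
    step = 2.0 * math.pi / size   # radians between adjacent entries
    return [math.sin(i * step) for i in range(size)]

table = build_sine_table(256)
print(len(table))    # number of entries, same as the number of segments
print(table[0])      # sin(0.0)
print(table[64])     # a quarter of the way through the cycle
```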

As with various ways of delineating events to be timed, time need not be measured in seconds. Another altogether valid unit of time is the interval between successive audio samples, 1/44100 second in the case of CD-quality audio.

Taken together, the above offers us six different ways of expressing frequency (three units for counting progress through the wave, times two units of time), only one of which is cycles per second. Remember that a cycle is one complete sine wave, that a radian is equal to 1/(2𝜋) cycles (about 0.159154943091895), and that the fraction of a cycle represented by one sine table index depends upon the size of the array holding the sine values: with N entries, each index step is 1/N of a cycle.

  • cycles per second (Hz)
  • cycles per sample
  • radians per second
  • radians per sample
  • sine table indices per second
  • sine table indices per sample
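
To make the relationships concrete, the sketch below expresses a single frequency in all six units. The sample rate of 44,100 comes from the CD-quality figure above; the table size of 65,536, the 440 Hz test value, and the function name frequency_units are illustrative assumptions.

```python
import math

SAMPLE_RATE = 44100    # samples per second (CD-quality audio)
TABLE_SIZE = 65536     # sine table entries per cycle (illustrative choice)

def frequency_units(hz, sample_rate=SAMPLE_RATE, table_size=TABLE_SIZE):
    """Express a frequency given in cycles per second in all six units."""
    cycles_per_sample = hz / sample_rate
    return {
        "cycles per second": hz,
        "cycles per sample": cycles_per_sample,
        "radians per second": hz * 2.0 * math.pi,
        "radians per sample": cycles_per_sample * 2.0 * math.pi,
        "sine table indices per second": hz * table_size,
        "sine table indices per sample": cycles_per_sample * table_size,
    }

for unit, value in frequency_units(440.0).items():
    print(f"{unit}: {value}")
```

Each per-sample figure is simply the per-second figure divided by the sample rate, and each radian or index figure is the cycle figure multiplied by 2𝜋 or by the table size.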

Which of these is chosen has implications for the complexity and performance of the algorithms involved, which will be the subject of the next installment.
