Tuesday, January 29, 2019

From Harmonic Structure to HCF to Sample Value, Part 5: Focusing on Pitch Specification and Alteration

Up to this point I've treated the Anchor (and Base Frequency, possibly not mentioned here since 2010) as more-or-less integral aspects of a Harmonic Structure, but really the Anchor only exists to provide a couple of services.

First, and most obviously, the Anchor is a point of reference for specifying the pitches of the fundamentals of the harmonic series composing the structure, and also of the HCF (Highest Common Fundamental). For this purpose it is enough that the Anchor's own frequency be unambiguous. Tuning would simply involve incremental alterations to that frequency.

The other service the Anchor provides is the ability to move a harmonic structure up/down-scale as a unit, by integer-ratio factors. This is what I previously referred to as "Consonant Transposition" on the theory that such a change is likely to be more consonant than using an irrational factor.

There could be other ways to provide these services, of course, including the option of separating the scalar component of the definition of the Anchor's frequency from the integer-ratio component, by bringing back the concept of a Base Frequency.

The Base Frequency would be specified simply using a Double (double precision floating point value), which you could think of as a multiplication factor that is always applied to 1.0 Hertz.

The Anchor would then be specified as an integer-ratio multiple of the Base Frequency.

Tuning would be accomplished by altering the factor relating the Base Frequency to 1.0 Hz, and consonant transposition would be accomplished by altering the ratio relating the Anchor to the Base Frequency.
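As a rough illustration of that separation, here is a minimal Python sketch. The function names and the example values (220.0 Hz, 3/2, 4/3) are assumptions made for illustration, not part of any existing implementation. The scalar lives in an ordinary float, while the integer ratio is kept exact using the standard library's Fraction type.

```python
from fractions import Fraction

def anchor_frequency(base_hz, ratio):
    """Anchor frequency in Hz: the Base Frequency scaled by an integer ratio."""
    return base_hz * ratio.numerator / ratio.denominator

def tune(base_hz, factor):
    """Tuning: alter only the scalar relating the Base Frequency to 1.0 Hz."""
    return base_hz * factor

def transpose(ratio, by):
    """Consonant transposition: alter only the integer ratio, which stays exact."""
    return ratio * by

# Hypothetical example values.
base_hz = 220.0
anchor_ratio = Fraction(3, 2)
```

With these values the Anchor sits at 330.0 Hz. Transposing by Fraction(4, 3) changes only the ratio (to 2/1), and tuning changes only the scalar, so neither operation disturbs the other.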

This seems a little cleaner to me than combining a Double and an integer ratio into a 'dual-component' type, but your mileage may vary.

In any case, these details need not be exposed to the user! What matters is that the pitches of the fundamentals of the series composing the harmonic structure are tunable as a unit and editable by integer-ratio factors, collectively as well as individually, and that those pitches as well as that of the HCF are clearly specified.

Sunday, January 27, 2019

From Harmonic Structure to HCF to Sample Value, Part 4: Focusing on Phase & Phase Advancement

So maybe you're a little hazy on what is meant by phase, even more so regarding phase advancement, and not at all convinced I know what I'm talking about in suggesting that repeatedly multiplying the phase of a lower frequency by a positive integer can be used to generate a higher frequency. Like, how does that work?

Phase relates back to the sine wave, which itself relates back to the unit circle, but this is beginning to feel like a circular definition. What does it really mean?

Let's approach this from a different direction, using an analogy. Say you have a shaft, rotating at one degree per second. It's going to take that shaft 360 seconds to complete one rotation. Now say you have another shaft, the position of which is updated once per second according to the rule that its new position should be twice that of the first shaft. If the first shaft has moved 10 degrees, the second shaft will have moved 20 degrees. If the first shaft has moved 50 degrees, the second shaft will have moved 100.

But what happens when the first shaft has moved 180 degrees and the second shaft has moved 360 degrees? The second, faster shaft is already back where it started while the first shaft is still only halfway around. Fine, no problem, it's free to keep right on moving, starting a second rotation while the first shaft finishes its first, but because doubling the number of degrees the first shaft has turned will now result in a number larger than 360, we'll need to remove the first 360 degrees to bring the result into a range we can work with. So, essentially, when it gets to 360 degrees the second shaft resets to 0 degrees and keeps on moving.

Likewise, when the first shaft gets to 360 degrees, it also resets to zero and keeps moving.

But what if, for every degree the first shaft moves, the second shaft moves 5 degrees? The same principle applies, but because we're getting the position of the second shaft by multiplying the position of the first shaft by 5, it won't be enough to subtract 360 degrees after its first rotation; we'll need something that will work no matter how many rotations it has already completed. That something is modulo division.

In this example, after multiplying the position of the first shaft by 5 we'll take the result of that and apply modulo 360, to remove all of the full turns and leave only the amount by which the second shaft's new position exceeds a full turn. We could use the same approach for the first shaft, but in that case it's simpler to just subtract 360 degrees every time it completes a full rotation.
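The two shafts can be sketched in a few lines of Python. The function names are made up for this illustration; the arithmetic is just what is described above: multiply and take modulo 360 for the second shaft, add and conditionally subtract 360 for the first.

```python
def second_shaft_position(first_degrees, multiplier):
    """Driven shaft: multiply the first shaft's position, then drop all full turns."""
    return (first_degrees * multiplier) % 360.0

def advance_first_shaft(position, degrees_per_second=1.0):
    """Advance the first shaft by one second, resetting by subtraction at a full turn."""
    position += degrees_per_second
    if position >= 360.0:
        position -= 360.0
    return position
```

For example, with a multiplier of 5 and the first shaft at 200 degrees, the product is 1000 degrees; removing two full turns (720 degrees) leaves the second shaft at 280 degrees.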

You may recall, in a previous installment I said that if you measure phase (rotation) in cycles, modulo division isn't necessary. This is because if we were to use modulo division in that case, it would be modulo 1.0, which is exactly equivalent to simply keeping the fractional portion of a decimal number and discarding everything to the left of the decimal point.

So, to ease back into more standard terminology, phase equates to how much the rotation of a shaft, at any given point in time, exceeds an indeterminate number of complete rotations: how far beyond the start/end point of a cycle it has progressed. Phase advancement equates to how much rotation occurs between one point in time and the next, one second and the next in the above example. It is a rate of change.

Note that in the above example we only applied phase advancement to the first shaft, to determine its phase at the next point in time, and used that to calculate the phase at the same point in time for the second shaft. The rate of phase advancement for the second shaft is only implied, never explicit.

Using this approach we might add a third shaft, applying the same multiplier to the phase of the first shaft as we did for the second shaft, and be confident that the second and third shafts would always be perfectly synchronized, rotating in lockstep.
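Here is the same idea with phase measured in cycles rather than degrees, as a small Python sketch with hypothetical function names. Only the first shaft has an explicit phase advancement; the faster shafts get their phase by multiplying and keeping the fractional part, which for non-negative values is the same as modulo 1.0.

```python
import math

def fractional_part(x):
    """Keep only what lies to the right of the decimal point (x >= 0)."""
    return x - math.trunc(x)

def advance(phase, advancement):
    """Explicit phase advancement, applied to the first shaft only."""
    phase += advancement
    if phase >= 1.0:
        phase -= 1.0
    return phase

def derived_phase(first_phase, multiplier):
    """Phase of a faster shaft, implied by the phase of the first shaft."""
    return fractional_part(first_phase * multiplier)
```

If the first shaft is at 0.375 cycles, a shaft with multiplier 5 is at 1.875 cycles, which is a phase of 0.875. A third shaft given the same multiplier gets exactly the same number from the same calculation, which is why the two stay in lockstep.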

A cycle is a cycle, whether it's a sine wave or a rotating shaft or the interplay of the tilt of Earth's rotational axis with its movement around the sun, creating seasons.

Phase is what portion of the next full cycle has been completed, and phase advancement is the rate of change of the phase, change/time. For a shaft, phase advancement is how fast it is turning. For a sound, phase advancement is its frequency, its pitch. For Earth's seasons, phase advancement is how quickly one passes into the next.

If you were confused before, I hope that you are now at least less confused.

Tuesday, January 22, 2019

From Harmonic Structure to HCF to Sample Value, Part 3: Clarifying Terminology

This is very much a work in progress. No doubt the list will grow over time, as inspiration strikes and I have time to give to it. Some items link to Wikipedia (or other) articles, and some of those might not be included except that the articles they link to are so well done and include relevant material.

Array
A common way of structuring data, a list of items, usually all of the same type.
Big O notation
A standard method of expressing the computational complexity of an algorithm.
ADSR Envelope
Attack: the initial, usually abrupt escalation of volume at the beginning of a note.
Decay: the rapid loss of some of that volume immediately following the attack phase.
Sustain: a period of more stable volume following the decay phase.
Release: the final attenuation of volume to zero.
Anchor
My name for an intermediary object used to establish the frequencies of the fundamentals of the harmonic series composing a harmonic structure, and the frequency of their Highest Common Fundamental. The frequency of the Anchor is specified by the combination of two factors multiplied together, a scalar and an integer ratio.
Base Frequency
My name for an intermediary object which may be used in conjunction with the Anchor, providing the scalar factor.
Beat Frequency
A periodic variation in volume at a rate that is the difference between the frequencies of two simultaneous tones.
C-family Programming Languages
For the present purpose, C, C++, and Objective-C.
Callback
Code you provide to a framework which it calls when the conditions are right or when the time comes.
CD Quality
Two channels of 16-bit integer values at 44100 samples per channel per second.
Consonance
A quality of "simultaneous or successive sounds...associated with sweetness, pleasantness, and acceptability" best exemplified by chords composed of frequencies all related by ratios of small integers.
Consonant Transposition
Moving a harmonic structure up/down-scale as a unit, by an integer-ratio factor.
CPU Cycle
Not exactly a precise unit of measure, because various instructions take differing amounts of time to complete, because multiple instructions may be 'in-flight' simultaneously, and because it is becoming increasingly common to offload much of the work to coprocessors better adapted for particular classes of algorithms. Even so, it still works as a rough measure of computational effort.
Cycle
One repetition of a repeating pattern or event.
Cycles per Second
The number of repetitions of a repeating pattern or event with each passing second.
Digital Audio
The encoding of audio signals into or their synthesis in digital form, subsequent processing, and decoding to analog signals to drive speakers.
Double Precision
A floating point number with relatively high precision, usually occupying 64 bits.
Floating Point Number
A means of expressing very large, very small, and fractional values.
Frequency
The rate of repetition of a repeating pattern or event; for sound usually expressed in cycles per second (Hertz or Hz).
Fundamental
The lowest member of a harmonic series, every other member of the series being an integer multiple of the fundamental.
Harmonic
A member of a harmonic series, an integer multiple of the fundamental.
Harmonic Number
An integer representing both the factor by which the frequency of the fundamental of a harmonic series is multiplied to produce the frequency of a particular harmonic and the position of that harmonic within the series, where the fundamental itself is the first harmonic.
Harmonic Series
A sequence of integer multiples of a fundamental, of a fundamental frequency in the context of sound.
Harmonic Structure
Two or more harmonic series the fundamentals of which are related by integer ratios, having members with the same frequency at different harmonic numbers (although these may occur at harmonic numbers too high for inclusion in a given implementation).
Hertz (Hz)
Cycles per second.
Highest Common Fundamental (HCF)
The highest frequency which can serve as the fundamental of a harmonic series including every member of every harmonic series constituting a harmonic structure.
Index (plural: Indices)
A means of specifying a particular member of an array.
Integer
A whole number: ..., -3, -2, -1, 0, 1, 2, 3, ...
Integer Ratio
A ratio in which both the numerator and denominator are positive integers. In the context of ratio-based music, ratios composed of small integers are strongly preferred.
Intensity
An abstract representation of volume, which may or may not scale linearly.
Inverse (multiplicative)
The result of reversing the numerator and denominator of a ratio.
Modulo Division
Extraction of the remainder from a division, as opposed to its truncation or expression as a fractional result.
Note
An instance of a tone, generated either programmatically or in response to a user event.
Phase
The state of completion of the current cycle of a repeating pattern or event.
Phase Advancement
The amount by which the phase changes between one point in time (one sample) and the next.
Pi (𝜋)
The ratio between the circumference and the diameter of a circle.
Pitch
Used interchangeably with frequency, but occasionally with the suggestion of subjectivity.
Radian (rad)
The angle traversed by wrapping the radius of a circle around its circumference; commonly used as the unit for an argument in functions that calculate trigonometric values.
Ratio (fraction)
A proportionality between two quantities, calculated by dividing one (the numerator or dividend) by the other (the denominator or divisor), using a variation on division that preserves any remainder as a fractional component of the result, for example a quotient of type Double.
Real-time
Any computational context where both the initiation and completion of a sequence of operations are time-constrained to the extent that efficiency becomes a high priority.
Sample
A single value, representing a single instant, in a sequence of values composing a digital audio signal.
Sample Rate
The number of samples per second composing a digital audio signal.
Secondary Harmonics
The harmonics of a member of a harmonic structure.
Sine
A repeating trigonometric function.
Sine Wave
A graph of the sine function, and, by analogy, any phenomenon having a similar pattern, like sound.
Sound
The sensory experience of a sound wave.
Sound Wave
Propagating variations in air pressure, or a graph of those variations.
Table
An ordered list of values of the same type, frequently implemented as an array.
Tone
Used interchangeably with frequency, but occasionally with the implication of a voice being applied to that frequency.
Truncation
Discarding the fractional portion of a floating point value, as when performing conversion to an integer. Also discarding the remainder in integer division.
Unit Circle
A circle with a radius of 1.0, frequently centered on the origin of a two-dimensional coordinate system (x = 0.0 and y = 0.0); the foundational concept for much/most of trigonometry.
Unsigned Integer
An integer with no sign bit, representing a value that is greater than or equal to zero.
Voice
Any attributes in the synthesis of a note other than its basic frequency and the overall volume, for example the ADSR Envelope or emphasis on different harmonics as the note progresses.
Zero-based Indexing
The first element of an array has index 0.

Feel free to comment with suggestions, terms to include and/or definitions, or if you disagree with a definition I've supplied. If I use a definition that you've supplied, I'll provide attribution by linking to the comment, unless you specify that I should not do so.

Sunday, January 20, 2019

From Harmonic Structure to HCF to Sample Value, Part 2: Multiples of Sine Phase Advancement per Time

Beginning in Part 5 of the previous series, I've already gone into some detail regarding what I've termed the Highest Common Fundamental (HCF). I may revisit this, but that existing explanation seems adequate for the present purpose.

The main reason for caring about the HCF, perhaps the only reason, is that it can be used to generate any tone in the harmonic structure associated with it. To achieve this, some conceptual agility is required.

The first step is to determine the position of the HCF relative to some reference which is generally stable with regard to the harmonic structure (the Anchor), expressed as an integer ratio, and to use that ratio to determine its frequency. Any change to the structure will necessitate recalculation of this ratio and the resulting frequency.
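One way to sketch this step in Python, assuming the fundamentals of the structure are already expressed as integer ratios relative to the Anchor, uses the standard identity for the greatest common divisor of fractions in lowest terms: the GCD of the numerators over the LCM of the denominators. The function names and the example ratios are illustrative assumptions, and this is not necessarily the procedure laid out in the earlier series.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def hcf_ratio(fundamentals):
    """Largest ratio (relative to the Anchor) of which every fundamental is an integer multiple."""
    numerator = reduce(gcd, (f.numerator for f in fundamentals))
    denominator = reduce(lcm, (f.denominator for f in fundamentals))
    return Fraction(numerator, denominator)

def harmonic_numbers(fundamentals):
    """Harmonic number of each fundamental in the series defined by the HCF."""
    hcf = hcf_ratio(fundamentals)
    return [int(f / hcf) for f in fundamentals]

# Hypothetical structure: fundamentals at 1/1, 5/4 and 3/2 of the Anchor.
example = [Fraction(1, 1), Fraction(5, 4), Fraction(3, 2)]
```

For this example the HCF sits at 1/4 of the Anchor, and the three fundamentals are its 4th, 5th, and 6th harmonics. With the Anchor at 440 Hz, the HCF would be 110 Hz.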

Next, that frequency is recast as a rate of sine phase advancement. The units for this are the same as for frequency, and, as mentioned in the previous installment, there are various ways of expressing this:

  • cycles per second (Hz)
  • cycles per sample
  • radians per second
  • radians per sample
  • sine table indices per second
  • sine table indices per sample

The default choice for specifying the frequency of the HCF is cycles per second (Hz), but those may not be the most appropriate units for specifying the HCF's rate of sine phase advancement. Let's take a closer look at how we'll be using that quantity.

When sound generation starts, we'll be setting the phase of the HCF in motion. For each sample, it will be advanced by an amount determined by the frequency. If that amount is expressed 'per sample' rather than 'per second' the advancement can be a simple addition, with a check for reaching or exceeding (>=) 1.0 cycles, 2𝜋 radians, or the number of elements in the sine table (and, if that check returns true, subtracting 1.0 cycles, 2𝜋 radians, or the number of elements in the sine table).
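A minimal sketch of that per-sample advancement, here using cycles as the unit. The names and the 44100 sample rate are assumptions for illustration, and real-time code would more likely be written in C or C++; Python is used only to show the arithmetic.

```python
SAMPLE_RATE = 44100.0  # assumed: CD-quality samples per second

def cycles_per_sample(frequency_hz, sample_rate=SAMPLE_RATE):
    """Convert a frequency in Hz to a per-sample phase advancement in cycles."""
    return frequency_hz / sample_rate

def advance_phase(phase, advancement):
    """One sample's worth of advancement: an addition plus a wrap check."""
    phase += advancement
    if phase >= 1.0:
        phase -= 1.0
    return phase
```

A frequency of 11025 Hz, for instance, advances by 0.25 cycles per sample, so the phase returns to 0.0 after every fourth sample.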

Since we'll be using the phase on a 'per sample' basis, let's remove the 'per second' options from the list, leaving us with:

  • cycles per sample
  • radians per sample
  • sine table indices per sample

To produce the contribution of a particular harmonic to a single sample, we'll multiply the phase (cycles, radians, or sine table indices) of the HCF for that sample by the harmonic number (in terms of the HCF) of the harmonic we want to generate, extract from that product just the portion by which it exceeds the largest whole multiple of 1.0 cycles, 2𝜋 radians, or the number of elements in the sine table that it contains (modulo division, or the equivalent), and translate that into an index into the sine table to retrieve a sine value.

For phase expressed in cycles, instead of using modulo division, from the product of the first step above we can simply extract the fractional portion (x - trunc(x)), multiply that by the number of elements in the sine table, and truncate that result to produce a usable index.

For phase expressed in sine table indices, modulo division by the number of elements in the sine table is necessary, but once that's done a single truncation is all that's required to produce an index for table lookup.

Phase expressed in radians has neither of these advantages. It requires both the modulo division and multiplication by a conversion factor, followed by truncation, so let's eliminate it, leaving us with just two choices — cycles or sine table indices.
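To make the comparison concrete, here are the three index calculations side by side as a Python sketch. The table size of 256 and the function names are assumptions for illustration. Each function takes the HCF phase in its own unit plus a harmonic number, and returns a table index.

```python
import math

TABLE_SIZE = 256  # assumed table size

def index_from_cycles(phase_cycles, harmonic):
    """Cycles: fractional part, one multiplication, one more truncation."""
    x = phase_cycles * harmonic
    return int((x - math.trunc(x)) * TABLE_SIZE)

def index_from_table_indices(phase_indices, harmonic):
    """Table indices: modulo division, then a single truncation."""
    return int((phase_indices * harmonic) % TABLE_SIZE)

def index_from_radians(phase_radians, harmonic):
    """Radians: modulo division, a conversion factor, and a truncation."""
    wrapped = (phase_radians * harmonic) % (2.0 * math.pi)
    return int(wrapped * (TABLE_SIZE / (2.0 * math.pi)))
```

All three arrive at the same index for the same underlying phase. An HCF phase of 0.1 cycles (25.6 table indices, or 0.2𝜋 radians) and harmonic number 7 gives index 179 in each case; only the work needed to get there differs.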

It comes down to which is more expensive (in terms of CPU cycles), modulo division or an additional truncation, a subtraction, and a multiplication. That seems like a pretty easy call: modulo division is probably several times more expensive than the combination of three fast operations. This might seem trivial, but if you want to be able to generate multiple simultaneous tones, each composed of multiple secondary harmonics, 44100 times per second, wringing out those extra CPU cycles becomes important.

So, the winner is phase expressed in cycles and phase advancement in cycles per sample.

Now that we have our units nailed down, let's make another pass through the context and the process of arriving at sample values. The Anchor is like a handle, a convenient point of reference which is nominally stable with regard to the Harmonic Structure, at least between changes to that structure. The Highest Common Fundamental (HCF) is a downward projection of the structure; it cannot be higher than the lowest fundamental of a harmonic series included in the structure, and would typically be even lower, very possibly subsonic. While its position is dictated by the structure, the HCF is defined in terms of the Anchor, by means of an integer ratio, which is used to determine its frequency, in cycles per second (Hz). That frequency is then re-expressed in terms of cycles per sample (simply divide by the sample rate in samples/second), which are also appropriate units for per sample phase advancement.

Everything up to this point is only done once, unless the harmonic structure itself is edited, in which case it is done again. What we now have is an HCF defining a harmonic series which includes every member of the harmonic structure, including all of their secondary harmonics, and a rate of phase advancement for the HCF. We could, at this point, calculate rates of phase advancement for each member of the structure and launch separate phase tracking for each in response to user events, but this would result in entirely random interference patterns. So, instead, we will launch phase tracking for the HCF alone, and calculate phase alignment on a per sample basis for each member of the structure which is currently participating in sound generation.

Because we have chosen to express phase (advancement) in cycles (per sample), this calculation is as simple as multiplying the phase of the HCF for the current sample by the harmonic number of the member (in the harmonic series defined by the HCF), and keeping only the fractional portion of the resulting value, the part after the decimal point (x - trunc(x)). That fractional portion of the product is multiplied by the number of elements in the sine table, and the result of that converted to an integer using a truncating initializer.

This index is then used to retrieve a sine value from the table, which is then multiplied by a volume factor (calculated separately) for that member and the current sample, and these values for each of the currently participating members are added together to produce the overall sample value.
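Putting the last two paragraphs together, the calculation of a single sample value might look something like the following Python sketch. The table size, the names, and the representation of participating members as (harmonic number, volume factor) pairs are all assumptions made for illustration, and production code would belong in a real-time callback rather than in Python.

```python
import math

TABLE_SIZE = 4096  # assumed table size
SINE_TABLE = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def sample_value(hcf_phase, members):
    """Sum the contributions of the participating members for one sample.

    hcf_phase: phase of the HCF in cycles, in the range [0.0, 1.0).
    members: (harmonic_number, volume_factor) pairs, with harmonic numbers
             counted in the series defined by the HCF.
    """
    total = 0.0
    for harmonic_number, volume in members:
        x = hcf_phase * harmonic_number
        index = int((x - math.trunc(x)) * TABLE_SIZE)  # fractional part -> table index
        total += SINE_TABLE[index] * volume
    return total
```

For example, at an HCF phase of 0.0625 cycles the 4th harmonic is a quarter of the way through its own cycle, where the sine is 1.0, so with a volume factor of 0.5 it contributes 0.5 to the sample.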

Note that the list of currently participating members and the volume factor for each must be maintained on a per-sample (or at the very least per-callback) basis. These must either be copied into the real-time context, or modifications based on user events must be held back until the real-time callback is done with them, by means of some simple locking mechanism. Even this is probably best done in C or C++, with your Swift code passing in user events by calling C/C++ functions.

For best effect, you'll probably want to either insert an equalizer downstream or incorporate the function of an equalizer into the calculation of volume factors. The latter approach seems preferable, since it removes some load from the real-time pipeline, but equalizers and UI to match are readily available plug-ins, so that might be one optimization too many, at least initially.

That's it in a nutshell, although I may have more to say about specifics as I get further into it myself.

Thursday, January 17, 2019

From Harmonic Structure to HCF to Sample Value, Part 1: Laying the Foundation

I've glossed over this subject previously, but here I'll go into it with greater care, and in greater detail. The context for all that follows is computer software which generates sound (a sequence of sample values) on-the-fly, constrained both by the need to make them available within the time allowed and by the need to minimize the latency experienced by the user of the software, the delay in response to user actions.

So, what is pitch?

My take is that it's very nearly interchangeable with frequency, if perhaps slightly more subjective.

So what is frequency?

Frequency is the rate at which some specific type of event occurs, the number of events over a given time span, or events per unit time. The beating of your heart, for example, would ordinarily be measured in beats per minute. The speed of your car would (in the U.S.) be measured in miles per hour, and the speed of its engine would be measured in revolutions (of the crankshaft) per minute.

Pitch, the frequency of a sound, is typically measured in cycles per second (Hertz, or Hz in abbreviated form). A cycle is a single unit in the repeating pattern of a sine wave, a construct from trigonometry that provides a decent approximation for the manner in which sound is transmitted by successive waves of higher and lower pressure passing through air.

Trigonometry is based around the idea of a unit circle, a circle with a radius of exactly 1. We commonly think of circles as being divided into 360°, but in trigonometry it is more common to express the magnitude of an angle, arc (a partial circle), or rotation in terms of radians. One radian is the angle spanned by an arc the same length as the radius, wrapped around the perimeter of the circle. A unit circle has a circumference of 2𝜋 (2 pi), so one full rotation is 2𝜋, or about 6.283185307179586, radians. (Radians are also commonly used in standard functions that compute sine values.)

One cycle of a sine wave is analogous to one complete rotation (2𝜋 radians) of a unit circle. In fact, if you were to roll a unit circle along a horizontal line and track the vertical displacement of some point on that circle by tracing the same vertical position along a vertical line through and moving with the center of the circle, the result would be a sine wave. If you can deal with the 3-dimensional projection, the following animated GIF is an even better visualization. (source: Wikipedia)

What is actually being traced in this animation is a cosine, but the shape is the same as a sine curve.

What all of the foregoing is leading up to is the point that, even in the context of sound, there are other valid measurements of frequency besides cycles per second. We might just as well express frequency in terms of radians per second, if there were any advantage to doing so, and there are even more options.

Suppose we divide one cycle of a sine wave into smaller segments, precomputing sine values at evenly spaced intervals and using these to populate an array for later access by index, to avoid the performance hit of computing sine values on-the-fly. The number of segments used, which will also be the size of the array, is almost completely arbitrary, at least above a lower threshold at which the division into segments is fine-grained enough to produce acceptable fidelity. One might use 360 segments, or 44,100 segments, the same as the number of samples per second in CD-quality audio. Two other options, 256 and 65536 (2^8 and 2^16 respectively, where '^' is an exponentiation operator), are suggested by potential performance advantages around particular integer math operations. The only downsides to using more, smaller segments are the amount of fast memory consumed by the array, since it must remain in memory whenever sound generation is running, and the time and computational effort (battery power) needed to create the array, if you choose to remove it from memory whenever sound generation ceases.

From here on I will begin using 'array' and 'table' more or less interchangeably. Conceptually, the collection of sine values constitutes a table, but it is necessarily implemented as an array. In this context, both refer to a sequence of index-accessible sine values beginning with sin(0.0) and progressing at even intervals up to but not including sin(2𝜋).
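Building such a table takes only a line or two. This is a minimal Python sketch, with the function name and the choice of 256 entries being assumptions for illustration; a production table would more likely be a plain array of doubles in C or Swift.

```python
import math

def make_sine_table(size):
    """Sine values at `size` even intervals, from sin(0.0) up to but not including sin(2*pi)."""
    return [math.sin(2.0 * math.pi * i / size) for i in range(size)]

table = make_sine_table(256)
```

Index 0 holds sin(0.0), index size/4 holds the peak, and the last index holds the value one step short of a full cycle. At 8 bytes per double precision value, a table of 65536 entries would occupy 512 KiB.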

As with various ways of delineating events to be timed, time need not be measured in seconds. Another altogether valid unit of time is the interval between successive audio samples, 1/44100 second in the case of CD-quality audio.

Taken together, the above offers us six different ways of expressing frequency, only one of which is cycles per second. Remember that a cycle is one complete sine wave, that a radian is equal to 1/(2𝜋) cycles (about 0.159154943091895), and that the magnitude of sine table indices as a measure of a fraction of a cycle depends upon the size of the array holding the sine values.

  • cycles per second (Hz)
  • cycles per sample
  • radians per second
  • radians per sample
  • sine table indices per second
  • sine table indices per sample
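The relationships among these six can be written out as a short Python sketch. The function name and the example figures (441 Hz, 44100 samples per second, a 65536-entry table) are assumptions chosen only to make the arithmetic easy to follow.

```python
import math

def frequency_in_all_units(hz, sample_rate, table_size):
    """Express one frequency in each of the six units listed above."""
    cycles_per_sample = hz / sample_rate
    return {
        "cycles per second": hz,
        "cycles per sample": cycles_per_sample,
        "radians per second": hz * 2.0 * math.pi,
        "radians per sample": cycles_per_sample * 2.0 * math.pi,
        "sine table indices per second": hz * table_size,
        "sine table indices per sample": cycles_per_sample * table_size,
    }

units = frequency_in_all_units(441.0, 44100.0, 65536)
```

Here 441 Hz works out to 0.01 cycles per sample, or about 655.36 table indices per sample. Converting from 'per second' to 'per sample' is always a division by the sample rate, and converting from cycles to radians or table indices is always a multiplication by 2𝜋 or by the table size.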

Which of these is chosen has implications for the complexity and performance of the algorithms involved, which will be the subject of the next installment.