Beginning in Part 5 of the previous series, I've already gone into some detail regarding what I've termed the Highest Common Fundamental (HCF). I may revisit this, but that existing explanation seems adequate for the present purpose.
The main reason for caring about the HCF, perhaps the only reason, is that it can be used to generate any tone in the harmonic structure associated with it. To achieve this, some conceptual agility is required.
The first step is to determine the position of the HCF relative to some reference which is generally stable with regard to the harmonic structure (the Anchor), expressed as an integer ratio, and to use that ratio to determine its frequency. Any change to the structure will necessitate recalculation of this ratio and the resulting frequency.
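As a rough illustration of that first step, here is a minimal sketch in C. The function name `hcf_frequency_hz`, its parameters, and the example figures (an Anchor at 440 Hz with the HCF at 1/8 of it) are assumptions made up for this example, not values taken from any particular harmonic structure.

```c
/* Hypothetical helper: frequency of the HCF, given the Anchor's frequency
   and the integer ratio (numerator / denominator) that places the HCF
   relative to the Anchor. Returns 0.0 for a zero denominator. */
double hcf_frequency_hz(double anchor_hz, unsigned numerator, unsigned denominator)
{
    if (denominator == 0) {
        return 0.0;
    }
    return anchor_hz * (double)numerator / (double)denominator;
}
```

With those assumed figures, `hcf_frequency_hz(440.0, 1, 8)` places the HCF at 55 Hz. Whenever the structure changes, the ratio changes and the function is simply called again.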
Next, that frequency is recast as a rate of sine phase advancement. The units for this are the same as for frequency, and, as mentioned in the previous installment, there are various ways of expressing this:
- cycles per second (Hz)
- cycles per sample
- radians per second
- radians per sample
- sine table indices per second
- sine table indices per sample
The default choice for specifying the frequency of the HCF is cycles per second (Hz), but those may not be the most appropriate units for specifying the HCF's rate of sine phase advancement. Let's take a closer look at how we'll be using that quantity.
When sound generation starts, we'll be setting the phase of the HCF in motion. For each sample, it will be advanced by an amount determined by the frequency. If that amount is expressed 'per sample' rather than 'per second', the advancement can be a simple addition followed by a wrap-around check: if the phase has reached or exceeded (>=) one full period (1.0 cycles, 2𝜋 radians, or the number of elements in the sine table), subtract one full period.
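The add-and-wrap step might look something like the following in C, shown here for phase measured in cycles. The name `advance_phase` is hypothetical, and the sketch assumes the per-sample increment is non-negative and less than one full cycle, so a single subtraction is always enough.

```c
/* Advance a phase (in cycles) by a per-sample increment (in cycles per
   sample) and wrap it back into the range [0, 1).
   Assumes 0 <= increment < 1, so one subtraction suffices. */
double advance_phase(double phase, double increment)
{
    phase += increment;
    if (phase >= 1.0) {
        phase -= 1.0;
    }
    return phase;
}
```

Wrapping by a whole period is harmless for what follows: multiplying the wrapped phase by an integer harmonic number differs from the unwrapped product only by a whole number of periods, which is discarded anyway when the sine table index is computed.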
Since we'll be using the phase on a 'per sample' basis, let's remove the 'per second' options from the list, leaving us with:
- cycles per sample
- radians per sample
- sine table indices per sample
To produce the contribution of a particular harmonic to a single sample, we'll multiply the phase of the HCF for that sample (in cycles, radians, or sine table indices) by the number of the harmonic we want to generate (counted in the harmonic series of the HCF). From that product we keep only the remainder after removing every whole period, that is, the amount by which it exceeds the largest multiple of 1.0 cycles, 2𝜋 radians, or the number of elements in the sine table that it contains (modulo division, or the equivalent). Finally, we translate that remainder into an index into the sine table to retrieve a sine value.
For phase expressed in cycles, instead of using modulo division, from the product of the first step above we can simply extract the fractional portion (x - trunc(x)), multiply that by the number of elements in the sine table, and truncate that result to produce a usable index.
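For phase in cycles, those three operations could be sketched in C as below. The function name `sine_index_from_cycles` and its parameters are assumptions for illustration. The cast to an integer type plays the role of trunc(x) here, and the final clamp is a defensive addition of mine, in case floating-point rounding ever pushes the scaled value up to the table length.

```c
#include <stddef.h>

/* Hypothetical helper: sine table index for one harmonic of the HCF.
   hcf_phase is the HCF's phase in cycles, in [0, 1); harmonic is the
   harmonic number counted in the HCF's series. */
size_t sine_index_from_cycles(double hcf_phase, unsigned harmonic, size_t table_size)
{
    double x = hcf_phase * (double)harmonic;       /* phase of the harmonic, in cycles */
    double frac = x - (double)(long long)x;        /* x - trunc(x): fractional cycle   */
    size_t index = (size_t)(frac * (double)table_size); /* scale, then truncate        */
    if (index >= table_size) {                     /* defensive clamp (my assumption)  */
        index = table_size - 1;
    }
    return index;
}
```

For example, with a 4096-element table, an HCF phase of 0.375 cycles and harmonic number 4 give a product of 1.5 cycles, a fractional part of 0.5, and an index of 2048, halfway through the table.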
For phase expressed in sine table indices, modulo division by the number of elements in the sine table is necessary, but once that's done a single truncation is all that's required to produce an index for table lookup.
Phase expressed in radians has neither of these advantages. It requires both the modulo division and multiplication by a conversion factor, followed by truncation, so let's eliminate it, leaving us with just two choices: cycles or sine table indices.
It comes down to which is more expensive (in terms of CPU cycles): modulo division, or an additional truncation, a subtraction, and a multiplication. That seems like a pretty easy call; modulo division is probably several times more expensive than the combination of three fast operations. This might seem trivial, but if you want to be able to generate multiple simultaneous tones, each composed of multiple secondary harmonics, 44100 times per second, wringing out those extra CPU cycles becomes important.
So, the winner is phase expressed in cycles and phase advancement in cycles per sample.
Now that we have our units nailed down, let's make another pass through the context and the process of arriving at sample values. The Anchor is like a handle, a convenient point of reference which is nominally stable with regard to the Harmonic Structure, at least between changes to that structure. The Highest Common Fundamental (HCF) is a downward projection of the structure; it cannot be higher than the lowest fundamental of a harmonic series included in the structure, and would typically be even lower, very possibly subsonic. While its position is dictated by the structure, the HCF is defined in terms of the Anchor, by means of an integer ratio, which is used to determine its frequency, in cycles per second (Hz). That frequency is then re-expressed in terms of cycles per sample (simply divide by the sample rate in samples/second), which are also appropriate units for per sample phase advancement.
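Pulling that one-time setup together, a sketch in C might look like this. The struct `HcfOscillator`, the function `hcf_setup`, and all field and parameter names are hypothetical, and the sketch assumes a nonzero denominator and a positive sample rate.

```c
/* Hypothetical container for everything that is computed once per
   version of the harmonic structure. */
typedef struct {
    double frequency_hz;  /* HCF frequency in cycles per second        */
    double increment;     /* HCF phase advancement, cycles per sample  */
    double phase;         /* current HCF phase in cycles, in [0, 1)    */
} HcfOscillator;

/* Derive the HCF from the Anchor via an integer ratio, then re-express
   its frequency as cycles per sample by dividing by the sample rate.
   Assumes denominator != 0 and sample_rate_hz > 0. */
HcfOscillator hcf_setup(double anchor_hz, unsigned numerator,
                        unsigned denominator, double sample_rate_hz)
{
    HcfOscillator osc;
    osc.frequency_hz = anchor_hz * (double)numerator / (double)denominator;
    osc.increment = osc.frequency_hz / sample_rate_hz;
    osc.phase = 0.0;
    return osc;
}
```

Calling `hcf_setup` again after an edit to the structure is all the recalculation requires in this sketch.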
Everything up to this point is only done once, unless the harmonic structure itself is edited, in which case it is done again. What we now have is an HCF defining a harmonic series which includes every member of the harmonic structure, including all of their secondary harmonics, and a rate of phase advancement for the HCF. We could, at this point, calculate rates of phase advancement for each member of the structure and launch separate phase tracking for each in response to user events, but the phase relationships among them would then depend on when each happened to be launched, resulting in arbitrary interference patterns. So, instead, we will launch phase tracking for the HCF alone, and calculate phase alignment on a per sample basis for each member of the structure which is currently participating in sound generation.
Because we have chosen to express phase (advancement) in cycles (per sample), this calculation is as simple as multiplying the phase of the HCF for the current sample by the harmonic number of the member (in the harmonic series defined by the HCF), and keeping only the fractional portion of the resulting value, the part after the decimal point (x - trunc(x)). That fractional portion is then multiplied by the number of elements in the sine table, and the result is converted to an integer using a truncating initializer.
This index is then used to retrieve a sine value from the table, which is then multiplied by a volume factor (calculated separately) for that member and the current sample, and these values for each of the currently participating members are added together to produce the overall sample value.
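The per-sample calculation described above could be sketched in C roughly as follows. The `Member` struct, the function `render_sample`, and the idea of passing the sine table in as a parameter are all assumptions made for this example; how volume factors are actually computed is outside its scope.

```c
#include <stddef.h>

/* Hypothetical record for one currently participating member. */
typedef struct {
    unsigned harmonic;  /* harmonic number in the series defined by the HCF */
    float    volume;    /* volume factor for this member at this sample     */
} Member;

/* Sum the contributions of all participating members for one sample.
   hcf_phase is the HCF's phase in cycles, in [0, 1). */
float render_sample(double hcf_phase, const Member *members, size_t count,
                    const float *sine_table, size_t table_size)
{
    float sum = 0.0f;
    for (size_t i = 0; i < count; ++i) {
        double x = hcf_phase * (double)members[i].harmonic;
        double frac = x - (double)(long long)x;            /* x - trunc(x)       */
        size_t index = (size_t)(frac * (double)table_size); /* truncate to index */
        if (index >= table_size) {                         /* defensive clamp    */
            index = table_size - 1;
        }
        sum += sine_table[index] * members[i].volume;
    }
    return sum;
}
```

In a real-time callback, a function like this would be called once per sample, with the HCF's phase advanced between calls.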
Note that the list of currently participating members and the volume factor for each must be maintained on a per-sample (or at the very least per-callback) basis. Either they are copied into the real-time context, or modifications based on user events are held back until the real-time callback is done with them, by means of some simple locking mechanism. Even this is probably best done in C or C++, with your Swift code passing in user events by calling C/C++ functions.
For best effect, you'll probably want to either insert an equalizer downstream or incorporate the function of an equalizer into the calculation of volume factors. The latter approach seems preferable, since it removes some load from the real-time pipeline, but equalizers with matching UI are readily available as plug-ins, so that might be one optimization too many, at least initially.
That's it in a nutshell, although I may have more to say about specifics as I get further into it myself.