Introduction
The Acoustic Area of Interest (AOI) system is an innovative approach to understanding the human voice by examining specific regions within the vocal tract that significantly influence the resulting sound. This article delves into the acoustics of the AOI system, focusing on three key principles: perturbation theory, impedance matching, and source-filter nonlinearity.


Perturbation Theory
Perturbation theory is a mathematical method that aids in understanding how small alterations in specific areas of the vocal tract impact the formants, the resonant frequencies that shape the voice. The AOI system divides the vocal tract into five main areas of interest (AOIs), enabling the analysis of how narrowing or widening each AOI affects the formants.
For instance, AOI 4 is a pressure node, while AOI 5 is a pressure antinode. Constricting a pressure node lowers any corresponding formants, while constricting a pressure antinode raises them. The inverse is true for displacement nodes and antinodes. To simplify, one can refer to the area of interest and note whether narrowing or widening that area raises or lowers the corresponding formants. The tables below compare F2 and F1 across the areas of interest:
Effect of each AOI on F2:

AOI | Expand | Constrict | Displacement Node/Antinode |
---|---|---|---|
1 | Lowers F2* | Raises F2* | N/A |
2 | Raises F2 | Lowers F2 | Displacement Antinode |
3 | Lowers F2 | Raises F2 | Displacement Node |
4 | Raises F2 | Lowers F2 | Displacement Antinode |
5 | Lowers F2 | Raises F2 | Displacement Node |

Effect of each AOI on F1:

AOI | Expand | Constrict | Displacement Node/Antinode |
---|---|---|---|
1 | Lowers F1* | Raises F1* | N/A |
2 | Raises F1 | Lowers F1 | Displacement Antinode |
3 | Lowers F1 | Raises F1 | Displacement Node |
4 | N/A | N/A | Displacement Antinode |
5 | N/A | N/A | Displacement Node |

\* For AOI 1, length replaces width: "Expand" means lengthening and "Constrict" means shortening.
Note that AOI 1 is included for practical purposes, but it does not affect the formants through the mechanism described by perturbation theory. Instead, changing the length of AOI 1 directly changes the fundamental wavelength that the vocal tract resonates, in the same way that a trombone changes length to alter the fundamental it resonates.
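The node and antinode rule described in this section can be checked numerically. The following Python sketch assumes an idealized uniform tube, closed at the glottis and open at the lips, and uses the standard first-order perturbation result that a small constriction raises a resonance where pressure energy dominates and lowers it where velocity energy dominates. The function names and the sample positions are illustrative assumptions; the positions are fractions of the tube length, not the AOI locations themselves.

```python
import math


def constriction_effect(formant: int, position: float) -> float:
    """First-order effect of a small constriction on one formant.

    position runs from 0.0 (closed end, glottis) to 1.0 (open end, lips).
    Returns a value in [-1, 1]: positive means the constriction raises the
    formant (pressure antinode), negative means it lowers it (pressure node).
    Widening at the same place has the opposite sign.
    """
    # Phase of resonance n in a closed-open tube: (2n - 1) * pi / 2 at the lips.
    phase = (2 * formant - 1) * math.pi / 2 * position
    pressure_energy = math.cos(phase) ** 2  # largest at pressure antinodes
    velocity_energy = math.sin(phase) ** 2  # largest at pressure nodes
    return pressure_energy - velocity_energy


def describe(formant: int, position: float) -> str:
    """Translate the signed effect into a plain-language direction."""
    effect = constriction_effect(formant, position)
    if abs(effect) < 1e-9:
        return "no first-order change"
    return "raises" if effect > 0 else "lowers"


if __name__ == "__main__":
    # Illustrative positions only (fractions of tube length from the glottis).
    for formant in (1, 2):
        for position in (0.0, 1 / 3, 2 / 3, 1.0):
            print(f"F{formant}, constriction at {position:.2f}: "
                  f"{describe(formant, position)}")
```

In this idealized tube, a constriction at the closed end raises both F1 and F2, and a constriction at the open end lowers both. A real vocal tract is not uniform, so the sketch indicates direction only, not magnitude.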
Impedance Matching
Impedance matching is the process of optimizing energy transfer between two media with differing impedance. Consider a small sailboat floating in a bathtub. The most efficient way to move it is to push the solid part of the boat with your finger. Your finger and the solid part of the boat share similar impedance, ensuring effective energy transfer. However, moving the boat using only air or wind is difficult because of the impedance mismatch between the air and the hull. If the boat has a sail, the sail serves as a bridge between the low-impedance air and the high-impedance boat, facilitating efficient energy transfer.
Within the AOI system, impedance matching is essential in two primary areas: the closed end of the resonator, where the source interacts with the air in the vocal tract, and the open end of the resonator, where the vocal tract interacts with the surrounding environment. At the closed end, constricting the area of interest enhances energy transfer, while at the open end, expanding the area of interest does.
A practical example of impedance matching is in-ear headphones, such as in-ear monitors. These devices feature a funnel shape that narrows towards the ear canal, matching the high impedance of the headphone driver to the low impedance of the air inside the ear canal. By contrast, earbuds like those included with iPhones and iPods have a more open design that matches impedance with the surrounding environment, which makes the music audible to others nearby.
The vocal tract can also be considered as matching impedance with the surrounding environment, akin to a megaphone that is narrow at the source and wide at the open end.
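The sail in the bathtub analogy can be put into numbers with the textbook formula for power transmitted across a boundary between two impedances, T = 4·Z1·Z2 / (Z1 + Z2)². The Python sketch below is a simplified illustration: the impedance values are arbitrary, the function names are hypothetical, and the staged calculation ignores multiple reflections and frequency dependence, which a real horn or vocal tract would not.

```python
def power_transmission(z1: float, z2: float) -> float:
    """Fraction of incident power transmitted across a boundary z1 -> z2."""
    return 4 * z1 * z2 / (z1 + z2) ** 2


def staged_transmission(impedances: list[float]) -> float:
    """Multiply single-boundary transmissions along a chain of impedances.

    Simplification: reflections bouncing between boundaries are ignored.
    """
    total = 1.0
    for a, b in zip(impedances, impedances[1:]):
        total *= power_transmission(a, b)
    return total


# Arbitrary units: a low-impedance medium driving a high-impedance one.
direct = staged_transmission([1.0, 100.0])         # abrupt mismatch
bridged = staged_transmission([1.0, 10.0, 100.0])  # one intermediate stage

if __name__ == "__main__":
    print(f"direct:  {direct:.3f}")
    print(f"bridged: {bridged:.3f}")
```

Under these assumptions the intermediate stage passes more power than the abrupt jump, which is the role the sail, the funnel, and the megaphone shape play in the examples above.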
Source-Filter Nonlinearity
A nonlinear system is one in which the output is not directly proportional to the input. In the context of acoustics, constructive interference occurs when waves combine to create a larger wave. Researchers such as Ingo Titze, Sondhi, and Sundberg have investigated nonlinear acoustics, including phenomena like flow phonation.
When the pitch is high enough (around E4), many classical singers use formant tuning not only to enhance the sound but also to achieve a nonlinear interaction with the vocal folds, seemingly reducing airflow while increasing volume and clarity. Unlike the addition of metal, formant tuning couples specific formants with specific harmonics. The two most recognizable formant-harmonic couplings are the F2-H3 coupling common in classical singers and the F1-H2 coupling in Whitney Houston’s famous song “I Will Always Love You.” The F1-H2 coupling characterizes the vocal coordination known as belting, just as the F2-H3 coupling identifies the singer as an operatic tenor.
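A formant-harmonic coupling can be stated as simple arithmetic: harmonic n sits at n times the sung pitch, and a coupling exists when a formant lies close to one of those harmonics. The Python sketch below illustrates this. The 50-cent tolerance and the example frequencies are assumptions chosen for illustration, not values from the AOI system.

```python
import math


def nearest_harmonic(f0_hz: float, formant_hz: float) -> tuple[int, float]:
    """Return (harmonic number, harmonic frequency) closest to the formant."""
    n = max(1, round(formant_hz / f0_hz))
    return n, n * f0_hz


def is_coupled(f0_hz: float, formant_hz: float, harmonic: int,
               tolerance_cents: float = 50.0) -> bool:
    """True if the formant lies within the tolerance of the given harmonic.

    The tolerance is a hypothetical threshold, expressed in cents
    (100 cents = one equal-tempered semitone).
    """
    cents = 1200 * math.log2(formant_hz / (harmonic * f0_hz))
    return abs(cents) <= tolerance_cents


if __name__ == "__main__":
    f0 = 392.0   # sung pitch near G4, illustrative
    f2 = 1180.0  # hypothetical second formant in Hz
    n, freq = nearest_harmonic(f0, f2)
    print(f"Nearest harmonic to F2: H{n} at {freq:.0f} Hz")
    print("F2-H3 coupled:", is_coupled(f0, f2, harmonic=3))
```

The same check applies to an F1-H2 coupling by passing the first formant and `harmonic=2`.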
Unlike other systems that attempt to understand the voice in terms of formant tuning, the AOI system acknowledges that some areas of interest affect more than one formant simultaneously, and that the tuning of any one formant results from all the areas of interest that influence that formant. In the chart above, we saw how the second formant is affected by each AOI. Note that for AOI 1, lengthening is listed as expansion and shortening is listed as constriction, although a change in length doesn’t affect formants through perturbation theory. Instead, it alters the overall length of the resonator, similar to how the slide of a trombone changes the instrument’s overall length.
These changes in length place some of the areas of interest in new locations and also alter their acoustic addresses, which consistently maintain their relative distances from other areas of interest. This is due to the actual wavelengths that fit or don’t fit in the resonator we are using, which in this case is called a closed-tube resonator: closed at one end and open at the other, with a fundamental wavelength four times as long as the physical length of the resonator. That’s why AOI 1 is unique; it’s the only area of interest that can change the resonator’s fundamental resonance and the higher resonances derived from it, which for a tube of this kind fall at odd multiples of the fundamental resonance frequency. These specific wavelengths determine the positions of the pressure and displacement nodes and antinodes.
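The effect of tube length can be worked through with the quarter-wavelength relation. For a uniform tube closed at one end and open at the other, the resonances fall at odd multiples of c / (4L). The Python sketch below assumes a speed of sound of 350 m/s and a uniform tube, both idealizations, and the two lengths are illustrative rather than measured.

```python
SPEED_OF_SOUND = 350.0  # m/s, assumed round value for warm, humid air


def closed_open_resonances(length_m: float, count: int = 3,
                           c: float = SPEED_OF_SOUND) -> list[float]:
    """Resonance frequencies (Hz) of a uniform closed-open tube.

    The fundamental wavelength is four times the tube length, and higher
    resonances are the odd multiples of the fundamental frequency.
    """
    return [(2 * n - 1) * c / (4 * length_m) for n in range(1, count + 1)]


if __name__ == "__main__":
    for length in (0.175, 0.190):  # illustrative tract lengths in metres
        freqs = ", ".join(f"{f:.0f}" for f in closed_open_resonances(length))
        print(f"L = {length:.3f} m -> resonances (Hz): {freqs}")
```

Lengthening the tube lowers every resonance by the same proportion, which is the trombone-slide behavior attributed to AOI 1.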
In summary, the AOI system offers a distinctive way to understand and analyze the human voice by examining specific regions within the vocal tract and their impact on the resulting sound. By incorporating principles from acoustics, such as perturbation theory, impedance matching, and source-filter nonlinearity, we can gain valuable insights into how changes in these areas of interest affect the voice’s formants and overall sound quality.
Practical Example: The Operatic Tenor
In this section, we will examine a practical example of an operatic tenor using the concepts discussed in the AOI system. Much of an operatic tenor’s repertoire falls within the speaking range, which is considered the modal or chest register. For simplicity, let’s consider the tenor voice below F4. In this range, it’s entirely possible to have a formant-harmonic coupling, but for most of these pitches, there is no boost in sound from doing so. This is the range in which the well-known singer’s formant plays the most significant role.
However, let’s focus on tenors who use a metallic sound in this range because of the increased volume potential and enhanced intelligibility that singing with metal can provide. Singing with metal involves epilaryngeal constriction, that is, narrowing in AOI 3. This effect is non-fungible: it can only be achieved with AOI 3, and no other area of interest can substitute for it. AOI 2 has its own non-fungible effect: when we sing with a rather open mouth, or when the open end of the resonator is otherwise expanded, any metal in the sound is more easily perceived by the listener. Conversely, if we maintain the same vocal coordination but close the mouth slightly or round the lips, we may perceive more of that metal in our own voice than the audience does.
We can also observe that constricting AOI 3 and expanding AOI 2 both raise the first and second formants. That means that if we have a specific range in mind for the second formant, our only remaining options are AOIs 4 and 5 or a change in the overall length of the vocal tract. But what happens when the tenor raises the pitch above F4 and the audience expects, perhaps without knowing it, the second formant to be tuned to the third harmonic? If the intention is to maintain the same metallic quality that was heard below F4, that can be achieved in more than one way. If AOI 2 is constricted slightly, F1 and F2 will both be lowered. If this lowering is enough to close the gap between F2 and H3, the tenor will have an F2-H3 coupling and the resulting boost to the sound. But the same tenor could also maintain an open mouth position, which lets the audience perceive the metal in the sound and better matches the impedance of the vocal tract to the room, as long as an equivalent lowering of the second formant is achieved by some other means. A tenor who can manage this gets not only the boost from having metal in the voice but also the boost from the F2-H3 coupling.
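To get a feel for how far F2 has to move to meet H3 at different pitches, the gap can be expressed in cents. The Python sketch below assumes equal temperament with A4 = 440 Hz and uses a hypothetical starting F2 of 1250 Hz; neither figure comes from the AOI system, and a real singer’s F2 will differ by vowel and voice.

```python
import math


def midi_to_hz(midi_note: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency of a MIDI note number (A4 = 69)."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)


def f2_shift_to_h3_cents(f0_hz: float, current_f2_hz: float) -> float:
    """Cents by which F2 must move to land on the third harmonic.

    Negative values mean F2 has to be lowered, positive values raised.
    """
    return 1200 * math.log2(3 * f0_hz / current_f2_hz)


if __name__ == "__main__":
    current_f2 = 1250.0  # hypothetical F2 in Hz, not a measured value
    for name, midi in (("F4", 65), ("G4", 67), ("A4", 69), ("B4", 71)):
        f0 = midi_to_hz(midi)
        shift = f2_shift_to_h3_cents(f0, current_f2)
        print(f"{name}: H3 = {3 * f0:.0f} Hz, "
              f"F2 shift needed = {shift:+.0f} cents")
```

The size and direction of the required shift change with every pitch, so the combination of adjustments that produces the coupling on one note will not necessarily produce it on the next.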
So, according to this system, the question of whether or not to retract the tongue, often debated in classical singing circles, deserves context. For someone who tends to sing high notes with a very open mouth and emphasizes the metal in the sound, it might make a lot of sense to retract the tongue, creating constriction in AOI 4 even as expansion is created in AOI 5. The result may be essentially the same as that of another singer who achieves an F2-H3 coupling with a slightly more rounded mouth and consequently needs less tongue retraction, and perhaps some tongue elevation. By comparison, yet another singer with a very similar tongue position and pharyngeal shape might still have an F2 that does not match H3.
In conclusion, the AOI system offers a fresh and nuanced perspective on the intricacies of the human voice, accounting for the complex interplay of factors within the vocal tract. By considering perturbation theory, impedance matching, and source-filter nonlinearity, this system enables singers and voice professionals to better understand the mechanics behind the voice and the resulting sound quality. By applying these principles, singers can experiment with different vocal tract configurations to optimize their performance, ultimately enhancing their artistry and enriching the listener’s experience.