In Chapters 10 through 12, we discussed measures of the auditory system's sensitivity to the frequency, intensity, and temporal properties of acoustic events. This chapter deals with listeners' subjective descriptions and evaluations of frequency and intensity. The discussion focuses on loudness and pitch, the subjective attributes associated primarily with intensity and frequency, respectively.
LOUDNESS
When the level of a sound is varied, its loudness almost always changes as well. To study loudness, psychophysical procedures such as scaling and matching (see Appendix D) are used. Figure 13.1 displays the results of a loudness-matching experiment. The data are plotted as the level in dB SPL that a comparison tone required for the listener to judge it equal in loudness to a standard tone, as a function of the frequency of the comparison tone. The standard tone was a 1000-Hz sinusoid presented at different levels, expressed in dB SPL. The various curves in Figure 13.1 represent different levels of the standard. For instance, for the curve labeled 40 phons the standard was a 40-dB SPL, 1000-Hz tone, and the listener varied the level of the comparison tones presented at other frequencies until, for each comparison tone, the comparison and standard tones were perceived as equally loud. Each contour (curve) is called an equal-loudness contour because tones of any frequency presented at the levels described by the curve all appear equally loud to the listener. For instance, tones presented at 100 Hz, 52 dB SPL; at 1000 Hz, 40 dB SPL; and at 4000 Hz, 37 dB SPL are all judged to be equal in loudness, although they differ in physical level (these tones lie on the 40-phon equal-loudness contour).
Two terms are used to describe or measure the loudness of a stimulus. Loudness level is measured in phons; the loudness level of a sound in phons is the level in dB SPL of an equally loud 1000-Hz tone (derived from equal-loudness contours). All tones judged equal in loudness to a 40-dB SPL, 1000-Hz tone have a loudness level of 40 phons. Tones presented at levels such that they are equal in loudness to a 70-dB SPL, 1000-Hz tone all have a loudness level of 70 phons, and so on. The equal-loudness contours in Figure 13.1 form the phon scale.
FIGURE 13.1 Equal-loudness contours showing the level of a comparison tone required to match the perceived loudness of a 1000-Hz standard tone presented at different levels (20, 40, 60, 80, and 100 dB SPL). Each curve is an equal-loudness contour. Based on the ISO 226 standard (2003).

The sone scale is another way to measure loudness; 1 sone is the loudness of a 1000-Hz tone presented at 40 dB SPL. One sone therefore corresponds to a loudness level of 40 phons. A stimulus that is $n$ sones loud is judged to be $n$ times as loud as 1 sone, that is, $n$ times as loud as the 1000-Hz, 40-dB SPL standard. Figure 13.2a is a plot of the loudness in sones of a 1000-Hz tone versus its level in dB SPL (at 1000 Hz, the level in dB SPL is the same as the loudness level in phons, so the horizontal axis could also be labeled in phons). The sone scale in this figure, like the level scale, is logarithmic. The data are fit approximately by a straight line above about 30 phons. The slope of this line shows that loudness doubles for approximately every 10-dB increase in level. A listener, therefore, must increase a tone's level by about 10 dB before judging its loudness to have doubled. The sone scale was obtained by a ratio scaling technique in which the listener is asked to adjust the level of a comparison tone so that it appears half as loud or twice as loud as the 1-sone standard (a 1000-Hz, 40-dB SPL tone). This adjusted comparison tone becomes a new standard and a new comparison tone is presented; the procedure is repeated many times to obtain the sone scale.
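The straight-line portion of the sone function can be written as a simple rule of thumb. The sketch below assumes an exact doubling of loudness for every 10-phon increase, anchored at 1 sone = 40 phons. The text describes this only as an approximation that holds above about 30 phons, and the function names are illustrative.

```python
import math


def phons_to_sones(phons: float) -> float:
    """Approximate loudness in sones for a loudness level in phons.

    Assumption: loudness doubles for every 10-phon increase, with
    1 sone defined as 40 phons. The text states that the straight-line
    fit holds only above roughly 30 phons.
    """
    return 2.0 ** ((phons - 40.0) / 10.0)


def sones_to_phons(sones: float) -> float:
    """Inverse of phons_to_sones under the same assumption."""
    return 40.0 + 10.0 * math.log2(sones)


if __name__ == "__main__":
    # Print the approximate loudness for a few loudness levels.
    for level in (40, 50, 60, 80):
        print(level, "phons ->", phons_to_sones(level), "sones")
```

Under this rule, a tone must rise from 40 to 50 phons to go from 1 sone to 2 sones, which is the 10-dB doubling read from the slope in Figure 13.2a.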
FIGURE 13.2 (a) Loudness in sones of a 1000-Hz tone as a function of the level of the tone (dB SPL). The slope of the function indicates that approximately a 10-dB change in level is required to double loudness. (b) Curve A is the loudness curve from graph (a); curve B represents the perceived loudness of a 1000-Hz tone masked by a wideband noise. Curve B shows loudness recruitment. Based on diagrams by Steinberg and Gardner (1937).

The 40-dB equal-loudness contour is also used to calculate a measure of level called the decibel weighted by the A scale, or dBA. The dBA measure is often used to measure sound level in noisy environments. It is the total sound power, in decibels, of a sound after the sound has passed through a filter whose cutoffs and attenuation rates match the 40-phon equal-loudness contour. That is, the sound is filtered before its total power is determined. As a result, those frequency components in the sound that are in the 500-Hz to 5000-Hz region receive the most weight when the total noise power is computed. This means that the contributions of very high and very low frequency components to the dBA measure are small. Thus, the frequency components to which human listeners are most sensitive (see Chapter 10) contribute most to the dBA measure.
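As a rough illustration of how such a weighting behaves, the sketch below uses the analytic A-weighting curve commonly given in sound-level-meter standards (IEC 61672). That formula is an assumption brought in here, not something derived in this chapter, and the helper `dba_level` is a hypothetical name that simply sums the weighted component powers.

```python
import math


def a_weighting_db(freq_hz: float) -> float:
    """A-weighting gain in dB at freq_hz (analytic curve, assumed from IEC 61672)."""
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the gain to about 0 dB at 1000 Hz.
    return 20.0 * math.log10(numerator / denominator) + 2.00


def dba_level(components):
    """Overall dBA for a list of (frequency in Hz, level in dB SPL) components.

    Each component is weighted, converted to relative power, summed,
    and converted back to decibels.
    """
    total_power = sum(
        10.0 ** ((level_db + a_weighting_db(freq_hz)) / 10.0)
        for freq_hz, level_db in components
    )
    return 10.0 * math.log10(total_power)


if __name__ == "__main__":
    for f in (50.0, 100.0, 1000.0, 4000.0, 16000.0):
        print(f, "Hz:", round(a_weighting_db(f), 1), "dB")
```

With this curve, components far below 500 Hz are attenuated heavily, so two components of equal SPL can contribute very differently to the dBA total.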
Figure 13.2b is also a plot of loudness versus level, but in this case curve B represents the loudness of a tone that was masked by a wideband noise with a 30-dB spectrum level ($N_o$). Curve A is the same as the one in Figure 13.2a. Notice that the threshold for the masked tone is 40 dB above the unmasked tone's threshold. Because both tones are at threshold (one at absolute threshold, the other at masked threshold), the two tones are of equal loudness when their actual levels are 40 dB apart. When the actual level of both tones is 80 dB SPL, they are also judged equally loud. This, in turn, means that the loudness of the masked tone (curve B) increased faster (steeper slope) than that of the unmasked tone. This rapid growth of loudness (the steep loudness slope) is sometimes called loudness recruitment. The loudness of the masked tone changes much more for each 10-dB increase in level than does the loudness of the unmasked tone. Curve B might also represent the data from someone whose threshold of hearing was 40 dB above the normal threshold (i.e., the person had a 40-dB hearing loss) but who judges an 80-dB SPL tone to be the same in loudness as a listener with normal hearing would. The fact that the individual's loudness function shows loudness recruitment is important in treating certain hearing abnormalities. Note, for instance, that amplifying sound (e.g., by use of a hearing aid) will lead to a more rapid increase in loudness for the person with a hearing loss than for the person with normal hearing.
Most of the results that have been studied using threshold as a psychoacoustic measure can also be investigated using loudness. For instance, the duration of a tone can be varied, and the listener can adjust the tone's power so that it remains equally loud. In so doing, loudness measures are used to determine temporal integration (see Chapter 10). A short sound will not be as loud as a long sound if their powers are equal and their durations are less than approximately 250 msec. The loudness of a sound can also be maintained at a constant phon level as the bandwidth of a stimulus is narrowed in order to measure a critical band (see Chapter 11). In this case, narrowing the bandwidth of a stimulus will decrease the loudness once the bandwidth is less than a critical band. In these and other cases the data from threshold experiments are not substantially different from those obtained in loudness studies, although some significant differences do exist.
Loudness adaptation (or perstimulatory fatigue) occurs during exposure to a long-duration (on the order of seconds or minutes) adapting stimulus. The change in loudness takes place while the adapting stimulus is being presented. That is, a stimulus appears to become softer if it is kept on for a very long time (seconds or longer). The loudness of the stimulus is typically measured by having the listener match a comparison stimulus (usually presented to one ear) to the adapting stimulus (usually presented to the other ear) in terms of loudness. As the adapting stimulus remains on for longer and longer periods of time, listeners decrease the level of the matching stimulus in order to achieve a loudness match, indicating that the loudness of the adapting stimulus has decreased over time while it is on. Thus, the context in which a sound occurs can affect its perceived loudness.
As we mentioned in Chapter 2, although changes in level are highly correlated with loudness changes, the relationship is not perfect. That is, changes in frequency, duration, intensity, and bandwidth all affect the perceived loudness of a stimulus even though the level of the stimulus remains fixed. Remember that loudness is a subjective evaluation of sound, whereas intensity is an objective measure of vibratory magnitude or sound pressure.
PITCH
The pitch of sound is perceived in many auditory contexts: the melody of a song is determined by pitch changes, a female typically has a higher-pitched voice than a male, the sounds from many sound sources differ in pitch, and so on. Just as sound contains no variable called loudness, it also does not contain a variable called pitch. Loudness and pitch are properties of perception that provide information about sound sources. Different physical attributes of sound yield the perceptions of loudness and pitch (as well as other perceptual attributes), and these perceptual attributes are a result of the auditory system's processing of the neural code for these physical variables of sound (e.g., frequency, level, and time).
Experiments on pitch are usually performed with a pitch-matching procedure in which a standard stimulus (sometimes a sinusoid) is used as the basis for pitch matches of comparison stimuli. As in a loudness-matching experiment, the listener adjusts some acoustic aspect of a comparison sound such that it is perceived to have the same pitch as a standard sound. The standard sound is often either a sinusoidal sound or a pulse train of periodically repeating transients (see Chapter 4). The frequency of the standard sinusoid or the repetition frequency of the standard pulse train, expressed in hertz, that is judged equal in pitch to the comparison sound is used as the measure or scale of pitch in pitch-matching experiments. That is, pitch can be expressed either in terms of spectral frequency or in terms of repetition frequency. While spectral and repetition frequency are often directly related in terms of pitch perception (see Chapter 4), there are cases in which they are not, as we discuss later in this chapter.
Musical scales, such as the seven-tone musical scale (B C D E F G A), are often used to generate a pitch scale. In general, the relationships among musical notes in a musical scale are arranged in an octave manner, with 12 intervals per octave. In the equal temperament scale, the octave is divided into 12 equal logarithmic intervals called semitones, and each interval is divided into 100 equal logarithmic steps called cents. Thus, an octave has 1200 cents, and a semitone is 100 cents in the equal temperament scale. These relationships are shown in Table 13.1. For this table, assume that the note A is 440 Hz. Other schemes are often used to express the physical relationship among intervals. The just intonation and the Pythagorean schemes have a slightly different number of cents between intervals (in a semitone) than the equal temperament scale. Different scales are preferred by different musicians. Thus, musical notes, cents, or semitones can be used to indicate the pitch of a sound.
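Because cents are equal logarithmic steps, with 1200 to the octave, the conversion between a frequency ratio and an interval in cents is a single logarithm. The sketch below shows that arithmetic; the function names are illustrative, and the 264-Hz reference is taken from the lowest C in Table 13.1.

```python
import math


def cents_between(f_low_hz: float, f_high_hz: float) -> float:
    """Interval from f_low_hz up to f_high_hz in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f_high_hz / f_low_hz)


def frequency_from_cents(base_hz: float, cents: float) -> float:
    """Frequency that lies `cents` above base_hz."""
    return base_hz * 2.0 ** (cents / 1200.0)


if __name__ == "__main__":
    base = 264.0  # lowest C in Table 13.1
    # Equal-temperament steps of 100 cents (one semitone) above the base.
    for semitones in range(0, 13):
        print(semitones * 100, "cents ->", round(frequency_from_cents(base, semitones * 100), 1), "Hz")
    # A just-intonation fifth (frequency ratio 3:2) expressed in cents.
    print(round(cents_between(264.0, 396.0), 1))
```

The same two functions can be used to compare the cents columns of the three tuning schemes, since each frequency in the table is some number of cents above the 264-Hz C.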
TABLE 13.1 Relationship Between Musical Note, Cents, and Frequency (Hz) for the Three Major Scales of Musical Pitch

| Musical note | Just intonation: cents | Just intonation: frequency (Hz) | Equal temperament: cents | Equal temperament: frequency (Hz) | Pythagorean tuning: cents | Pythagorean tuning: frequency (Hz) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| C | 0 | 264 | 0 | 264 | 0 | 264 |
| D | 204 | 297 | 200 | 296 | 204 | 297 |
| E | 386 | 329 | 400 | 333 | 408 | 334 |
| F | 498 | 352 | 500 | 352 | 498 | 352 |
| G | 702 | 396 | 700 | 395 | 702 | 396 |
| A | 884 | 440 | 900 | 443 | 906 | 445 |
| B | 1088 | 495 | 1100 | 498 | 1100 | 501 |
| C | 1200 | 528 | 1200 | 528 | 1200 | 528 |

Scaling procedures (see Appendix D) have been used in the measurement of pitch, but listeners often respond in a different manner when they are asked to judge pitch than when they are asked to judge loudness. This difference stems from the qualitative aspects of pitch. That is, as the pitch of a stimulus changes, it does not appear to vary along a single dimension of greater to smaller or more to less. One tone can be said to be greater in loudness than another tone, but one pitch may not be greater in magnitude than another pitch. The mel scale is a scale derived for pitch in the same way the sone scale was obtained for loudness, by using direct scaling techniques. However, the mel scale is difficult to obtain, probably because of the qualitative nature of the subjective dimension of pitch.
In discussing the threshold of audibility, we mentioned that listeners might be asked to detect the presence of a tonal-sounding stimulus instead of just detecting any sound. Tonality implies that the observer is detecting the presence of pitch. Von Bekesy found that a tone with a frequency below 1000 Hz must have a duration equal to 3 to 9 periods if the tone is to have a definite pitch. Above 1000 Hz this critical duration for the perception of tonality, or pitch, was 10 msec regardless of the frequency of the tone.
Complex Pitch
There is a strong correlation between the pitch of a stimulus and the spectral location of its frequencies. That is, if a complex sound has a spectral structure that can be resolved by the auditory system, then aspects of this spectral structure (e.g., the spectral region with the greatest level) are often highly correlated with the sound's perceived pitch. Recall from Chapters 7-9 and 11 that the auditory system is limited in its ability to resolve spectral differences, in that small frequency differences are resolvable in the low-frequency region of the spectrum but not at high frequencies. Thus, small spectral differences at high frequencies are unlikely to contribute to pitch perception.
FIGURE 13.3 The amplitude spectrum (a) and the time-domain waveform (b) of a missing-fundamental pitch stimulus consisting of a complex sound with frequency components of 700, 800, 900, and 1000 Hz. The pitch is 100 Hz, which is the missing fundamental (dotted) component in panel (a). The time-domain waveform has an envelope with a 10-msec period, the reciprocal of 100 Hz.

Listeners can perceive a pitch for complex sounds that do not have any spectral components at the perceived pitch. For instance, listeners report a 100-Hz pitch associated with a stimulus consisting of the sum of sinusoids with frequencies of 700, 800, 900, and 1000 Hz. Although there is absolutely no energy at 100 Hz, listeners judge the sound to have a 100-Hz pitch. These four tones are all harmonics of 100 Hz (in fact, 100 Hz is the highest frequency for which the tones could be harmonically related). The observation that a pitch can be associated with the fundamental of a complex stimulus even when the fundamental is absent from the spectrum of the complex stimulus is called the case of the missing fundamental, or missing-fundamental pitch.
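The arithmetic behind the 700-, 800-, 900-, and 1000-Hz example can be checked directly. The sketch below assumes integer component frequencies and equal-amplitude cosine-phase components, and its function names are illustrative. It computes the highest frequency of which all components are harmonics and confirms that the summed waveform repeats with the corresponding period. It is only an arithmetic illustration and not a model of how the auditory system extracts pitch.

```python
import math
from functools import reduce


def missing_fundamental_hz(components_hz):
    """Highest frequency (Hz) of which every integer component is a harmonic."""
    return reduce(math.gcd, components_hz)


def waveform(components_hz, t_sec):
    """Sum of equal-amplitude cosines (all in cosine phase) evaluated at time t_sec."""
    return sum(math.cos(2.0 * math.pi * f * t_sec) for f in components_hz)


if __name__ == "__main__":
    components = [700, 800, 900, 1000]
    f0 = missing_fundamental_hz(components)
    period = 1.0 / f0
    print("fundamental:", f0, "Hz; period:", period * 1000.0, "msec")
    # The waveform takes the same value one period later at any time point.
    t = 0.0037
    print(waveform(components, t), waveform(components, t + period))
```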
The amplitude spectrum and the time waveform (assuming all the tones had the same phase) associated with the tonal complex (700, 800, 900, and 1000 Hz) are shown in Figure 13.3. Notice that there is a 10-msec spacing between the major peaks in the complex time waveform. Since the frequency associated with a 10-msec period is 100 Hz, it may be that the auditory system perceives the 100-Hz pitch because of the 10-msec periodicity of the time waveform. In fact, most stimuli that have a periodic time waveform will have a perceived pitch equal to
NONLINEAR TONES
In Chapter 11, we described the aural harmonics and difference tones produced by the nonlinearity of the auditory system. The first and second aural harmonics ($2f_1$ and $3f_1$), the difference tone ($f_1 - f_2$), and the cubic-difference tone ($2f_1 - f_2$) are the nonlinear tones most often perceived (see Chapter 5 and Appendix A for a discussion of nonlinearity). That is, when a complex sound consisting of a number of sinusoids is presented to a listener, especially at high levels, listeners hear pitches in addition to those corresponding to the frequencies of the sinusoids in the stimulus. These additional pitches are associated with the aural harmonics and difference tones produced by the nonlinear properties of the auditory periphery (see Chapters 8 and 9). The cubic-difference tone is of particular interest because in many conditions it is the most perceptible nonlinearly produced tone. For instance, if 1400-Hz and 1680-Hz primary tones are summed, a cubic-difference tone of 1120 Hz is perceived; that is, $2 \times 1400\ \mathrm{Hz} - 1680\ \mathrm{Hz} = 1120\ \mathrm{Hz}$. This cubic-difference tone can be heard when the levels of the 1400-Hz and 1680-Hz primary tones are less than 40 dB SL, whereas the difference tone of 280 Hz ($1680\ \mathrm{Hz} - 1400\ \mathrm{Hz} = 280\ \mathrm{Hz}$) cannot be detected at these low primary-tone levels. Remember that the perception of the 1120-Hz or the 280-Hz pitch is not due to the presence of these frequencies in the stimulus. The pitches result from the nonlinear distortion caused by the peripheral auditory system.
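The frequencies of these nonlinear tones follow from simple arithmetic on the two primaries. The sketch below tabulates them for a primary pair; it assumes the lower primary plays the role of $f_1$, as in the 1400-Hz and 1680-Hz example, and the function name and dictionary keys are illustrative. It gives frequencies only and says nothing about the level or phase of the tones.

```python
def nonlinear_tones(primary_a_hz: float, primary_b_hz: float) -> dict:
    """Frequencies (Hz) of the nonlinear tones named in the text for two primaries.

    Assumption: the lower primary is treated as f1 and the higher as f2.
    """
    f1, f2 = sorted((primary_a_hz, primary_b_hz))
    return {
        "first_aural_harmonic": 2 * f1,        # 2*f1
        "second_aural_harmonic": 3 * f1,       # 3*f1
        "difference_tone": f2 - f1,            # magnitude of f1 - f2
        "cubic_difference_tone": 2 * f1 - f2,  # 2*f1 - f2
    }


if __name__ == "__main__":
    for name, freq in nonlinear_tones(1400, 1680).items():
        print(name, freq, "Hz")
```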
In order to completely describe the nonlinear tones, we must specify their levels and phases as well as their frequencies. The cancellation method is often used to obtain estimates of the level and phase of nonlinear tones. In the cancellation method one complex stimulus is used to elicit the nonlinear tone, for instance, an 840-Hz and 1000-Hz primary tone pair that produces a 680-Hz cubic-difference tone; that is, $2 \times 840\ \mathrm{Hz} - 1000\ \mathrm{Hz} = 680\ \mathrm{Hz}$. Another stimulus, in this case a 680-Hz cancellation tone, is presented along with the primaries and is used to cancel the pitch of the nonlinear tone. That is, a 680-Hz cancellation tone is added to the 840-Hz and 1000-Hz primary tones. Without the 680-Hz cancellation tone, listeners can detect sound with pitches of 840 and 1000 Hz (the primary tones) and 680 Hz (the nonlinearly produced cubic-difference tone). If the 680-Hz cancellation tone is presented 180° out of phase with the nonlinear 680-Hz cubic-difference tone, then the 680-Hz pitch might be canceled, and the listener would detect only the two primary tones. This cancellation should occur only when the two tones (cancellation tone and cubic-difference tone) are at equal levels and 180° out of phase (i.e., the addition of two tones of the same frequency and level but 180° out of phase leads to complete cancellation). In the cancellation procedure, the listener is presented the 840-Hz and 1000-Hz primary tones (the tones used to elicit the 680-Hz cubic-difference tone) and the 680-Hz cancellation tone. The listener is instructed to adjust the level and phase of the 680-Hz cancellation tone until the listener no longer hears a pitch of 680 Hz (the cubic-difference tone). The level of the cancellation tone, and 180° minus the phase of the cancellation tone, that the listener chose to eliminate the pitch of the cubic-difference tone are used to estimate the level and phase of the cubic-difference tone (or other nonlinear tones).
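The condition for complete cancellation can be illustrated by treating two tones of the same frequency as phasors and adding them. The sketch below is a simplified illustration under that assumption, with hypothetical function names. The coarse search over phase stands in for the listener's adjustment and is not part of the experimental procedure described above.

```python
import math


def residual_amplitude(amp_tone, phase_tone_deg, amp_cancel, phase_cancel_deg):
    """Amplitude of the sum of two same-frequency sinusoids (phasor addition)."""
    real = (amp_tone * math.cos(math.radians(phase_tone_deg))
            + amp_cancel * math.cos(math.radians(phase_cancel_deg)))
    imag = (amp_tone * math.sin(math.radians(phase_tone_deg))
            + amp_cancel * math.sin(math.radians(phase_cancel_deg)))
    return math.hypot(real, imag)


def best_cancellation_phase(amp_tone, phase_tone_deg, amp_cancel, step_deg=1):
    """Cancellation-tone phase (degrees, 0-359) that leaves the smallest residual."""
    return min(
        range(0, 360, step_deg),
        key=lambda p: residual_amplitude(amp_tone, phase_tone_deg, amp_cancel, p),
    )


if __name__ == "__main__":
    # A hypothetical distortion tone with amplitude 1.0 and phase 30 degrees.
    phase = best_cancellation_phase(1.0, 30.0, 1.0)
    print("best cancellation phase:", phase, "degrees")
    print("residual:", residual_amplitude(1.0, 30.0, 1.0, phase))
    # Unequal amplitudes leave a residual even at the best phase.
    print("residual with mismatched level:", residual_amplitude(1.0, 30.0, 0.5, phase))
```

The residual vanishes only when the amplitudes are equal and the phases differ by 180°, which is why the listener's final settings can serve as an estimate of the level and phase of the nonlinear tone.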
Figure 13.6 displays the results from a cancellation experiment. The upper curve shows the level of the 680-Hz tone required to cancel the cubic-difference tone as a function of the overall level of the two primary tones (840 and 1000 Hz) used to elicit the cubic-difference tone. The lower curve shows the phase (minus 180°) of the 680-Hz tone required to cancel the cubic-difference tone.
The intensities and phases of the nonlinearities of the auditory system are crucial values to be determined if we are to describe how the system produces these nonlinearities. The cubic-difference tone has been of special interest because it appears at low levels, and the level and phase of the cubic-difference tone change in a complex way as a function of its pitch (that is, as a function of the separation in frequency between $f_1$ and $f_2$). Because of these relations, most scientists believe that the source of the cubic-difference tone is in the inner ear; the perceived pitch of the cubic-difference tone probably results either from the nonlinear motion of the basilar membrane or from some nonlinearities that exist when the hair cells stimulate auditory nerve fibers.
FIGURE 13.6 (a) The level of a cancellation tone (680 Hz, with its level expressed relative to the levels of the primaries) required to cancel a 680-Hz cubic-difference tone produced by 840- and 1000-Hz primaries presented at different levels (dB SPL). The cancellation-tone level is about 25 to 28 dB below that of the primaries for primary levels ranging from 45 to 75 dB SPL. (b) The phase of the cancellation tone required to cancel the 680-Hz cubic-difference tone, shown as a function of the level of the primaries. The estimated phase of the cubic-difference tone is 180° minus the values shown in curve (b). Based on data from J. L. Hall (1975), used with permission.
OTHER SUBJECTIVE ATTRIBUTES OF SOUND
Complex stimuli have subjective attributes in addition to pitch and loudness, one of which is timbre. Timbre is often defined as that subjective attribute of a sound that differentiates two or more sounds that have the same pitch, loudness, and duration. For instance, the quality difference between a violin and a cello playing the same musical note at the same loudness and for the same duration would be defined as a difference in timbre between the two instruments. Timbre appears to be related to the bandwidth of the complex stimulus, especially for complex waveforms consisting of harmonically related sinusoids. A stimulus with a larger number of harmonics is usually perceived as having a fuller or richer timbre than a more narrowband stimulus, such as a pure tone.
In addition to timbre, musicians often refer to the consonance and dissonance of complex stimuli, such as notes of music. Consonant pairs are those notes played at intervals that result in a “pleasant” sound. Dissonant pairs are played at intervals that sound “unpleasant” to many musicians. We have already discussed (Chapter 10) the beats associated with mixing two sinusoids whose frequencies differ by a few hertz. The sensation of beats gives way to flutter and then to roughness as the frequency difference between the two tones is increased.
S. S. Stevens determined that sounds also have attributes of density and volume. The density of a tone increases as its frequency or intensity increases. Increases in volume are generally associated with decreases in frequency and level. Thus, volume and density are approximate opposites. The fact that different sounds elicit different subjective descriptions is not surprising. Many scientists believe that the labeling of the subjective dimensions of sound may also depend on culture and experience. For instance, many modern composers are writing music with dissonant intervals. Because some of this music is becoming popular, the musical intervals that are labeled as consonant and dissonant might change over time.
SUMMARY
From equal-loudness contours and loudness-scaling experiments, we can construct the phon and sone scales of loudness. These scales enable us to relate the subjective description of loudness to the physical descriptions of frequency and intensity. Pitch is often measured in a pitch-matching task.
The musical scale of pitch contains octaves, intervals, semitones, and cents. The mel scale of pitch is constructed from a pitch-scaling experiment. Scales of pitch are more qualitative in nature than are loudness scales. Pitch and loudness are not perfectly correlated with their physical counterparts: frequency and intensity. The concept of the missing fundamental illustrates that pitch processing is sometimes dependent neither on spectral information at the frequency of the pitch nor on envelope periodicity information associated with the period of the pitch. Edge pitch, dichotic pitch, resolved pitch, and the pitch of mistuned harmonics are all examples of the variety of situations in which complex sounds produce pitch. The levels and phases of nonlinear tones (especially the cubic-difference tone) can be measured by the cancellation technique. Complex stimuli have additional subjective attributes, such as timbre, consonance, dissonance, beats, flutter, roughness, density, and volume.
SUPPLEMENT
Loudness and pitch as subjective attributes of sound have been studied extensively by S. S. Stevens. His book Psychophysics (1975) and his article in Science (1970), "Neural Events and the Psychophysical Law," provide an insight into his work. Fletcher and Munson (1933) used the loudness-matching technique to obtain the equal-loudness contours. The book by Hartmann (1998) should also be consulted for more details about pitch, loudness, and nonlinearities. The book by B. C. J. Moore (1997) also covers topics on loudness and pitch. Plack et al. (2005) provide a thorough review of pitch.
Loudness is often measured with an alternating binaural loudness balance (ABLB) technique. In this procedure, the standard tone is presented to one ear and the comparison tone to the other ear. The tones are alternated in time, and the listener adjusts the comparison tone until it appears as loud as the standard. Steinberg and Gardner (1937) provided insights about the relationship between masking and loudness that led to the concept of recruitment. Buus, Florentine, and Poulsen (1999) should also be consulted for a view of loudness and loudness recruitment.
S. S. Stevens devised a method that has become standard for determining loudness of complex, nontonal sounds (ISO standard 532 (1975), Method A; see the supplements to Chapter 10 and the section on Standards at the end of the References, following the appendixes). This method involves combining the sone measurements for various frequency bands. Another method, devised by Zwicker (ISO standard 532 (1975), Method B), has also been used to measure the subjective magnitude of a complex stimulus. In general, critical bandwidths estimated using loudness methods are three to four times wider than those obtained from masking studies (see Chapter 11).
The book by Plack et al. (2005) should be consulted for a history of the study of pitch perception. The student might have noted that, for the missing-fundamental pitch (in Figure 13.3, where the frequencies added together were 700, 800, 900, and 1000 Hz), a nonlinear difference tone exists at 100 Hz. Thus, the 100-Hz pitch might be due to the nonlinearity of the ear. This, however, does not seem to be the case. Licklider (1954), for instance, showed that the difference tone due to nonlinearity could be masked by a noise with frequencies near 100 Hz. Licklider then showed, however, that the pitch of the missing-fundamental stimulus was unaffected by a masking noise with frequencies near 100 Hz. Since the nonlinear tone at 100 Hz was masked and the pitch was still perceived, nonlinearity did not yield the missing-fundamental pitch.
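The stimulus described above can be checked numerically. The following is a minimal sketch, not the procedure used by Licklider: it sums the four components, confirms that the physical stimulus contains no energy at 100 Hz, and shows that the waveform nonetheless repeats every 10 msec. The sampling rate, duration, and function names are assumptions chosen for illustration.

```python
import math

FS = 8000                            # sampling rate in Hz (assumed)
DUR = 0.1                            # duration in seconds (assumed)
COMPONENTS = [700, 800, 900, 1000]   # component frequencies in Hz

def missing_fundamental(fs=FS, dur=DUR, freqs=COMPONENTS):
    """Sum of equal-amplitude sinusoids at the given frequencies."""
    n = int(fs * dur)
    return [sum(math.sin(2 * math.pi * f * i / fs) for f in freqs)
            for i in range(n)]

def amplitude_at(signal, freq, fs=FS):
    """Amplitude of the component at `freq`, from a single-bin DFT."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * i / fs) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / fs) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

def autocorrelation(signal, lag):
    """Unnormalized autocorrelation of the signal at one lag (in samples)."""
    return sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))

signal = missing_fundamental()
# Lag (in samples) at which the waveform best matches a delayed copy of itself.
best_lag = max(range(1, 161), key=lambda lag: autocorrelation(signal, lag))
period_ms = 1000 * best_lag / FS
```

Here `amplitude_at(signal, 100)` is essentially zero while `period_ms` corresponds to the 10-msec period of a 100-Hz tone, which is consistent with the pitch arising from something other than spectral energy at 100 Hz.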
There is considerable evidence that the pitch of a tone changes (usually increases) as its level increases. Although Stevens (1975) studied this effect extensively, the pitch changes are highly variable and usually fairly small. Jesteadt (1980) provides an interesting procedure for determining pitch shifts associated with intensity changes. Moore et al. (1985) and Hartmann et al. (1990) have studied the pitch of mistuned harmonics. Hartmann and Zhang (2003) can be consulted for more about both edge pitch and dichotic pitch (also see Bilsen and Raatgever, 2000, for dichotic pitch). The chapter by de Cheveigne (2005) should be consulted for a review of models and theories of pitch.
A few people, without a reference, can produce an exact pitch or recognize a sound as having an exact pitch. This ability is often referred to as absolute pitch. A more common form of pitch performance is relative pitch, in which some listeners can identify the musical interval between two musical notes. Relative pitch is found among many musicians. Ross, Gore, and Marks (2003) provide an interesting theory of absolute pitch.
There is evidence that we hear primarily the following aural harmonics and combination tones: $2f$, $3f$, and $4f$; $f_{1}-f_{2}$, $2f_{1}-f_{2}$, $2f_{1}-2f_{2}$, and $3f_{1}-2f_{2}$. Summation tones have generally not been reported as audible. This is presumably because low-frequency tones mask high-frequency tones very well (upper spread of masking; see Chapter 11). Because summation tones are higher in frequency than the primary tones but also usually close to the frequency of one of the primaries, the summation tones are masked by the primary tones.
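As a quick arithmetic aid, the combination-tone frequencies listed above can be tabulated for any pair of primaries. This is only an illustrative sketch; the function name is hypothetical, and it assumes that $f_{1}$ is the lower primary (the convention under which 840- and 1000-Hz primaries give a 680-Hz cubic-difference tone). Absolute values are taken because only the magnitude of a frequency is meaningful here.

```python
def combination_tones(f1, f2):
    """Frequencies (Hz) of the combination tones named in the text.

    Assumes f1 is the lower primary; abs() keeps every result positive.
    """
    return {
        "f1 - f2": abs(f1 - f2),
        "2f1 - f2": abs(2 * f1 - f2),      # cubic-difference tone
        "2f1 - 2f2": abs(2 * f1 - 2 * f2),
        "3f1 - 2f2": abs(3 * f1 - 2 * f2),
        "f1 + f2": f1 + f2,                # summation tone
    }

tones = combination_tones(840, 1000)
```

For the 840- and 1000-Hz primaries used in the cancellation example, `tones["2f1 - f2"]` is 680 Hz and the summation tone falls at 1840 Hz, above both primaries.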
The cancellation method is usually used with one additional tone added to the input stimulus. Because the pitch of the nonlinear tone is often difficult to detect, it can be made easier to perceive if a tone is added that is slightly different in frequency from the cancellation tone (thus, the input stimulus consists of the primaries, the cancellation tone, and a tone that is slightly different in frequency, e.g., 3 Hz, from the cancellation tone). This additional tone and the cancellation tone (and the nonlinear tone) will produce a beating sensation. If the cancellation tone is now added out of phase and at the same level as the nonlinear tone so that the nonlinear and cancellation tones are eliminated, then the beating stops and the listener hears the pitches of the two primary tones and the tone used to beat with the nonlinear tone. Most listeners find it easier to make the cancellation procedure measurements when they are asked to eliminate the beating rather than to eliminate the pitch of the nonlinear tone.
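The logic of the cancellation procedure can be sketched with phasors. In this simplified model (an assumption, not a description of the ear), the nonlinear tone and the cancellation tone are two sinusoids at the same frequency, and the depth of the beating against the extra tone is taken to be proportional to the amplitude of their sum. A grid search over level and phase stands in for the listener's adjustments; all names and values are hypothetical.

```python
import cmath
import math

def residual_amplitude(dist_amp, dist_phase_deg, cancel_amp, cancel_phase_deg):
    """Amplitude left at the distortion frequency after adding the cancellation tone."""
    distortion = cmath.rect(dist_amp, math.radians(dist_phase_deg))
    cancel = cmath.rect(cancel_amp, math.radians(cancel_phase_deg))
    return abs(distortion + cancel)

def find_cancellation(dist_amp, dist_phase_deg):
    """Search amplitudes 0.00..1.00 and phases 0..359 degrees for the smallest residual.

    Returns (residual, cancel_amp, cancel_phase_deg). The search plays the role
    of a listener adjusting the cancellation tone until the beating stops.
    """
    best = None
    for amp_step in range(101):
        amp = amp_step / 100
        for phase in range(360):
            r = residual_amplitude(dist_amp, dist_phase_deg, amp, phase)
            if best is None or r < best[0]:
                best = (r, amp, phase)
    return best

# Hypothetical internal distortion tone: amplitude 0.05, phase 40 degrees.
residual, cancel_amp, cancel_phase = find_cancellation(0.05, 40)
```

In this model the residual vanishes when the cancellation tone matches the nonlinear tone in amplitude and differs from it in phase by 180 degrees, so the settings that stop the beating provide an estimate of the level and phase of the nonlinear tone.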
A review of the cubic-difference tone can be found in Zwicker and Fastl’s book (1991). Appendix A shows how one could obtain difference tones and summation tones from a nonlinear equation ($y=x+x^{2}$). The cubic-difference tone ($2f_{1}-f_{2}$) is obtained if the nonlinear equation is in the form $y=x+x^{2}+x^{3}$. Thus, the name cubic-difference tone arises because the tone is obtained by including the cubic ($x^{3}$) term in the nonlinear equation. Also, as mentioned in Chapter 8, the cubic-difference tone is important for measuring cochlear emissions, especially the distortion product otoacoustic emission (DPOAE).
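The role of the cubic term can be illustrated numerically. The sketch below passes two unit-amplitude primaries (840 and 1000 Hz, borrowed from the cancellation example) through the two nonlinear equations given above and measures the output amplitude at $2f_{1}-f_{2}=680$ Hz. The sampling rate, duration, unit coefficients, and function names are assumptions made for illustration; they are not a model of the cochlea.

```python
import math

FS = 8000        # sampling rate in Hz (assumed)
N = 800          # 0.1 s of signal, so every frequency used falls on a 10-Hz bin
F1, F2 = 840, 1000

def primaries(i):
    """Sum of the two unit-amplitude primaries at sample index i."""
    t = i / FS
    return math.cos(2 * math.pi * F1 * t) + math.cos(2 * math.pi * F2 * t)

def amplitude_at(signal, freq):
    """Amplitude of the component at `freq`, from a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq * i / FS) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / FS) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

x = [primaries(i) for i in range(N)]
quadratic = [v + v**2 for v in x]          # y = x + x^2
cubic = [v + v**2 + v**3 for v in x]       # y = x + x^2 + x^3

cdt_quadratic = amplitude_at(quadratic, 2 * F1 - F2)   # 680 Hz
cdt_cubic = amplitude_at(cubic, 2 * F1 - F2)           # 680 Hz
difference_tone = amplitude_at(quadratic, F2 - F1)     # 160 Hz
```

With only the squared term, the output contains the simple difference tone at 160 Hz but nothing at 680 Hz; the 680-Hz component appears only once the $x^{3}$ term is included, in line with the origin of the name.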
The stimulus shown in Figure 13.6 is a regular interval stimulus (RIS) called iterated rippled noise (IRN). It can be generated by delaying a noise and adding the delayed noise back to the undelayed noise. The perceived pitch of IRN stimuli is equal to the reciprocal of the delay (Yost et al., 1996). Thus, in Figure 13.6 the delay was 4 msec, yielding the 250-Hz pitch (4 msec = 1/250 Hz) and 250-Hz spacing between the noisy spectral peaks. Research involving IRN stimuli suggests that the temporal fine structure of complex sounds plays a role in complex-pitch processing (see Yost et al., 1996).
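The delay-and-add construction can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the stimulus-generation code of Yost et al.: the sampling rate, duration, number of iterations, random seed, and function names are all hypothetical, and the delayed copy is added with unit gain.

```python
import random

FS = 8000  # sampling rate in Hz (assumed)

def iterated_rippled_noise(delay_ms=4.0, iterations=2, dur_s=1.0, fs=FS, seed=0):
    """Gaussian noise passed repeatedly through a delay-and-add stage."""
    rng = random.Random(seed)
    d = round(delay_ms * fs / 1000)          # delay in samples
    y = [rng.gauss(0.0, 1.0) for _ in range(int(dur_s * fs))]
    for _ in range(iterations):
        # Add the delayed copy back to the undelayed signal (zero before onset).
        y = [y[i] + (y[i - d] if i >= d else 0.0) for i in range(len(y))]
    return y

def autocorrelation_peak_lag(signal, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    def r(lag):
        return sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
    return max(range(min_lag, max_lag + 1), key=r)

irn = iterated_rippled_noise()
peak_lag = autocorrelation_peak_lag(irn, 8, 100)
pitch_hz = FS / peak_lag   # reciprocal of the delay
```

The strongest temporal regularity in the noise falls at the 4-msec delay, so `pitch_hz` comes out at the reciprocal of the delay, matching the 250-Hz pitch described above.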