Psychoacoustics
MODULE 4

READINGS
Plack, 2005: Chapter 6

HOMEWORK


Lecture notes

 

Topics
Perceptual attributes of acoustic waves - Loudness
        Introduction
        Loudness and frequency, spectrum, and duration
             Loudness scales
Intensity discrimination & Just Noticeable Difference (JND)
Adaptation, Fatigue, & Temporary Threshold Shift (TTS)

 

 


 

 

Perceptual attributes of acoustic waves - Loudness
Introduction

 

Loudness: Sonic (i.e. perceptual) attribute of sound waves related mainly to intensity.
In general, large intensity values result in 'loud' sounds while low intensity values result in 'soft' sounds.  More specifically, loudness is a perceptual manifestation of how the auditory system a) represents sound wave intensity and b) makes intensity comparisons among signals across frequency and time.  In terms of physiology, loudness may be a measure of the total activity of the basilar membrane.

Stevens's Power Law

Loudness (L) is proportional to some power of intensity (I):  L = k·I^a
For a large range of intensities and for frequencies above ~200Hz,  a ≈ 0.2-0.3.

For example, assuming a = 0.3, the intensity of a sound must be increased tenfold for the loudness to be increased twofold, because:
If  L2/L1 = 2  then  (I2/I1)^0.3 = 2,  so  I2/I1 = 2^(1/0.3) ≈ 10  (because 10^0.3 ≈ 2)  =>  I2 ≈ 10·I1
(corresponding to a SIL increase of 10dB).
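
The arithmetic above can be checked with a short Python sketch. The function names are illustrative (they do not come from the textbook), and the exponent a = 0.3 is the value assumed in the example.

```python
import math

def loudness_ratio(intensity_ratio: float, a: float = 0.3) -> float:
    """Loudness ratio L2/L1 for an intensity ratio I2/I1, assuming L = k * I**a."""
    return intensity_ratio ** a

def intensity_ratio_for(loudness_factor: float, a: float = 0.3) -> float:
    """Intensity ratio I2/I1 needed to scale loudness by loudness_factor."""
    return loudness_factor ** (1.0 / a)

# Intensity factor needed for a sound to become 'twice as loud'.
ratio = intensity_ratio_for(2.0)
# The same change expressed as a level difference in dB SIL.
level_change_db = 10.0 * math.log10(ratio)

print(round(ratio, 2), round(level_change_db, 2))  # 10.08 10.03
```

With a = 0.3 the exact factor is slightly above 10, which is why the notes treat "tenfold" and "10dB" as approximations.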

Reminders:

  1. Sound Intensity Levels (SILs) are measured on a logarithmic scale because a) perceived intensity is related logarithmically to the physical magnitude of intensity and b) logarithmic scales can represent very large ranges of values more efficiently.
    For example, a 10dB SIL increase represents a tenfold increase in intensity, a 20dB SIL increase represents a hundredfold increase in intensity, and so on.
    The logarithmic unit of intensity level is the decibel (dB).  Intensity in W/m² and SIL in dB are related as follows:
         If the intensity is I W/m², then SIL = 10*log10(I / 10^-12) dB.

  2. Both intensity in W/m² and sound intensity level in dB are equivalent measures of intensity as a physical variable. Neither intensity nor SIL measures loudness, which is a perceptual variable.
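
As a minimal sketch of the intensity/SIL relation in Reminder 1, the following Python functions convert in both directions. The function and constant names are illustrative; the reference intensity of 10^-12 W/m² is the one given above.

```python
import math

I_REF = 1e-12  # reference intensity in W/m^2, corresponding to 0 dB SIL

def intensity_to_sil(intensity_w_m2: float) -> float:
    """Sound intensity level in dB for an intensity given in W/m^2."""
    return 10.0 * math.log10(intensity_w_m2 / I_REF)

def sil_to_intensity(sil_db: float) -> float:
    """Intensity in W/m^2 for a sound intensity level given in dB."""
    return I_REF * 10.0 ** (sil_db / 10.0)

print(intensity_to_sil(1.0))                            # about 120 dB
print(sil_to_intensity(70.0) / sil_to_intensity(60.0))  # about 10: +10 dB is a tenfold intensity increase
```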

 

Dynamic Range of the ear: Lower and upper intensity limits that the auditory system can process effectively.
The dynamic range of (functional and safe) hearing extends from   ~10^-12 W/m²   or  0dB  (at 1000Hz)  to   ~1 W/m²  or   120dB (at all frequencies). Sounds with intensity levels below 0dB are inaudible while sounds with intensity levels above 120dB can be damaging to the ear.

The lower limit of the ear's dynamic range (or absolute threshold) is usually explored and presented in the form of audiograms, obtained in loudness magnitude matching experiments (see the discussion on Phons, below).
Audiograms: graphs plotting the absolute threshold (i.e. lowest audible intensity) of sinusoidal signals as a function of frequency [see the lowest curve (bolded) in Fig. 6.1, below].

As the figure indicates, the lowest audible intensity is different for different frequencies [i.e. for different frequencies to sound equally loud (e.g. to be barely audible) their SIL or SPL has to be different].

The upper limit of the ear's dynamic range may be defined in several ways:
a) Intensity level at which we experience an unpleasant, even painful sensation, and which can cause physiological damage (~120dB)
b) Intensity level beyond which telling intensity level differences apart becomes very difficult.  This ability starts deteriorating at levels above ~100dB, but remains possible even at levels above 120dB.

The dynamic range of the ear is therefore ~120dB in the mid-frequency region (1000-6000Hz) and decreases at low and high frequencies (at low and high frequencies, the lower limit increases while the upper limit stays relatively the same).
Visit this NIH page and select the "Hearing Response" tab to listen to examples of the relationship between dB and loudness at different frequencies.

 


 

Loudness and frequency, spectrum (bandwidth), and duration
Loudness scales

 

 

Effects of Frequency on Loudness  --  Loudness Level Unit & Loudness Unit

The equal loudness contours (Fig. 6.1, above and Fig. 1, below) illustrate that loudness does not only depend on intensity; it also depends on frequency and does so differently at different intensities.  More specifically, they demonstrate that
i) the sensitivity of the ear drops significantly at low (approx. <350Hz) and high (approx. >15kHz) frequencies and
ii) loudness depends on SIL non-linearly (i.e. the dependence of loudness on frequency is not the same at all SILs), because the shape of the curves changes (becomes flatter) as we increase intensity levels.

Fig. 1, below, is based on a series of 1930s experiments, while Fig. 6.1, above is based on recent revisions of and additions to these data (the main difference between the two graphs is for frequencies below 1000Hz).

According to these figures, for example, a pure tone with SIL = 60dB will sound moderately loud at 1000Hz but will be barely audible at 50 Hz. At high frequencies (approx. between 6000 and 15000Hz), the drop in sensitivity is not as dramatic. As we've already discussed, the fact that we are most sensitive to frequencies in the range 1000Hz - 6000Hz (approx.) may have some evolutionary significance, since speech sounds have most of their energy within this range (remember the discussion on the formant frequencies for different vowels).

 

Loudness dependence on frequency: Equal loudness contours

Figure 1: Equal loudness contours for pure tones (after Fletcher and Munson, 1933).

 
A variety of weighting curves have been devised (A, B, C, and D) to assist electronic/digital recording, measuring, and mixing equipment (with assumed flat, or near flat response) in simulating the ear's frequency-dependent response to intensity. These curves are mainly used in noise measurement, loudness calculations, and hearing evaluations and not in sound mixing and mastering.  Each type of curve best captures the ear's response at a different input level.  You will need to read the introductory paragraphs on weighting curves at Lindos Electronics (UK) and Wikipedia, and bookmark these resources for future reference.
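
To give a feel for what a weighting curve does, here is a minimal Python sketch of the A-weighting curve. The formula and its constants are not taken from these notes; they are the commonly published analog approximation from the sound level meter standard (IEC 61672), and the function name is illustrative.

```python
import math

def a_weighting_db(f_hz: float) -> float:
    """Approximate A-weighting gain in dB at frequency f_hz (f_hz > 0).

    Uses the commonly published closed-form expression; the +2.00 dB
    offset normalizes the curve to about 0 dB at 1000 Hz.
    """
    f2 = f_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(numerator / denominator) + 2.00

for f in (50.0, 100.0, 1000.0, 4000.0):
    print(f, round(a_weighting_db(f), 1))
```

The strongly negative values at low frequencies mirror the reduced sensitivity of the ear shown by the equal loudness contours at low levels.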
 

Loudness level unit & Loudness unit

The loudness level unit is called the Phon. Loudness level in Phons and SIL in dB are equivalent only at 1000Hz, reflecting the fact that intensity and loudness do not map isomorphically, as loudness also depends on frequency.
The loudness level of a given frequency in Phons is equal to the SIL of an equally loud 1000Hz tone. Points on the same curve in Fig. 1 therefore sound equally loud and are equal in Phons but most likely unequal in dB SIL.

For example, 200Hz at 20dB SIL, 100Hz at 10dB SIL and 3000Hz at ~0dB all sound equally loud (they all fall on or around the bottom contour line of Fig. 1).  This loudness equivalence is expressed in that all three frequency/intensity level pairs have the same loudness level in Phons (10 Phons), even though their SILs are different.

Similarly to the lower limit of the ear's dynamic range, the Phon loudness level unit is determined based on loudness magnitude matching experiments.

Loudness magnitude matching: Listeners are asked to adjust the SIL of a tone with a given frequency so that it matches the loudness of a 1000Hz tone with a given SIL.

Such experiments reveal the dependence of loudness on frequency but do not give us information that would help construct a loudness scale on which loudness could be measured.  Such a scale can be derived by loudness magnitude estimation and production experiments.

Loudness magnitude estimation: Listeners are asked to compare the loudness of two tones and determine how much louder/quieter the second tone is relative to the first.
Loudness magnitude production: Listeners are asked to adjust the SIL of a tone so that it becomes, say, twice as loud as another tone with known and fixed SIL.

Loudness magnitude estimation & production experiments use as standard reference a 1000Hz tone at 40dB SIL. The sound loudness level (SLL) of any tone is then described in relation to this reference standard, whose loudness is defined as:

1 Sone (Loudness unit): Loudness of a 1000Hz tone at SIL (or SPL) of 40dB.
The advantage of the Sone scale is that it is based on loudness units that are proportional to loudness and can be manipulated arithmetically rather than logarithmically (e.g. two sones sound twice as loud as one, three sones sound three times as loud, etc.). 
The sone scale is, in some respects, also more ecologically valid than the dB and phon scales, because it was derived based on complex as well as sine tone loudness judgments. Still, most discussions on loudness use SILs in dB, because of the complex transformations necessary to move from the more easily measured intensity or pressure to the much harder to derive loudness.
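
A minimal sketch of how Phons and Sones can be related, assuming the commonly used rule of thumb that loudness doubles for every 10 Phon increase above 40 Phons. This rule is not stated in these notes; it is consistent with the 1 Sone reference (1000Hz at 40dB) and with Stevens's law for a ≈ 0.3, and it is usually considered reasonable only above about 40 Phons. The function names are illustrative.

```python
import math

def phons_to_sones(phons: float) -> float:
    """Loudness in sones for a loudness level in phons (rule of thumb, >= ~40 phons)."""
    return 2.0 ** ((phons - 40.0) / 10.0)

def sones_to_phons(sones: float) -> float:
    """Loudness level in phons for a loudness in sones (inverse of the rule above)."""
    return 40.0 + 10.0 * math.log2(sones)

print(phons_to_sones(40.0))  # 1.0 sone: the reference tone
print(phons_to_sones(60.0))  # 4.0 sones: four times as loud as the reference
print(sones_to_phons(2.0))   # 50.0 phons
```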
 

Effects of Spectrum (bandwidth) on Loudness

For complex tones, loudness also depends on spectrum (i.e. on the way energy is distributed among the complex tone's components). Suppose that two complex tones have the same intensity level (in dB) but they have different spectra. The complex tone with more 'spread-out' spectrum (i.e. with components spreading across many critical bands) will sound louder than the one with a less 'spread-out' spectrum (i.e. with components spreading across fewer critical bands). 

If all frequencies in a complex tone are within a single critical band, the total loudness can be approximated by the sum of all intensities.
If the frequency components are spread over multiple critical bands, the total loudness is larger (see below) and can be approximated by the sum of the loudness contributions of the individual critical bands.
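
For the single critical band case, summing intensities means that levels in dB cannot simply be added. The sketch below shows the usual way to combine levels, assuming the components are uncorrelated so that their intensities add; the function name is illustrative.

```python
import math

def combined_level_db(levels_db):
    """Total level in dB of several uncorrelated components given in dB.

    Each level is converted to a relative intensity, the intensities
    are summed, and the sum is converted back to dB.
    """
    total_intensity = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_intensity)

print(round(combined_level_db([60.0, 60.0]), 2))  # 63.01: doubling intensity adds ~3 dB
print(round(combined_level_db([60.0, 50.0]), 2))  # 60.41: the weaker component adds little
```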

We have already discussed the influence of bandwidth on the loudness of a noise band (see Module 3b, Masking slide #12 and Fig. 6.2, below).  As long as the noise bandwidth remains within the ear's critical bandwidth, relative to the noise's center frequency, changing the noise bandwidth while keeping the overall intensity level fixed will not result in a change in loudness (any increase due to the explanation below will be countered by decrease due to masking). 

For moderate to high SILs, increasing the noise bandwidth beyond that of a single critical band will result in an increase in loudness.

Explanation: (see Fig. 6.4, below). Spreading a given noise intensity over, say, double the bandwidth will halve the intensity exciting each relevant portion of the basilar membrane (B.M.).  At the same time, due to the B.M.'s compressive response at moderate-to-high SILs, the decrease in response will be less than the decrease in stimulus.  With total loudness being the sum of B.M. excitation per critical band, the decrease in SIL per B.M. portion will be matched by the increase in the number of critical bands excited and overcompensated by a smaller decrease in local (i.e. per critical band) response than in stimulation. The final result will be a net increase in loudness.
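
The explanation above can be illustrated with a toy Python model. It rests on several simplifying assumptions that are not claims from the textbook: the intensity is split equally among the critical bands, each band responds according to a power law with the same exponent a = 0.3, the per-band responses simply add, and masking between bands is ignored. The function name is illustrative.

```python
def total_loudness(total_intensity: float, n_bands: int, a: float = 0.3, k: float = 1.0) -> float:
    """Toy total loudness when a fixed intensity is split equally over n_bands.

    Each band receives total_intensity / n_bands and responds
    compressively as k * I**a; the band responses are summed.
    """
    per_band_intensity = total_intensity / n_bands
    return n_bands * k * per_band_intensity ** a

one_band = total_loudness(1.0, 1)
two_bands = total_loudness(1.0, 2)
print(round(two_bands / one_band, 3))  # 1.625, i.e. 2**0.7: louder at the same total intensity
```

In this model, halving the intensity per band reduces each band's response only by a factor of 2^0.3 ≈ 1.23, while the number of contributing bands doubles, so the sum grows.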

The phenomenon of simultaneous masking (perceptual erasure of one sound by a more intense/loud sound) is an additional example of the dependence of loudness on spectral distribution. The disturbance pattern on the basilar membrane is wider and the response is faster for high intensity sounds than for low intensity sounds, facilitating masking. In addition, due to the general asymmetry in the disturbance pattern of the basilar membrane, strong low-frequency sounds will mask weak high-frequency sounds more efficiently than the other way around. See http://www.sengpielaudio.com/calculatorSonephon.htm for an online dB-Phons-Sones converter.

The dependence of loudness on frequency and spectrum and the phenomenon of masking play important roles in sound (data) compression algorithms.
 

Effects of Duration on Loudness

The auditory system responds to abrupt intensity changes with a "rise time" of  ~200ms and a "decay time" of ~50ms. The indicated "decay time" may explain why tones with frequency above 20Hz (50ms period) are perceived as a single tone with identifiable pitch rather than individual pulses or why amplitude fluctuation rates above 20Hz may be perceived as "roughness" or combination tones rather than as loudness fluctuations. 
 
In general, for durations up to ~200ms (~400ms for broadband noise), the longer the signal the louder it appears (i.e. the ear averages sound energy over ~200ms for narrow band signals and ~400ms for broadband signals). Within this duration range, a tenfold increase in duration corresponds to a 10 Phon increase in loudness level.
For longer durations, no consistently reported loudness changes have been recorded.
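
A minimal sketch of the duration rule above, assuming the "10 Phons per tenfold increase in duration" relation holds exactly up to the integration time and that loudness level stays constant beyond it. The function name and the idea of expressing the result as an offset relative to a long tone are illustrative choices.

```python
import math

def loudness_level_offset_phons(duration_ms: float, integration_ms: float = 200.0) -> float:
    """Loudness level change (in phons) of a short tone relative to a long one.

    Below the integration time, each tenfold reduction in duration
    lowers the loudness level by 10 phons; at or above it, no change.
    Use integration_ms=400.0 for broadband noise.
    """
    if duration_ms >= integration_ms:
        return 0.0
    return 10.0 * math.log10(duration_ms / integration_ms)

print(loudness_level_offset_phons(20.0))    # about -10 phons for a 20 ms tone
print(loudness_level_offset_phons(500.0))   # 0.0: no further change for long tones
```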
 
Loudness and time-delay among successively presented stimuli:
Presentation delay between signals also influences loudness, but not in a significant way. You can deduce this influence from your understanding of forward and backward masking. We will return to this issue in our discussion on timbre.

 


 

 

Intensity discrimination & Just Noticeable Difference (JND)
 

JND (just noticeable difference) for loudness: approx. 1dB (for pure tones, middle frequencies and 'regular' background noise)
 

READ SECTION 6.3 OF THE TEXTBOOK (see below for a summary/discussion of the section)

Weber's law

According to Weber's (fraction) law, the smallest perceivable physical change in a stimulus is proportional to the starting level of the physical stimulus (called pedestal), so that their ratio remains constant (see Plack, 2005, Fig. 6.5). Therefore, for pedestal intensity of I and intensity change of ΔI, Weber's law says that
ΔI / I = constant.
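
The Weber fraction ΔI / I and a JND expressed in dB describe the same quantity in two forms. The sketch below converts between them using the definition of level difference, 10*log10((I + ΔI) / I); the function names are illustrative.

```python
import math

def weber_fraction_to_db(weber_fraction: float) -> float:
    """Level difference in dB corresponding to a Weber fraction dI / I."""
    return 10.0 * math.log10(1.0 + weber_fraction)

def db_to_weber_fraction(jnd_db: float) -> float:
    """Weber fraction dI / I corresponding to a level difference in dB."""
    return 10.0 ** (jnd_db / 10.0) - 1.0

# A JND of approx. 1 dB corresponds to an intensity change of roughly 26%.
print(round(db_to_weber_fraction(1.0), 3))  # 0.259
```

If Weber's law held exactly, this fraction (and therefore the JND in dB) would be the same at every pedestal level.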

Weber's law 'near miss'

As indicated in Fig. 6.6, below (Plack, 2005), this is not precisely the case when it comes to the way the hearing mechanism perceives intensity changes at different starting intensity levels (i.e. at different pedestals). Instead of the lines in the graph being parallel to the x axis (representing a constant ΔI / I , per Weber's law), they slope downwards, indicating that ΔI / I is reduced with increasing pedestal intensity level (i.e. sine tone level discrimination improves with increasing sine tone level).  This observed departure from Weber's law is referred to as Weber's law 'near miss'.

 
The problem with this departure is not so much that it goes against Weber's law (essentially a rule created to model a set of complex observations as simply as possible), but that it contradicts the expected behavior of the auditory system based on basilar membrane and neural response saturation factors. These factors would suggest that, rather than remaining constant (per Weber's law) or decreasing (per observations), the ratio ΔI / I should be increasing (i.e. discrimination should deteriorate) with increasing sine tone level.

Possible explanations for Weber's law 'near miss' (1-3; issue still unresolved) and for loudness neural coding in general (1-4)

  1. Low spontaneous rate fibers with wide dynamic range (>60dB) and slow saturation rate help improve intensity discrimination at high pedestal intensities - see Plack, 2005, Fig 6.9 (however, such fibers make up less than ~10% of the total number of fibers, too few to account for the observed intensity discrimination improvement).
     

  2. As indicated by the basilar membrane response pattern, as pedestal levels increase, excitation spreads over an increasingly larger area on the B.M., exciting increasingly more unsaturated fibers, which help with intensity discrimination (however, experiments introducing noise that should mask the B.M. areas that would provide this additional intensity discrimination information do not reduce performance).
     

  3. Outer hair cell action results in increasingly compressive B.M. response with increasing input intensity levels, delaying nerve fiber saturation and permitting nerve fibers to assist in intensity discrimination tasks at high input levels (however this would only explain why intensity discrimination does not deteriorate as much as expected and not why it improves with increasing input level, up to ~100dB, and remains relatively constant above that).
     

  4. (applicable only to loudness neural coding)  Intensity values (intensity differences) are coded based on the firing rate (firing rate change) of neural fibers excited in response to an incoming signal and the associated motion of the basilar membrane.

 


 

 

Adaptation, Fatigue, & Temporary Threshold Shift (TTS)

View the slides on adaptation shown in class.
Click here for a printable copy of the slides (6 slides per page - 2 pages), also displayed below.


 

 


  

Columbia College, Chicago - Audio Arts & Acoustics