Signal Analysis Method
(see Dr. Kelly Fitz's site for more)
The roughness model calculates the roughness of sound signals using spectral
information (frequency and amplitude values of a signal’s spectral components).
Spectral analysis in SRA uses an improved Short-Time Fourier Transform (STFT)
algorithm, which is based on reassigned bandwidth-enhanced modeling (Fitz &
Fulop, 2005; Fitz & Haken, 2002; Fitz et al., 2003; Fulop & Fitz, 2006a,b, 2007),
and incorporates an automatic spectral peak-picking process to determine which
frequency analysis bands correspond to spectral components of the analyzed
signal.
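The source does not describe SRA's peak-picking criteria, so the following is only a minimal sketch of the general idea: keep the analysis bins that are local maxima of the magnitude spectrum and that are not too far below the strongest bin. The function name and the floor_db threshold are hypothetical. A practical peak picker would also need to reject window sidelobes, which this sketch does not attempt.

```python
import numpy as np

def pick_spectral_peaks(magnitudes, floor_db=-40.0):
    """Return indices of bins that look like spectral components.

    A bin qualifies if it is a local maximum of the magnitude spectrum and
    lies no more than |floor_db| decibels below the strongest bin.
    floor_db is a hypothetical tuning parameter, not an SRA setting.
    """
    mags = np.asarray(magnitudes, dtype=float)
    if mags.size < 3 or mags.max() <= 0.0:
        return np.array([], dtype=int)
    threshold = mags.max() * 10.0 ** (floor_db / 20.0)
    centre = mags[1:-1]
    is_peak = (centre > mags[:-2]) & (centre >= mags[2:]) & (centre >= threshold)
    return np.flatnonzero(is_peak) + 1  # +1 because the first bin was skipped


# Example: two bin-centred tones analysed with a periodic Hann window.
fs, n = 10240, 1024                     # 10 Hz bin spacing
t = np.arange(n) / fs
signal = np.cos(2 * np.pi * 1000 * t) + 0.5 * np.cos(2 * np.pi * 2500 * t)
window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n)
peaks = pick_spectral_peaks(np.abs(np.fft.rfft(signal * window)))
peak_frequencies_hz = peaks * fs / n
```

The frequencies returned here are still bin-centre values; refining them is the job of the reassignment step.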
Frequency reassignment works differently from the traditional Fast Fourier
Transform (FFT) and has more in common with phase vocoder methods.
For example, as in traditional FFT, a frequency resolution of 10 Hz will not be
able to resolve frequency components lying less than 10 Hz apart. But,
unlike traditional FFT, the precision of the frequency values returned will not
be limited by this 10 Hz "bandwidth," since the frequency band boundaries are
floating rather than being fixed. This a) fine-tunes the frequencies reported and b)
practically eliminates spectral smearing, since the method ensures that the
assumption of all energy being located at the high-frequency end of an analysis
band can be fulfilled.
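A minimal sketch of this effect, assuming the derivative-window form of frequency reassignment (Auger & Flandrin, 1995) rather than SRA's actual code: a second transform, taken with the derivative of the analysis window, gives the offset of the component from the bin centre. The function name, the Hann window, and the test signal are illustrative choices, not taken from SRA.

```python
import numpy as np

def reassigned_peak_frequency(frame, fs):
    """Return (bin_centre_hz, reassigned_hz) for the strongest bin of one frame."""
    n = len(frame)
    k = np.arange(n)
    window = 0.5 - 0.5 * np.cos(2 * np.pi * k / n)       # periodic Hann
    d_window = (np.pi / n) * np.sin(2 * np.pi * k / n)   # its derivative, per sample
    x_h = np.fft.rfft(frame * window)
    x_dh = np.fft.rfft(frame * d_window)
    peak = int(np.argmax(np.abs(x_h)))
    bin_centre_hz = peak * fs / n
    # Offset of the component from the bin centre, in radians per sample.
    offset = -np.imag(x_dh[peak] * np.conj(x_h[peak])) / np.abs(x_h[peak]) ** 2
    return bin_centre_hz, bin_centre_hz + offset * fs / (2 * np.pi)


# Example: 10 Hz bin spacing, tone placed between two bin centres.
fs, n = 10240, 1024
frame = np.cos(2 * np.pi * 1003.7 * np.arange(n) / fs)
bin_hz, refined_hz = reassigned_peak_frequency(frame, fs)
```

In this example the strongest bin is centred on 1000 Hz, while the reassigned value should land much closer to the 1003.7 Hz of the test tone than the 10 Hz bin spacing would suggest.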
Similarly, as in traditional FFT, a given analysis window length determines the
length of the shortest signals that can be reliably analyzed. But, unlike
traditional FFT, the temporal resolution of a signal's spectral (and therefore
roughness) time-profiles will not be limited by this "window length," since the
frequency and amplitude
estimates are not time-window averages but instantaneous at the time-window's
center. This a) pin-points time with much higher precision than implied by the
window length and b) practically eliminates temporal smearing, since the spectra
estimated through time-window overlaps do not involve averaging over the entire
analysis windows (Fitz & Haken, 2002; Fitz et al., 2003; Fulop & Fitz, 2006a,b, 2007).
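The time correction can be sketched in the same spirit, again assuming the window-based form of reassignment (Auger & Flandrin, 1995) and not SRA's own implementation: a second transform, taken with a time-weighted copy of the window, measures how far the energy sits from the window centre. The function name, the frame_start parameter, and the single-click test signal are hypothetical.

```python
import numpy as np

def reassigned_event_time(frame, fs, frame_start=0):
    """Return (window_centre_s, reassigned_s) for the strongest bin of one frame."""
    n = len(frame)
    k = np.arange(n)
    window = 0.5 - 0.5 * np.cos(2 * np.pi * k / n)   # periodic Hann, peak at n/2
    t_window = (k - n / 2) * window                  # time-weighted window, in samples
    x_h = np.fft.rfft(frame * window)
    x_th = np.fft.rfft(frame * t_window)
    peak = int(np.argmax(np.abs(x_h)))
    # Offset of the energy from the window centre, in samples.
    offset = np.real(x_th[peak] * np.conj(x_h[peak])) / np.abs(x_h[peak]) ** 2
    centre_s = (frame_start + n / 2) / fs
    return centre_s, centre_s + offset / fs


# Example: a single click at sample 300 inside a 100 ms analysis window.
fs, n = 10240, 1024
click = np.zeros(n)
click[300] = 1.0
centre_s, refined_s = reassigned_event_time(click, fs)
```

A conventional spectrogram would report this frame at the window centre (sample 512); the reassigned time moves it back to the click itself, which is the sense in which the estimate is not limited by the window length.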
In practical terms, spectral analysis results are fine-tuned through the
incorporation of a dual STFT process. Frequency values reported correspond to
the time derivative of the argument (phase) of the complex analytic signal
representing a given frequency bin. Similarly, time values reported correspond
to the frequency derivative of the STFT phase, defining the local group delay
and applying a time correction that pinpoints the precise excitation time.
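Written out, the two phase derivatives take the following form. This is a sketch under one common sign convention, with the STFT phase measured relative to the window centre; other definitions of the STFT flip some of the signs. The symbols X, h, phi, and the hatted quantities are notation introduced here for illustration and do not come from the SRA documentation.

```latex
% Assumed STFT convention: phase measured relative to the window centre t.
X(t,\omega) = \int x(\tau)\, h(\tau - t)\, e^{-j\omega(\tau - t)}\, d\tau ,
\qquad \phi(t,\omega) = \arg X(t,\omega)

% Reassigned (instantaneous) frequency: time derivative of the phase.
\hat{\omega}(t,\omega) = \frac{\partial \phi(t,\omega)}{\partial t}

% Reassigned time: window centre corrected by the local group delay.
\hat{t}(t,\omega) = t - \frac{\partial \phi(t,\omega)}{\partial \omega}

% Checks: x(\tau) = e^{j\omega_0 \tau}    gives \hat{\omega} = \omega_0 ;
%         x(\tau) = \delta(\tau - t_0)    gives \hat{t} = t_0 .
```

The two checks are the limiting cases discussed above: a steady sinusoid is assigned its true frequency regardless of the bin, and an impulse is assigned its true time regardless of where the window is centred.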
The Reassigned Bandwidth-Enhanced Additive Model thus shares with
traditional sinusoidal methods the notion of temporally connected parameter
estimates of spectral components. Unlike those methods, however, its reassigned
estimates are non-uniformly distributed in both time and frequency, yielding
greater temporal and frequency precision than is possible via conventional
additive techniques.
Parameter envelopes of spectral components are obtained by following ridges on a
time-frequency surface, using the reassignment method (Auger & Flandrin, 1995) to
improve the time and frequency estimates for the envelope breakpoints.
Bandwidth enhancement expands the notion of a spectral component, permitting
the representation of both sinusoidal and noise energy with a single component
type. Reassigned bandwidth-enhanced components are defined by a trio of
synchronized breakpoint envelopes, specifying the time-varying amplitude,
center frequency, and noise content (or bandwidth) for each component. The
amount of noise energy represented by each reassigned bandwidth-enhanced
spectral component is determined through bandwidth association, a process of
constructing the components' bandwidth envelopes.
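One way to picture such a component is as a list of breakpoints, each carrying a time and the three envelope values. The sketch below is illustrative only: the class and field names are hypothetical, linear interpolation between breakpoints is an assumption, and so is the representation of bandwidth as a fraction between 0 and 1. It is not the data structure used by SRA or by Loris.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class Breakpoint:
    time: float        # seconds
    amplitude: float   # linear amplitude
    frequency: float   # centre frequency in Hz
    bandwidth: float   # noise content; assumed here to be a 0..1 fraction


class BandwidthEnhancedComponent:
    """One component: three envelopes that share the same breakpoint times."""

    def __init__(self, breakpoints):
        self.breakpoints = sorted(breakpoints, key=lambda b: b.time)

    def at(self, time):
        """Linearly interpolate (amplitude, frequency, bandwidth) at `time`.

        Values are held constant before the first and after the last breakpoint.
        """
        times = [b.time for b in self.breakpoints]
        return tuple(
            float(np.interp(time, times, [getattr(b, name) for b in self.breakpoints]))
            for name in ("amplitude", "frequency", "bandwidth")
        )


# Example: a component that fades in, drifts upward, and becomes noisier.
component = BandwidthEnhancedComponent([
    Breakpoint(time=0.0, amplitude=0.0, frequency=440.0, bandwidth=0.0),
    Breakpoint(time=1.0, amplitude=1.0, frequency=442.0, bandwidth=0.5),
])
amplitude, frequency, bandwidth = component.at(0.5)
```

Because reassigned estimates are non-uniformly distributed in time, the breakpoint times are stored explicitly rather than implied by a fixed frame rate.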
REFERENCES
Auger, F. and Flandrin, P. (1995). "Improving the readability of time-frequency and time-scale representations by the reassignment method,"
IEEE Transactions on Signal Processing 43(5): 1068-1089.
Fitz, K. and Fulop, S.A. (2005). "A unified theory of time-frequency reassignment,"
Digital Signal Processing (preprint).
Fitz, K. and Haken, L. (2002). "On the use of time-frequency reassignment in additive sound modeling,"
Journal of the Audio Engineering Society 50(11): 879-893.
Fitz, K., Haken, L., Lefvert, S., Champion, C., and O'Donnell, M. (2003).
"Cell-utes and flutter-tongued cats: Sound morphing using Loris and the Reassigned Bandwidth-Enhanced Model,"
Computer Music Journal 27(4): 44-65.
Fulop, S.A. and Fitz, K. (2006a). "Algorithms for computing the time-corrected instantaneous
frequency (reassigned) spectrogram, with applications,"
J. Acoust. Soc. Am. 119(1): 360-371.
Fulop, S.A. and Fitz, K. (2006b). "A spectrogram for the
twenty-first century," Acoustics Today 2(3): 26-33.
Fulop, S.A. and Fitz, K. (2007). "Separation of components from
impulses in reassigned spectrograms," J. Acoust. Soc. Am. 121(3): 1510-1518.