Sound – Complexity – Incompleteness

Is sound predictable, or computable? What distinguishes an organized or “complex” sound? Although the literature on complex systems is vast, research on the application of complexity measures to sound is notably scarce. Sound certainly lends itself to a “complex approach”, particularly given its dual nature as an acoustic entity (as studied by physics) and as a conscious experience (as studied by cognitive science), which might suggest a “multilevel hypernetwork”. One of the few attempts to measure the “complexity” of sound [1] is based on spectral flatness (which indicates whether the energy distribution is smooth or spiky), defined as the ratio between the geometric and the arithmetic means of the power spectrum. In this straightforward model, the rate of change of spectral dynamics is the only variable considered when classifying sounds as “simple” or “complex”. Although evidently too simplistic, this model has some practical advantages. It is presumably difficult to measure, for instance, the logical depth or the Kolmogorov complexity of a Beethoven recording, or of any sound produced by a system that has been evolving under natural selection for billions of years (not to mention cultural evolution). It is hard enough to produce realistic acoustic simulations using physical modelling synthesis, digital waveguides or other solutions to the wave equation. It is even harder to imagine an algorithm that would produce the number sequence required to digitally encode a symphony, other than a copy of the sound samples of the original recording, which amounts to a string as long as its source. In information theory, Shannon’s entropy is a common starting point for investigating “complexity”. To compare sounds of different durations, we require a measure of entropy that is independent of the sequence length. While entropy tends to increase in a closed system, complexity intuitively increases only at first and then decreases as the system approaches equilibrium.
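The two measures mentioned above can be illustrated with a minimal sketch, which is not the method of [1] or of this paper. It assumes that flatness is computed on the power spectrum of a single frame, and that entropy is made comparable across lengths by dividing by the logarithm of the number of spectral bins (one possible normalization among several). The function names, the naive DFT and the test signals are hypothetical choices made only to keep the example self-contained.

```python
import cmath
import math
import random

def power_spectrum(frame):
    """Power spectrum of a real frame via a naive O(N^2) DFT (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def spectral_flatness(power, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum (0 to 1)."""
    p = [v + eps for v in power]  # eps avoids log(0); an assumption
    log_mean = sum(math.log(v) for v in p) / len(p)
    return math.exp(log_mean) / (sum(p) / len(p))

def normalized_entropy(power, eps=1e-12):
    """Shannon entropy of the normalized spectrum divided by log(bin count)."""
    total = sum(power) + eps
    probs = [v / total for v in power]
    h = -sum(q * math.log(q) for q in probs if q > 0)
    return h / math.log(len(power))

# Hypothetical test signals: a pure tone (spiky spectrum) and white noise
# (smooth spectrum), both 256 samples long.
n = 256
sine = [math.sin(2 * math.pi * 16 * t / n) for t in range(n)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

flat_sine = spectral_flatness(power_spectrum(sine))
flat_noise = spectral_flatness(power_spectrum(noise))
ent_sine = normalized_entropy(power_spectrum(sine))
ent_noise = normalized_entropy(power_spectrum(noise))
print(flat_sine, flat_noise, ent_sine, ent_noise)
```

Under these assumptions the tone scores close to zero on both measures and the noise scores much higher, which also shows why neither measure alone separates “complex” from merely random sound.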
To investigate this behavior and expand on the model [1], this paper explores “sonic complexity” through an analysis of a large number of properties related to timbre, such as entropy, flatness, MFCCs, roughness and irregularity, in different orthogonal representations (time domain, spectral domain, cepstral domain and wavelets). Presently, there are two leading paradigms for digitally encoding sound: 1) an ordered set of natural or real numbers representing variations in pressure, and 2) an ordered set of complex numbers representing the amplitudes and phases of complex exponentials at different frequencies. Each representation can be derived from the other using the Fourier transform. From the short-time Fourier transform it is possible to obtain, on a frame-by-frame basis, not only the cepstrum (the Fourier transform of the logarithm of a Fourier spectrum) but also the “spectral fluctuation”, by taking a second spectrum estimation of each spectral band while simultaneously accounting for masking effects. We argue that a multilevel approach combining advanced methods of music information retrieval, machine learning and cluster analysis can yield new insights into measuring the complexity of sound.
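As an illustration of the frame-by-frame cepstrum described above, the following is a minimal sketch and not the analysis pipeline used in this paper. It assumes the common “real cepstrum” convention (inverse DFT of the log magnitude spectrum), applies no window function for brevity, and uses a naive DFT so that it needs only the standard library. The helper names, frame size, hop size and the toy echo signal are all hypothetical.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform, or its inverse."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out

def real_cepstrum(frame, eps=1e-12):
    """Inverse DFT of the log magnitude spectrum of one frame."""
    log_mag = [math.log(abs(c) + eps) for c in dft(frame)]  # eps avoids log(0)
    return [c.real for c in dft(log_mag, inverse=True)]

def split_frames(signal, size, hop):
    """Cut a signal into overlapping frames (no window applied here)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

# Hypothetical toy signal: an impulse followed by a half-amplitude echo.
size, hop, delay = 64, 32, 8
signal = [0.0] * 128
signal[0] = 1.0
signal[delay] = 0.5

cepstra = [real_cepstrum(f) for f in split_frames(signal, size, hop)]
first = cepstra[0]
# The echo shows up as a peak at the quefrency equal to its delay.
peak_quefrency = max(range(1, size // 2 + 1), key=lambda q: first[q])
print(peak_quefrency, first[peak_quefrency])
```

In this toy case the echo delay appears as the largest cepstral peak in the first frame, which is the kind of periodic spectral structure the cepstral domain makes explicit.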

Session: 
Authors: 
João Carrilho
Room: 
1
Date: 
Monday, December 7, 2020 - 13:35 to 13:40
