Information theory based Gestalt Grouping

Information-Theoretic Listening
Paris Smaragdis
Machine Listening Group, MIT Media Lab
02/24/20

Outline
  • Defining a global goal for computational audition
  • Example 1: Developing a representation
  • Example 2: Developing grouping functions
  • Conclusions

Auditory Goals
  • Goals of computational audition are all over the place; should they be?
  • Lack of formal rigor in most theories
  • Computational listening is fitting psychoacoustic experiment data

Auditory Development
  • What really made audition?
  • How did our hearing evolve?
  • How did our environment shape our hearing?
  • Can we evolve, rather than instruct, a machine to listen?

Goals of our Sensory System
  • Distinguish independent events
      • Object formation
      • Gestalt grouping
  • Minimize thinking and effort
      • Perceive as few objects as possible
      • Think as little as possible

Entropy Minimization as a Sensory Goal
  • Long history between entropy and perception
      • Barlow, Attneave, Atick, Redlich, etc.
  • Entropy can measure statistical dependencies
  • Entropy can measure economy in both thought (algorithmic entropy) and information (Shannon entropy)

What is Entropy?

  • Shannon entropy: $H(x) = -\int P_x(x) \log P_x(x)\,dx$
  • A measure of: order, predictability, information, correlations, simplicity, stability, redundancy, ...
  • High entropy = little order
  • Low entropy = lots of order

Representation in Audition
  • Frequency decompositions
  • Cochlear hint
  • Easier to look at data!

  • Sinusoidal bases
  • Signal processing framework

Evolving a Representation
  • Develop a basis decomposition
  • Bases should be statistically independent
  • Satisfaction of minimal entropy idea

  • Decomposition should be data driven
  • Account for different domains

Method
  • Use bits of natural sounds to derive bases: reshape the signal $s_{k \times 1}$ into a matrix $S_{n \times m}$
  • Analyze these bits with ICA: $S = W X$, with $W(i)$ independent of $W(j)$ for all $i \neq j$

Results
  • We obtain sinusoidal bases!
  • Transform is driven by the environment
  • Uniform procedure for different domains

Auditory Grouping
  • Heuristics
      • Hard to implement on computers
      • Require even more heuristics to resolve ambiguity
  • Weak definitions
  • Bootstrapped to individual domains
  [Figure: vision Gestalt and auditory Gestalt examples: good continuation, common AM, common FM]

Method
  • Goal: find the grouping that minimizes scene entropy
  • Parameterized auditory scene: $s(t, n)$
  • Density estimation: $P_s(i)$
  • Shannon entropy calculation: $H(s) = -\sum_{i,\ldots,j} P_s(i,\ldots,j) \ln P_s(i,\ldots,j)$
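The chain on this slide (parameterized scene, density estimation, Shannon entropy) can be sketched in a few lines of Python. This is a minimal illustration and not the author's implementation: the histogram density estimate, the bin count, the random sampling of time points, the test frequencies, and the names `quantize` and `scene_entropy` are all assumptions introduced here.

```python
import math
import random
from collections import Counter

def quantize(value, bins, lo=-1.0, hi=1.0):
    """Map a value in [lo, hi] to an integer bin index in [0, bins - 1]."""
    idx = int((value - lo) / (hi - lo) * bins)
    return min(max(idx, 0), bins - 1)

def scene_entropy(components, bins=8):
    """Plug-in Shannon entropy (nats) of the joint histogram of the components.

    `components` is a list of equal-length sample sequences, one per scene
    element. The histogram is an assumed stand-in for the density estimate.
    """
    cells = Counter(
        tuple(quantize(v, bins) for v in frame) for frame in zip(*components)
    )
    total = sum(cells.values())
    return -sum((c / total) * math.log(c / total) for c in cells.values())

# Assumed toy scene: sample three sinusoids at random time points in [0, 1).
rng = random.Random(0)
times = [rng.random() for _ in range(5000)]
a = [math.cos(2 * math.pi * 100.0 * t) for t in times]  # fundamental
b = [math.cos(2 * math.pi * 200.0 * t) for t in times]  # its second harmonic
c = [math.cos(2 * math.pi * 173.0 * t) for t in times]  # unrelated frequency

h_related = scene_entropy([a, b])    # samples fall on a curve in the (a, b) plane
h_unrelated = scene_entropy([a, c])  # samples spread over the whole plane
print(h_related, h_unrelated)
```

Under these assumptions the harmonically related pair should yield the lower joint entropy, which is the kind of difference a grouping search can exploit.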

Common Modulation - Frequency
  • Scene description: $s(t, n) = \{\cos(r f_1(t)\, t),\ \cos(f_2(t)\, t)\}$, with $f_1 = f_2$ if $n = 0.5$
  • Entropy measurement
  [Figure: frequency vs. time plot of the scene; entropy measurement with $n = 0.5$ marked]
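A hedged sketch of the common frequency modulation experiment follows. The exact parameterization on the slide is not recoverable, so this code assumes one: both sinusoids share a sinusoidal frequency modulator, the first with a fixed depth of 0.5 and the second with depth n, so the two modulations coincide only at n = 0.5. The carrier `f0`, modulator rate `fm`, ratio `r`, the histogram entropy estimate, and the function names are likewise assumptions.

```python
import math
import random
from collections import Counter

def joint_entropy(xs, ys, bins=8):
    """Plug-in Shannon entropy (nats) of the 2-D histogram of two signals in [-1, 1]."""
    def cell(v):
        return min(int((v + 1.0) / 2.0 * bins), bins - 1)
    counts = Counter((cell(x), cell(y)) for x, y in zip(xs, ys))
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values())

def fm_scene(n, times, f0=200.0, fm=3.0, r=2.0):
    """Two FM sinusoids whose modulation depths agree only when n == 0.5 (assumed form)."""
    first, second = [], []
    for t in times:
        # Integral of the modulator sin(2*pi*fm*t), used to build the phase.
        wobble = -math.cos(2 * math.pi * fm * t) / (2 * math.pi * fm)
        phase1 = 2 * math.pi * f0 * (t + 0.5 * wobble)  # fixed depth 0.5
        phase2 = 2 * math.pi * f0 * (t + n * wobble)    # depth n
        first.append(math.cos(r * phase1))
        second.append(math.cos(phase2))
    return first, second

rng = random.Random(0)
times = [rng.random() for _ in range(5000)]  # random time points in [0, 1)

# Sweep n over 0.1, 0.2, ..., 0.9 and record the scene entropy at each value.
sweep = {k / 10: joint_entropy(*fm_scene(k / 10, times)) for k in range(1, 10)}
best_n = min(sweep, key=sweep.get)
print(best_n)
```

With this construction the two components are functionally related only when the modulation is common, so the entropy sweep should reach its minimum at n = 0.5.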

Common Modulation - Amplitude
  • Scene description: $s(t, n) = \{a_1(t) \cos(r f_0 t),\ a_2(t) \cos(f_0 t)\}$, with $a_1 = a_2$ if $n = 0.5$
  • Entropy measurement
  [Figure: sine 1 amplitude and sine 2 amplitude vs. time; entropy measurement with $n = 0.5$ marked]

Common Modulation - Onset/Offset
  • Scene description
  • Entropy measurement
  [Figure: sine 1 amplitude and sine 2 amplitude vs. time; entropy measurement with $n = 0.5$ marked]

Similarity/Proximity - Harmonicity I
  • Scene description: $s(t, n) = \{\cos(f_0 t),\ \cos(n f_0 t)\}$
  • Entropy measurement
  [Figure: frequency vs. time plot of the scene]

Similarity/Proximity - Harmonicity II
  • Scene description: $s(t, n) = \{\cos(f_0 t),\ \cos(2 f_0 t),\ \cos(n f_0 t)\}$
  • Entropy measurement
  [Figure: frequency vs. time plot of the scene]

Simple Scene Analysis Example
  • Simple scene:
      • 5 sinusoids
      • 2 groups
  • Simulated annealing algorithm
      • Input: raw sinusoids
      • Goal: entropy minimization
      • Output: expected grouping

Important Notes
  • No definition of time
  • Developed a concept of frequency

  • No parameter estimation requirement
  • Operations on data, not parameters
  • No parameter setting!

Conclusions
  • Elegant and consistent formulation
  • No constraint over data representation
  • Uniform over different domains (cross-modal!)
  • No parameter estimation
  • No parameter tuning!
  • Biological plausibility
      • Barlow et al. ...
  • Insight to perception development

Future Work
  • Good cost function?
      • Joint entropy vs. entropy of sums
      • Shannon entropy vs. Kolmogorov complexity
      • Joint statistics (cumulants, moments)
  • Incorporate time
      • Sounds have time dependencies I'm ignoring
  • Generalize to include perceptual functions

Teasers
  • Dissonance and entropy: $H(\text{5th} \mid \text{Pythagorean})$, $H(\text{5th} \mid \text{equal temperament})$, $H(\text{Maj chord})$, $H(\text{Min chord})$, $H(\text{Dim chord})$
  • Pitch detection: $\arg\min_{f} H(s(t), \cos(f(t)\, t))$
  • Instrument recognition: $\arg\min_{\text{template}} H(s(t), \text{template}(t))$
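The simple scene analysis example (five sinusoids, two groups, simulated annealing with entropy minimization as the goal) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the frequencies in `FREQS`, the cost (sum of the per-group joint entropies, with both groups required to be non-empty), the histogram entropy estimate, the single-label-flip move, and the cooling schedule. The talk does not state which cost or schedule was actually used.

```python
import math
import random
from collections import Counter
from functools import lru_cache

rng = random.Random(0)
TIMES = [rng.random() for _ in range(4000)]   # random time points in [0, 1)
FREQS = [100.0, 200.0, 300.0, 173.0, 346.0]   # assumed: harmonics of 100 Hz and of 173 Hz
SINES = [[math.cos(2 * math.pi * f * t) for t in TIMES] for f in FREQS]

def group_entropy(members, bins=8):
    """Plug-in joint entropy (nats) of the sinusoids whose indices are in `members`."""
    def cell(v):
        return min(int((v + 1.0) / 2.0 * bins), bins - 1)
    frames = zip(*(SINES[i] for i in members))
    counts = Counter(tuple(cell(v) for v in frame) for frame in frames)
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values())

@lru_cache(maxsize=None)
def scene_cost(labels):
    """Sum of per-group entropies for a labelling such as (0, 0, 1, 0, 1)."""
    groups = [[i for i, g in enumerate(labels) if g == k] for k in (0, 1)]
    if not all(groups):
        return math.inf                        # both groups must be non-empty
    return sum(group_entropy(tuple(g)) for g in groups)

def anneal(steps=2000, temp=2.0, cooling=0.995):
    """Simulated annealing over two-group labellings; returns the best one seen."""
    labels = tuple(rng.randrange(2) for _ in FREQS)
    while scene_cost(labels) == math.inf:      # redraw until both groups are used
        labels = tuple(rng.randrange(2) for _ in FREQS)
    best = labels
    for _ in range(steps):
        i = rng.randrange(len(labels))         # propose flipping one label
        trial = labels[:i] + (1 - labels[i],) + labels[i + 1:]
        delta = scene_cost(trial) - scene_cost(labels)
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            labels = trial
        if scene_cost(labels) < scene_cost(best):
            best = labels
        temp *= cooling                        # geometric cooling
    return best

best = anneal()
print(best)
```

With only five sinusoids the search space is small enough to enumerate, so annealing is used here only to mirror the slide. Under the assumed cost, the lowest-entropy labelling should place the three harmonics of 100 Hz in one group and the two harmonics of 173 Hz in the other.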
