
How does multi microphone filtering work? I guess they localize different sound sources by cross-correlation (to get the timings) and triangulation (based on the timings and the speed of sound)?
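
For the timing part, here's a minimal sketch of estimating the time difference of arrival (TDOA) between two mics via cross-correlation (the chirp signal and all numbers below are made up for illustration):

    import numpy as np

    def estimate_tdoa(sig_a, sig_b, sample_rate):
        """Estimate how much sig_a lags sig_b, in seconds.

        The peak of the cross-correlation gives the shift that best
        aligns the two signals.
        """
        corr = np.correlate(sig_a, sig_b, mode="full")
        # Output index i corresponds to lag i - (len(sig_b) - 1).
        lag = int(np.argmax(corr)) - (len(sig_b) - 1)
        return lag / sample_rate

    # Synthetic demo: the same chirp reaches mic B 25 samples later.
    rate = 16000
    t = np.arange(rate) / rate
    chirp = np.sin(2 * np.pi * (200 + 400 * t) * t)
    mic_a = chirp
    mic_b = np.concatenate([np.zeros(25), chirp[:-25]])
    print(estimate_tdoa(mic_b, mic_a, rate))  # ~25/16000, about 1.6 ms

Each mic pair gives one TDOA, which constrains the source to a hyperbola; intersecting the curves from several pairs (your "triangulation", strictly speaking multilateration) gives a position estimate.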


I think the key phrase (or at least one key phrase) is "beamforming". A single microphone element has a fixed sensitivity pattern (e.g. it may be very directional, or equally sensitive in all directions). With multiple pickups, you can emulate different sensitivity patterns.
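
The simplest version is a delay-and-sum beamformer: delay each channel so that sound from a chosen direction adds up in phase, and the array behaves like a directional mic aimed that way. A rough sketch (linear array, far-field assumption, names made up for illustration):

    import numpy as np

    SPEED_OF_SOUND = 343.0  # m/s at room temperature

    def delay_and_sum(signals, mic_positions, angle, sample_rate):
        """Steer a linear array of omni mics toward `angle` (radians
        off broadside).

        signals:       (n_mics, n_samples), one row per mic
        mic_positions: (n_mics,) positions along the array axis, metres
        """
        # A plane wave from `angle` reaches mic x later by x*sin(angle)/c,
        # so advancing each channel by that amount aligns the wavefront.
        delays = mic_positions * np.sin(angle) / SPEED_OF_SOUND
        shifts = np.round(delays * sample_rate).astype(int)
        out = np.zeros(signals.shape[1])
        for sig, s in zip(signals, shifts):
            # Integer-sample steering for simplicity; real implementations
            # use fractional-delay (interpolating) filters.
            out += np.roll(sig, -s)
        return out / len(signals)

Sound from the steered direction adds coherently; sound from other directions adds out of phase and is attenuated.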

A related idea in radar is synthetic-aperture radar (SAR).


Great point.

A lot of the interesting things in audio were inspired by radar. Dan Wiggins at Sonos used to work on radar, and Don Keele created a loudspeaker technology called "CBT" that's based on radar research.

Because microphones are basically the inverse of loudspeakers, what works in loudspeaker arrays can also work in microphone arrays.


It's pretty neat! Here's how it works:

When you record with a single microphone, you are going to pick up a great deal of background noise. This is because the mic will pick up the person speaking AND the background noise; there's no way to differentiate the two.

With two microphones, we know the following:

1) we know where the microphones are

2) we have a general idea where the person's mouth is, because we know how they hold the phone

Based on that, we have a good idea of how long the sound should take to reach each mic, because the speed of sound is a fixed number.
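
A crude sketch of exploiting that (hypothetical layout: a "front" mic near the mouth and a "back" mic a few centimetres behind it, speech arriving along the mic axis; real phones use adaptive filters and do far better than this):

    import numpy as np

    SPEED_OF_SOUND = 343.0  # m/s

    def two_mic_enhance(front, back, mic_spacing, sample_rate):
        """Align the back channel by the known speech delay and average.

        Speech reaches the back mic ~mic_spacing/c after the front mic,
        so after undoing that delay the speech adds coherently while
        uncorrelated background noise partially cancels.
        """
        delay = int(round(mic_spacing / SPEED_OF_SOUND * sample_rate))
        aligned_back = np.roll(back, -delay)  # undo the known delay
        return 0.5 * (front + aligned_back)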

The first time I ever heard a dual mic phone was when one of my coworkers made a call from the inside of our data center. Typically, he'd have to shout into the phone, because the data center was so noisy, and worst of all, the noise was completely random and broadband. But with dual mics, poof, background noise is gone. It was almost like he was speaking in a quiet room.

Amazon Alexa takes this quite a bit further, and uses something called "beamforming." What beamforming allows you to do is to determine WHERE the person is in the room, based on the arrival times of the sound. It's sort of the inverse of a dual mic setup; in a dual mic setup we can 'clean up' the signal because we know where the person speaking is. In a beamforming arrangement, we can use the arrival times to FIGURE OUT where the person is in the room.
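
For a single pair of mics, the mapping from arrival-time difference to angle is simple under a far-field assumption (spacing and numbers made up for illustration):

    import numpy as np

    SPEED_OF_SOUND = 343.0  # m/s

    def doa_from_tdoa(tdoa, mic_spacing):
        """Direction of arrival (radians off broadside) from one mic pair.

        Geometry: path difference = mic_spacing * sin(theta) = c * tdoa.
        A single pair is front/back ambiguous, which is one reason
        smart speakers use a ring of several mics.
        """
        x = np.clip(SPEED_OF_SOUND * tdoa / mic_spacing, -1.0, 1.0)
        return np.arcsin(x)

    # Example: 7 cm spacing, 0.1 ms TDOA -> ~29 degrees off broadside
    print(np.degrees(doa_from_tdoa(1e-4, 0.07)))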

If some security company was clever, they could probably use a beamforming microphone array to train a camera on people in the room.

And keep in mind, Alexa's beamforming is two-dimensional, but you could go crazy and do a 3D beamforming array if you wanted to! (Alexa only knows where you are on a horizontal plane.)
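
You can see why the array's plane matters from the far-field math: each mic pair's TDOA only measures the component of the arrival direction along that pair's offset. A sketch of a least-squares direction fit (positions and TDOAs hypothetical):

    import numpy as np

    SPEED_OF_SOUND = 343.0  # m/s

    def direction_from_tdoas(mic_positions, tdoas):
        """Least-squares far-field arrival direction.

        mic_positions: (n_mics, 3) metres
        tdoas:         (n_mics - 1,) seconds, t_i - t_0 for i = 1..n-1
        A mic further along the source direction u hears the wavefront
        earlier, so (p_i - p_0) . u = -c * (t_i - t_0) for each pair.
        If every mic sits at the same height, the z column of A is all
        zeros and u's vertical component is unconstrained: azimuth only.
        """
        A = mic_positions[1:] - mic_positions[0]   # (n-1, 3)
        b = -SPEED_OF_SOUND * np.asarray(tdoas)    # (n-1,)
        u, *_ = np.linalg.lstsq(A, b, rcond=None)
        return u / np.linalg.norm(u)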


That does sound neat. Sounds like it could be combined with 3D localization to allow eavesdropping on particular sound sources, even e.g. in large rooms with lots of people talking. Multipath/ghosting might make precise localization difficult, though.


You can do this with ICA (independent component analysis, a somewhat lesser-known, non-Gaussian cousin of principal component analysis). Basically you take the multi-channel recording and break it down into its constituent source signals.
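
A minimal sketch with scikit-learn's FastICA on synthetic instantaneous mixtures (sources and mixing matrix made up for illustration). One caveat: plain ICA assumes instantaneous mixing, while real rooms add delays and reverberation, so audio separation typically uses convolutive/frequency-domain variants:

    import numpy as np
    from sklearn.decomposition import FastICA

    # Two synthetic sources: a sine "voice" and a sawtooth "noise".
    rng = np.random.default_rng(0)
    t = np.linspace(0, 8, 4000)
    s1 = np.sin(2 * np.pi * 1.0 * t)
    s2 = 2 * (t % 1.0) - 1.0  # sawtooth
    S = np.c_[s1, s2] + 0.05 * rng.standard_normal((len(t), 2))

    # Each "microphone" hears a different linear mix of the sources.
    A = np.array([[1.0, 0.5],
                  [0.4, 1.0]])
    X = S @ A.T

    # ICA recovers the sources up to permutation and scale.
    ica = FastICA(n_components=2, random_state=0)
    recovered = ica.fit_transform(X)  # columns approximate s1 and s2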



