OK, since you asked: > The sound wave produced by a piano note is a simple sine ...

tobr · on Nov 6, 2013

>> The sound wave produced by a piano note is a simple sine wave.

> No, it's not. A piano note is a complicated stack of overtones (some of which are harmonic and some of which aren't) and transients. If it was just a sine wave, it would sound like a sine wave and not like a piano.

This is really such a fundamental* aspect of frequencies and sound, it should have been the starting point of the article. "Look how noisy and squiggly this wave is, but with FFT we can see that it really only consists of waves at multiples of the same frequency".

* no pun intended

aatish · on Nov 6, 2013

This is actually really valuable feedback and one of the things I like about the HN audience. I struggle to strike a balance between precision and simplicity, and your comments raised many interesting subtleties that I had overlooked. So thanks for helping me get a better handle on this.

chinpokomon · on Nov 6, 2013

Auditory masking is more than that. The psychoacoustic model used for MP3s will discard some frequencies during others, like you said, but it will also discard "information" proceeding and following an acoustic event.

While not a perfect analog, it is similar to how vision works. If you are in a bright room and the lights go out, it takes you a little time to readjust to the darker room before you can see things again. While your irises were adjusting, the chair and take were still there, and light reflecting from those objects were still reaching your eyes, but you were unable to see them. It works the other way too. If you are still in that dark room and the lights come on, it will take your eyes a little time to readjust, less time then it took your eyes to adjust to the dark, but still your eyes are effectively discarding that information.

The psychoacoustic model in MP3s is even more bizarre. It turns out that some frequencies, when heard BEFORE another frequency, your brain will throw that first frequency away. It is unintuitive, but that has been proven in laboratory settings. Knowing how the majority of humans discard the same auditory information under different conditions, MP3s are compressed by throwing away the same information that the brain would normally discard. The Fourier transform is a significant part, but it isn't the whole story.

pistle · on Nov 6, 2013

I love the FFT even more than you and enjoyed that it is getting lauded, but would have found more algorithmic details even more interesting. Breaking down DFT, etc. and then showing the performance magic of FFT is a great way to approach discussion of many issues in problem analysis and algorithm design.

So, Nice enough article for slipping into the topic - now give me more! harder! faster!

exg · on Nov 6, 2013

Sparse fast Fourier transform is even more "magical" than fast Fourier transform (FFT). If you assume that the discrete Fourier transform (DFT) has only k non zero coefficients, then, there exists an algorithm to compute it in O(k log(n)). That's right, you do not have to see the entire signal to compute the DFT, which is pretty awesome.

If you are interested, see http://groups.csail.mit.edu/netmit/sFFT/ .

susi22 · on Nov 6, 2013

I work in this field (though I'm none of the authors). If anybody has any questions about the sFFT, let me know.

taejo · on Nov 6, 2013

> This is actually the exact same number of numbers as in the time-domain series.

In a way, it's twice as many, since they have both real and imaginary components.

kkjkok · on Nov 6, 2013

Nope - the amount of numbers is in fact identical. The discrete Fourier transform has redundant information in bins above NFFT/2 (where NFFT is the size of the transform/number of samples in the signal) for purely real valued input. This means the "number of numbers" is the same - NFFT/2 bins of necessary complex values after transformation, versus NFFT real values in the original series.

taejo · on Nov 11, 2013

Ah yeah, forgot about that. Thank you.