Interesting comparison, but training only on generated mixtures of Gaussians severely limits the conclusions that can be drawn. Naturally, the method whose assumptions match the generating process had the best performance.
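For concreteness, the generating process is presumably something along these lines (a minimal numpy sketch; the means, covariance, and sample sizes are my own placeholders, not the post's actual parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mixture(means, cov, n):
    """Draw n points from an equal-weight mixture of Gaussians."""
    components = rng.integers(len(means), size=n)
    return np.array([rng.multivariate_normal(means[c], cov) for c in components])

# Two classes, each a two-component Gaussian mixture in 2D.
cov = np.eye(2)
X0 = sample_mixture([[0, 0], [3, 3]], cov, 500)  # class 0
X1 = sample_mixture([[0, 3], [3, 0]], cov, 500)  # class 1
X = np.vstack([X0, X1])
y = np.repeat([0, 1], 500)
```

Any method that models classes as Gaussian blobs gets this structure for free, which is exactly why the comparison flatters it.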
It is important to note that both an algorithm's performance (on whatever error metric you choose) and its runtime depend heavily on the source data in most cases.
I was impressed that random forests did so well at irrelevant-feature detection, given that they know nothing about Gaussians. Though IIRC, you've used them to win Kaggle competitions, so maybe you already know their power.
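For anyone who wants to poke at this themselves, the irrelevant-feature behavior is easy to reproduce with scikit-learn (a toy setup of my own, not the post's benchmark):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# 5 informative Gaussian features plus 5 pure-noise features.
n = 1000
X_informative = rng.normal(size=(n, 5))
y = (X_informative.sum(axis=1) > 0).astype(int)
X = np.hstack([X_informative, rng.normal(size=(n, 5))])

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# The noise features (last five columns) should get near-zero importances.
print(np.round(clf.feature_importances_, 3))
```

The importances for the noise columns come out close to zero, with no distributional assumptions anywhere in the model.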
The author of the blog post has promised more extensive tests with other datasets in future posts. Also, there is an ongoing GSoC project to set up systematic benchmark infrastructure plus performance optimizations for the scikit-learn project. So let's wait and see; we'll soon have more meat for a finer analysis.