If you make an interface that automatically runs this in visitors' browsers and populates a form that users can submit (or, alternatively, simply makes an AJAX call), you can collect results from hundreds of different OS / browser version / hardware combinations.
Aggregating these numbers should give you really nice insights into different VMs, platforms, etc.
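A rough sketch of what that could look like, in case it helps. Everything here is an assumption for illustration: the `/benchmark-results` endpoint, the function names, and the choice of grouping by user agent string are all made up, and a real setup would want warm-up runs and more careful timing.

```javascript
// Hypothetical endpoint; nothing in this thread names a real one.
const RESULTS_URL = "/benchmark-results";

// Time `fn` over a fixed number of iterations and derive ops/sec.
function measure(fn, iterations) {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const elapsedMs = performance.now() - start;
  const opsPerSec = elapsedMs > 0 ? iterations / (elapsedMs / 1000) : null;
  return { iterations, elapsedMs, opsPerSec };
}

// Attach environment details so results can be grouped later.
// In a browser, pass `navigator` as `env`.
function buildReport(name, result, env) {
  return {
    name,
    ...result,
    userAgent: env.userAgent,
    platform: env.platform,
    cores: env.hardwareConcurrency,
  };
}

// Send the report as JSON. `send` defaults to the global fetch and is
// a parameter only so it can be swapped out when testing.
async function submitReport(report, send = fetch) {
  return send(RESULTS_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(report),
  });
}

// Server-side (or offline) aggregation: mean ops/sec per user agent.
function averageByUserAgent(reports) {
  const groups = new Map();
  for (const r of reports) {
    const g = groups.get(r.userAgent) || { total: 0, count: 0 };
    g.total += r.opsPerSec;
    g.count += 1;
    groups.set(r.userAgent, g);
  }
  return new Map([...groups].map(([ua, g]) => [ua, g.total / g.count]));
}
```

In a page you would call something like `submitReport(buildReport("my-test", measure(myTest, 100000), navigator))` once the benchmark finishes, then run the aggregation over whatever the server has stored.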
I considered this, but there weren't any existing services that did this and let me write the type of tests I needed (and I didn't want to spend unnecessary time building one). I also ran the benchmarks on Firefox, and though the numbers varied, the overall trend was the same. I don't imagine it would be much different for IE.
Given the motivation of the benchmark, Chrome was my main concern. :)
That's one of the ones I tried, but I needed some tests to create new objects, and others to reuse the same object which was already created previously, and I wanted to compare the numbers together rather than in isolation.
To use jsperf, I would have had to come up with many permutations of each test at the very least. Not ideal.
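To make the "compare together rather than in isolation" point concrete, here is a minimal sketch of the shape I mean. This is not the actual benchmark code; the names `timeIt` and `compare` and the two test cases are invented for illustration, and a JIT may optimize trivial bodies like these in ways that skew the numbers.

```javascript
// Run `fn` a fixed number of times and return the elapsed milliseconds.
function timeIt(fn, iterations) {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  return performance.now() - start;
}

// Time every case with the same iteration count and report each one
// relative to the fastest, so the cases are read side by side.
function compare(cases, iterations) {
  const rows = Object.entries(cases).map(([name, fn]) => ({
    name,
    ms: timeIt(fn, iterations),
  }));
  const fastest = Math.min(...rows.map((r) => r.ms));
  return rows.map((r) => ({
    ...r,
    relative: fastest > 0 ? r.ms / fastest : null,
  }));
}

// One object created up front and reused by the second case.
const reused = { x: 0, y: 0 };

const cases = {
  "new object each time": () => {
    const p = { x: 1, y: 2 };
    return p.x + p.y;
  },
  "reuse existing object": () => {
    reused.x = 1;
    reused.y = 2;
    return reused.x + reused.y;
  },
};

console.table(compare(cases, 1e6));
```

The point of the single `compare` call is that the shared object lives outside the individual cases, which is the part that is awkward to express as separate isolated tests.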