Hacker News

Thank you.

In my entire existence as a business I have never seen someone buy X licenses plus Y [licenses plus CDs], and even explaining the difference to my customers would be tricky. So I opted to make it impossible -- if you ask for 5 copies of the software via download you get five copies; if you then ask for the CD instead, it sets you up for 5 copies and 5 CDs.

The cart actually remembers everything you've put into it within the current page, even if you close the lightbox. If I wanted to persist it beyond that there are options, but (a) they're dirty hacks and (b) most of my customers have no need for it.

Your sample size is also quite low (94% increase = from 8 conversions to 15), and you're using absolute values in the graphs. Did traffic to the page stay constant over all that time?

That last question appears to demonstrate a misconception about A/B tests. I did not test the old cart serially with the new cart -- I've done that sort of thing before, but the results are automatically suspect because factors other than the variable you're testing are constantly changing. An A/B test tests the old cart and the new cart at the same time -- when you open purchasing.htm Google flips a coin and cookies you up with the results. Heads you get the old cart. Tails you get the new cart. No matter how many times you go back you get the same cart (until I terminate the test, obviously).
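The coin-flip-plus-cookie mechanics can be sketched in a few lines. This is a hypothetical illustration, not GWO's actual implementation; the function and bucket names are invented. Hashing a stable visitor ID gives the same "stickiness" as a cookie: the flip happens once per visitor, not once per visit.

```python
import hashlib

def assign_variant(visitor_id: str, experiment: str = "cart-test") -> str:
    """Deterministically bucket a visitor into the old or new cart.

    Hashing the (experiment, visitor) pair stands in for the
    coin-flip-plus-cookie: the same visitor always lands in the
    same bucket for the lifetime of the experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return "old_cart" if int(digest, 16) % 2 == 0 else "new_cart"

# The same visitor gets the same cart on every visit:
assert assign_variant("visitor-123") == assign_variant("visitor-123")
```

Because assignment is deterministic in the visitor ID, both variants run against the same traffic mix at the same time, which is exactly what makes the comparison fair.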

This means that I'm able to have confidence in the results despite this week having traffic far above my typical values, due to Valentine's Day. (Certain holidays are almost always good to me. Why is outside the scope of this post.)

The sample size was not 8 or 15, incidentally. It was two groups of over a hundred (prospects, not customers). While I'd prefer groups of over a thousand for the obvious reason that it implies I'd sell ten times more software, in stats terms that doesn't make the experiment more valid; it would just decrease the size of the confidence intervals by roughly a factor of sqrt(10), and it might also increase the confidence in the significance test (that was the second 94% value, see the writeup).
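The sqrt(10) claim follows from the normal approximation for a proportion, where the interval half-width scales as 1/sqrt(n). A quick sketch (the 5% conversion rate is illustrative):

```python
import math

def ci_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Normal-approximation half-width of a 95% CI for a conversion rate."""
    return z * math.sqrt(p * (1 - p) / n)

# Ten times the traffic shrinks the interval by sqrt(10), nothing more:
wide = ci_half_width(0.05, 100)
narrow = ci_half_width(0.05, 1000)
print(wide / narrow)  # sqrt(10) ≈ 3.16
```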



The image clearly shows that the result is statistically insignificant. This is very important, as you could be misled. The general rule of thumb for running a full factorial A/B test (like the ones GWO does) is that you need a baseline of about a million page views with roughly a 5% conversion rate in order to get conclusive results in one week (including weekends).


My understanding of the term "full factorial" is that you have multiple design elements under test at the same time and you're testing all possible combinations. For example, you are testing two alternative images and two alternative headlines. This gives you 2 x 2 = 4 possibilities to show to any given user. As you increase the alternatives for each factor and increase the number of factors, the total number of alternatives grows in a combinatorially explosive fashion, and you might indeed need a million page views and 5% conversion rates. (Heck, with six factors of six options each, even with a million viewers you'd have fewer people seeing each combination than I did, unless you started pruning them early.)
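The six-factors-of-six-options aside is easy to verify with back-of-the-envelope arithmetic:

```python
# Full factorial: every combination of every option must be shown.
factors, options_per_factor = 6, 6
cells = options_per_factor ** factors     # 6**6 = 46,656 combinations
visitors_per_cell = 1_000_000 // cells    # ~21 visitors per combination
print(cells, visitors_per_cell)
```

Twenty-one visitors per combination is far below the "two groups of over a hundred" in the simple two-variant test, which is why the million-page-view rule of thumb only bites when the factor count is large.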

But I'm still only testing two alternatives of one factor. Yes, that is technically included in the definition of "full factorial", but it makes an absolute hash of that rule of thumb. Two choices total means the stats test is simple and does what it says on the tin: a 94% chance that the new cart outperforms the old cart, with the exact magnitude of the outperformance bounded by calculable confidence intervals.

You can consider 94% insignificant or significant -- your call, really. If you choose p = .05, it's insignificant; if you choose p = .1, it's significant. It costs me very little (except opportunity cost) to keep the experiment going, but 94% is good enough for me personally to claim a win out of it.
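A two-variant test like this reduces to a one-sided two-proportion z-test. A sketch with illustrative numbers -- the thread mentions 8 vs. 15 conversions, and the group sizes of 110 below are assumed, since the post only says "over a hundred":

```python
import math
from statistics import NormalDist

def one_sided_p(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """One-sided pooled z-test: does B's conversion rate beat A's?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return 1 - NormalDist().cdf((p_b - p_a) / se)

# Illustrative only: 8 vs 15 conversions in two groups of ~110 prospects
p = one_sided_p(8, 110, 15, 110)
print(p)  # lands between .05 and .1 -- roughly the "94% confident" case
```

With these numbers the p-value falls between the two thresholds, which is exactly the situation described: significant at p = .1, not at p = .05.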


Although I couldn't see a p-value anywhere in the image, I have verified for myself that the result is significant at the 95% level. And I agree with you that what counts as significance depends on perspective.

BTW, those wanting to know the math head to http://20bits.com/articles/statistical-analysis-and-ab-testi...


Sorry, I should have been more specific: I meant that you use absolute dollar values in your sales-by-month graph -- those must also depend on traffic. Obviously the A/B test does not.



