I am trying to do some statistical analysis of various A/B tests to determine which alternative is better, and I have found conflicting information on how to do it.
First, I'm interested in several different kinds of tests:
- Tests that measure success by counting events, such as conversions or sent emails
- Tests that measure success by summing revenue
- Tests that have only two alternatives (control and new)
- Tests that have several alternatives (control and several new ones)
I was hoping to find a simple set of formulas or rules for this analysis, but found more questions than answers.
This site says you cannot directly compare the alternatives in a multi-variant test; you can only make pairwise comparisons, and use a chi-square test to see whether the test as a whole is statistically significant or not.
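To make the whole-test chi-square check concrete, here is a minimal sketch for count-based tests. The function names (`chi2_sf`, `chi_square_test`) and the counts in the example are made up for illustration, and I am assuming every visitor either converts or does not, so each alternative contributes two cells to the table. The p-value comes from a plain series expansion of the incomplete gamma function so that nothing beyond the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_test(conversions, visitors):
    """Pearson chi-square test that all alternatives share one conversion rate.

    conversions[i] and visitors[i] are the counts for alternative i.
    Returns (statistic, degrees_of_freedom, p_value).
    """
    pooled_rate = sum(conversions) / sum(visitors)
    stat = 0.0
    for c, n in zip(conversions, visitors):
        # Two cells per alternative: converted and not converted.
        cells = ((c, n * pooled_rate), (n - c, n * (1.0 - pooled_rate)))
        for observed, expected in cells:
            stat += (observed - expected) ** 2 / expected
    df = len(conversions) - 1
    return stat, df, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical counts for control plus three new alternatives.
    stat, df, p = chi_square_test([120, 135, 150, 128], [2000, 2000, 2000, 2000])
    print(f"chi-square = {stat:.3f}, df = {df}, p = {p:.4f}")
```

A small p-value here only says that the alternatives do not all share one rate. It does not say which one is the winner, which is where the pairwise comparisons come in.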
This site offers a way to test A/B/C/D (starting at slide 74) by analyzing the results with a G-test (which the author says is related to the chi-square test), but it is unclear on the details of using the fudge factor. It also suggests that you can use the A/B/C/D approach to eliminate alternatives until you are left with a clear winner in an A/B comparison.
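As far as I understand it, the G-test statistic for count data can be sketched as below. This is my own reading, not the method from the slides: the names `g_statistic` and `is_significant` and the example counts are hypothetical, and no correction factor is applied because I could not work out the details. The statistic is compared against standard tabulated chi-square critical values at the 5% level, with degrees of freedom equal to the number of alternatives minus one.

```python
import math

# Tabulated chi-square critical values at the 5% significance level, keyed by df.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def g_statistic(conversions, visitors):
    """G = 2 * sum(observed * ln(observed / expected)) over all table cells."""
    pooled_rate = sum(conversions) / sum(visitors)
    g = 0.0
    for c, n in zip(conversions, visitors):
        # Two cells per alternative: converted and not converted.
        cells = ((c, n * pooled_rate), (n - c, n * (1.0 - pooled_rate)))
        for observed, expected in cells:
            if observed > 0:  # an empty cell contributes nothing to the sum
                g += observed * math.log(observed / expected)
    return 2.0 * g


def is_significant(conversions, visitors):
    """True if G exceeds the 5% critical value (supports up to four alternatives)."""
    df = len(conversions) - 1
    return g_statistic(conversions, visitors) > CRITICAL_05[df]


if __name__ == "__main__":
    # Hypothetical counts for alternatives A, B, C, D.
    conversions = [120, 135, 150, 128]
    visitors = [2000, 2000, 2000, 2000]
    print("G =", round(g_statistic(conversions, visitors), 3))
    print("significant at 5%:", is_significant(conversions, visitors))
```

The same two functions work for a plain A/B comparison by passing two-element lists, so the elimination idea could be tried by dropping an alternative and re-running the test on the remaining ones.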