Sunday, September 25, 2011

A/B Testing

Many websites use a type of software testing called A/B testing or split testing. The objective of this testing is to measure and improve the user experience on the website. It involves testing distinct design or layout options against each other. A/B testing is also applied to things outside the website itself, such as emails. Many large companies use A/B testing. So, what is A/B testing?

A/B Testing

A/B testing compares two content or design options on real users, using web analytics tools, to determine which one performs better. It requires an existing benchmark to measure against. For example, let us say that you test an ecommerce website. Looking at the website logs or analytics, you know that only 20% of the users who start a checkout process actually complete it. You suspect that your multi-page checkout process could be a cause of checkout abandonment. Instead of directly changing the checkout process to single-page, you decide to execute an A/B test. For this, you set up two checkout options: the current multi-page checkout and the new single-page checkout. From now on, 50% of the users who start the checkout get the multi-page option and the other 50% get the single-page option. In order not to confuse the users, you record which option was presented to which user and continue to present the same option to each user on repeat visits. You monitor the test results for a month. Then you analyze the results and take the necessary action.
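The 50/50 split with a consistent option on repeat visits can be sketched in a few lines of Python. This is only an illustration under some assumptions: instead of recording each user's option in a database as described above, it derives the option from a hash of the user ID, which gives the same answer on every visit without storing anything. The names `assign_variant`, `VARIANTS` and the experiment label `checkout_layout` are made up for this example.

```python
import hashlib

# Hypothetical labels for the two checkout options in the example.
VARIANTS = ("multi_page", "single_page")


def assign_variant(user_id: str, experiment: str = "checkout_layout") -> str:
    """Map a user to one variant, always the same one for the same user.

    Hashing the experiment name together with the user ID spreads users
    roughly evenly across the variants and keeps assignments independent
    between different experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the hash as an integer bucket number.
    bucket = int.from_bytes(digest[:8], "big") % len(VARIANTS)
    return VARIANTS[bucket]


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_variant(uid))
```

If you do store the assignment (for example, in a cookie or a user table), the stored value should take priority so that a user's option does not change if the list of variants changes later.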

Possible test results
1. Multi-page Complete = 10%, Single-page Complete = 30%: This is strongly in favor of the single-page option. So, you decide on deploying this option.
2. Multi-page Complete = 18%, Single-page Complete = 22%: This is a minor difference. Perhaps the cause of low checkout completion is not the number of pages. So, you decide to look for other elements and then run a different A/B test.
3. Multi-page Complete = 28%, Single-page Complete = 12%: This is in favor of the existing option. So, the number of pages is not the cause of low checkout completion. You decide to test another element, e.g. the actual text on the checkout pages.
4. Multi-page Complete = 10%, Single-page Complete = 12%: This is marginally in favor of the new option. But, more urgently, you need to find out why both options are performing below the 20% benchmark.
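Whether a difference such as 18% vs. 22% is "minor" depends on how many users were in the test. One common way to check this is a two-proportion z-test. The sketch below is one possible way to do it using only the Python standard library; it is not the only valid analysis, and the user counts (200 per option) are assumed purely for illustration, since the scenarios above give only percentages.

```python
import math


def two_proportion_z_test(completed_a, total_a, completed_b, total_b):
    """Return (z, two-sided p-value) for the difference in completion rates.

    Uses the pooled-proportion z-test, which is a reasonable approximation
    when each group has a fair number of completions and non-completions.
    """
    rate_a = completed_a / total_a
    rate_b = completed_b / total_b
    pooled = (completed_a + completed_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        # Both groups completed at 0% or 100%: no measurable difference.
        return 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Assumed counts: 200 users per option.
    # Result 1: 10% vs. 30% completion.
    print(two_proportion_z_test(20, 200, 60, 200))
    # Result 2: 18% vs. 22% completion.
    print(two_proportion_z_test(36, 200, 44, 200))
```

A small p-value (0.05 is a commonly used threshold) suggests that the difference is unlikely to be due to chance alone. With the assumed 200 users per option, result 1 passes this threshold comfortably while result 2 does not.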

Points to keep in mind
a. The two options should have only a very limited number of differences. If there are many differences, it is hard to pinpoint which element caused the improvement.
b. The sample sizes in A/B testing should be large enough for the results to be statistically significant. For example, results based on an A/B test on 200 users can be meaningful. Results from a test on just 5 users cannot.
c. The two options should be tested simultaneously, rather than one after the other. If they are tested consecutively, other factors (e.g. changed user patterns or demographics) may come into play and skew the results.
d. Instead of just two options, multiple options can be tested. In this case, it will be called A/B/N testing (for N options).
e. Each and every element of the web application interface is a candidate for A/B testing. The elements to be considered include home page, actual text, font faces and sizes, colors, images, links, linked pages, placement of elements and so on.
f. A/B testing should be done on an ongoing basis. After improving one element of the user experience, the next should be targeted and tested.
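Regarding point b, the number of users you need depends on how small a difference you want to detect. The following sketch estimates the users needed per option using a standard textbook approximation for comparing two proportions. The function name `users_per_variant` and the default significance level (5%) and power (80%) are assumptions chosen for this example, not fixed rules.

```python
import math
from statistics import NormalDist


def users_per_variant(baseline, expected, alpha=0.05, power=0.8):
    """Approximate users needed in EACH option to detect the given change.

    baseline and expected are completion rates between 0 and 1, e.g. 0.20
    for the 20% benchmark. Assumes a two-sided test and equal group sizes.
    """
    if baseline == expected:
        raise ValueError("baseline and expected rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    mean_rate = (baseline + expected) / 2
    numerator = (
        z_alpha * math.sqrt(2 * mean_rate * (1 - mean_rate))
        + z_beta * math.sqrt(
            baseline * (1 - baseline) + expected * (1 - expected)
        )
    ) ** 2
    return math.ceil(numerator / (expected - baseline) ** 2)


if __name__ == "__main__":
    # Detecting a jump from 20% to 30% completion.
    print(users_per_variant(0.20, 0.30))
    # Detecting a smaller change, from 20% to 22%.
    print(users_per_variant(0.20, 0.22))
```

Under these assumptions, a large change (20% to 30%) can be detected with a few hundred users per option, while a small change (20% to 22%) needs users on the order of thousands. This is why the length of the test should be planned around your traffic and the size of the improvement you care about.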

I hope that you now understand A/B testing. There is an A/B test currently in progress on the home page of this blog. Can you spot it?