A/B tests are often seen as the ultimate method for an evidence-based and data-driven approach to designing websites and apps. This is because they are the purest form of scientific testing: a straight comparison of one design against another. They're not always the right solution though, as I'll explain.
The theory behind them is that you take a new web page design (version B) and serve it up to some of your users whilst showing the rest of your users the original design (version A). The differences could be anything from a new version of a button to a complete page redesign. You then measure which version provides a better rate of conversion, and the winner is put live on the site for all of your users.
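To make the "serve it to some of your users" part concrete, here is a minimal sketch of one common way to split users: hashing a user ID into a bucket so that the same person always sees the same version. Testing tools handle this for you, and this is not how any particular tool does it; the function name `assign_variant`, the experiment name and the 50/50 split are all assumptions made for illustration.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, share_b: float = 0.5) -> str:
    """Return "A" or "B" for a user, giving the same answer on every call."""
    # Hash the (hypothetical) experiment name together with the user ID so
    # that different experiments split the same users independently.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Turn the first 8 hex digits into a number in the range [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    # Users whose bucket falls below share_b see the new design.
    return "B" if bucket < share_b else "A"


if __name__ == "__main__":
    for uid in ["user-1", "user-2", "user-3"]:
        print(uid, assign_variant(uid, "product-page-redesign"))
```

Because the assignment depends only on the ID and the experiment name, a returning visitor stays in the same group without anything extra being stored.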
You can also measure secondary goals and other interactions to see if the new version has had an effect on more than just your main conversion rate. Something might not increase page conversions but might improve another desirable metric. You can also do multivariate testing, where you test more than just two options.
In principle this type of testing allows you to measure the success of your designs in the real world and with your actual users. In practice it is somewhat more complex than that (see the 'watch out for' section) and isn't something that should be undertaken lightly. Doing so risks inaccurate results and can lead you to make the wrong decisions for your product, so it's well worth getting a professional data analyst to do the work.
You're going to need to install some code from your chosen testing tool and, as with analytics, it's a pretty straightforward task of copying and pasting. Once it's installed you can use the software to set up your A/B tests.
Define the hypothesis for your test. What are you changing and what do you think it will do? What is the primary metric you are looking to change? Are there any secondary metrics you’d be happy with improving?
Work out how much traffic you're going to need to get a result, and thus how long you need to run the test for. There are calculators to help you do this. This is important as many websites won’t actually have enough traffic (see below) and you could discover that A/B testing is an impractical choice.
Check that the test works on a few different browsers and is being shown to the right subset of users. You often don’t want everyone to see a new variation—a lot of the time it makes sense to show changes to new users and not change the experience for returning ones.
Set the test running and try to leave it alone for the duration of the test. It's worth checking every so often to be sure it hasn't been a disaster that needs stopping early. Otherwise let it go and don't be fooled into thinking you've got a result until you've had the required number of people go through the test (the number your calculator indicated).
When you get a result, roll out the winner, in the exact form it was tested. Sometimes this will mean sticking to the existing design. Quite often there will be no meaningful difference between the two versions, so in theory it's your choice as to which you go forward with.
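As a rough illustration of what "a meaningful difference" means once the test has finished, the sketch below runs a two-proportion z-test on the final counts. This is one standard textbook test, not necessarily what your testing tool uses, and the visitor and conversion numbers in the example are made up purely for illustration.

```python
from math import erf, sqrt
from typing import Tuple


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that A and B convert equally well.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF computed from the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


if __name__ == "__main__":
    # Made-up counts: 30,000 visitors per variation, 5% vs 5.5% conversion.
    z, p = two_proportion_z_test(1500, 30_000, 1650, 30_000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A p-value below your chosen threshold (0.05 for 95% significance) suggests the difference is unlikely to be chance alone, but only if you ran the test for the planned sample size rather than stopping when the numbers looked good.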
Not having enough traffic is the biggest problem for a lot of startups and small sites, and it's not as simple as knowing how much traffic you get to the website overall. Even with 100,000 users a month you may not have enough traffic to run the test you want in a reasonable time. Let me explain through an example:
Let's say you want to get more users to reach checkout from your product page and so you redesign it. The current conversion rate of that page is 5%, and to consider the redesign a success you want that to increase by 10% in relative terms, to 5.5%. This means that with a statistical significance of 90% (which isn't amazing; 95% is more commonly used) you need 30,000 individual visitors per variation to go through your test before you can be confident whether it's 10% better.
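If you want to see roughly where a number like this comes from, the sketch below uses a common textbook approximation for comparing two conversion rates. It assumes a two-sided test and 80% statistical power, neither of which is stated above, and online calculators use different formulas and defaults, so don't expect it to reproduce any particular calculator's figure exactly. For a 5% baseline and a 10% relative lift it lands in the tens of thousands of visitors per variation.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variation(baseline: float, relative_lift: float,
                              confidence: float = 0.95, power: float = 0.80) -> int:
    """Approximate visitors needed per variation to detect the given lift."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    # z-score for a two-sided test at the chosen confidence level.
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # z-score for the chosen statistical power (an assumed 80% by default).
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    for conf in (0.90, 0.95):
        n = sample_size_per_variation(0.05, 0.10, confidence=conf)
        print(f"{conf:.0%} confidence: about {n:,} visitors per variation")
```

Note how sensitive the result is to the lift you want to detect: halving the lift roughly quadruples the traffic you need.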
With two variations in an A/B test, you'll need 60,000 users in total to go through your test. If you have decent traffic of 100,000 users visiting your site per month and your product pages only get about a third of that traffic, then you're going to need to run that test for around two months before you have a result.
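The arithmetic is simple enough to check for your own site. This small sketch uses the figures from the example; the function name `months_to_run` is just an illustrative choice, and it assumes traffic is steady from month to month.

```python
from math import ceil


def months_to_run(visitors_per_variation: int, variations: int,
                  monthly_site_visitors: int, share_reaching_page: float) -> int:
    """Whole months needed for enough visitors to pass through the test."""
    total_needed = visitors_per_variation * variations
    # Only the visitors who actually reach the tested page count.
    monthly_in_test = monthly_site_visitors * share_reaching_page
    return ceil(total_needed / monthly_in_test)


# 60,000 needed / roughly 33,333 product-page visitors per month -> 2
print(months_to_run(30_000, 2, 100_000, 1 / 3))
```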
Here's the problem. Two months is a long time for a lot of companies and they would probably be better off gathering several other forms of evidence (such as visitor recordings, guerrilla user testing, conversion funnels etc) in that time, which will give lots of areas for improvement.
The biggest problem with A/B testing is that people use it at the wrong time. Too often they have already redesigned their site and built it and then are just testing to see how much better it is than the current version.
A/B tests are often run at the very end of a project, when a company has already put the time and money in and is not interested in knowing whether the new design is actually worse. They just want a number to boast about how much better the new one is.
Ultimately, running A/B tests properly requires a good knowledge of statistics and prior experience of doing it. There are lots of things to understand, like sample sizes, statistical significance, statistical power, one- and two-tailed tests and more, to know whether you're doing the right things.
Taking a punt and doing it on your own almost guarantees that you'll make mistakes. I know, I've been there. Some software can be very reassuring and make you think you're getting great results, but when you come to launch the winner you're left with something that doesn't work.
There are many other things to watch out for in A/B testing, which are solved when you get an A/B testing pro to help you out.
There are a lot of tools out there now for running A/B tests at many different price points. I used to use Optimizely, but it has since become more of an enterprise solution and is not so budget-friendly. I’d recommend digging into an article like this to find the right tool for you.
Once your designs are ready, setup should be a matter of an hour or two. Running a test can take a long time (often several weeks).