Mobile A/B testing can be a powerful tool for improving your app. It compares two versions of the app to see which one performs better. The result is insightful data about which version wins, along with a direct link between the change and its effect. Many of the top apps in the mobile industry use A/B testing to confirm how the improvements or changes they make directly affect user behavior.
Although A/B testing is becoming much more common in the mobile industry, many teams are still unsure how to work it into their strategies effectively. There are many getting-started guides, but they skip pitfalls that are easy to avoid, especially on mobile. Below are six common mistakes and misunderstandings, and how to avoid them.
1. Not tracking events throughout the funnel
This is one of the easiest and most common mistakes teams make with mobile A/B testing today. Teams often run tests that focus on improving just one metric. There is nothing inherently wrong with that, but they need to be sure the change does not hurt their key performance indicators, such as additional sales or other business results.
For example, suppose your team is trying to increase the number of users who register for the app. They theorize that removing email registration and allowing only Facebook/Twitter logins will increase total registrations, because users will not have to type in usernames and passwords manually. They track the number of users who register in the variant with email and in the variant without it. After the test, they see that the total number of registrations did increase. The test is considered a success, and the team rolls the change out to all users.
The problem, however, is that the team does not know how the change affects other important metrics such as engagement, retention, and sales. Because they only tracked registrations, they don’t know how it affects the rest of the app. What if users who sign up with Twitter uninstall the app soon after installing it? What if users who sign up with Facebook buy fewer premium features because of privacy concerns?
To avoid this, teams only need to put a few simple checks in place. When running a mobile A/B test, be sure to track metrics further down the funnel as well, so you can see what happens in its later stages. This gives you a better picture of how a change affects user behavior throughout the app and helps you avoid a simple mistake.
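One way to picture this kind of check is a per-variant funnel summary. The sketch below is only an illustration under stated assumptions: the event names (signup, day7_return, purchase), the variant labels, and the sample data are all hypothetical, and a real app would read events from its own analytics store.

```python
from collections import defaultdict

# Hypothetical event log: (user_id, variant, event_name).
EVENTS = [
    ("u1", "control", "signup"),
    ("u1", "control", "day7_return"),
    ("u1", "control", "purchase"),
    ("u2", "control", "signup"),
    ("u3", "social_only", "signup"),
    ("u4", "social_only", "signup"),
    ("u4", "social_only", "day7_return"),
    ("u5", "social_only", "signup"),
]
# Hypothetical number of users assigned to each variant.
ASSIGNED = {"control": 4, "social_only": 4}
# Funnel steps to report, from the top of the funnel downward.
FUNNEL = ["signup", "day7_return", "purchase"]


def funnel_rates(events, assigned, funnel):
    """Return, per variant, the share of assigned users who reached each step."""
    users = defaultdict(set)  # (variant, event) -> distinct user ids
    for user_id, variant, event in events:
        users[(variant, event)].add(user_id)
    return {
        variant: {step: len(users[(variant, step)]) / n for step in funnel}
        for variant, n in assigned.items()
    }


rates = funnel_rates(EVENTS, ASSIGNED, FUNNEL)
for variant, steps in rates.items():
    print(variant, steps)
```

In this made-up data the social-only variant wins on registrations but loses on purchases, which is exactly the kind of trade-off that tracking registrations alone would hide.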
2. Ending tests too early
Having access to (near) instant analytics is great. I love being able to open Google Analytics and see how traffic is routed to specific pages, as well as overall user behavior. However, that is not necessarily a good thing for mobile A/B testing.
Because testers are eager to check results, they often stop tests far too early, as soon as they see a difference between variants. Don’t fall victim to this. Here’s the problem: statistics are most accurate when given time and many data points. Many teams run a test for a few days and constantly check their dashboards to see progress. As soon as they get data that confirms their hypothesis, they stop the test.
This can lead to false positives. Tests need time and many data points to be accurate. Imagine you flip a coin five times and get heads every time. Unlikely, but not unreasonable, right? You might then wrongly conclude that a flipped coin lands on heads 100% of the time. If you flip the coin 1,000 times, the chance of getting all heads is far lower. With more flips, you are much more likely to arrive at a good estimate of the true probability of heads. The more data points you have, the more accurate your results will be.
To minimize false positives, it’s best to design the experiment to run until both a predetermined number of conversions and a predetermined amount of time have been reached. Otherwise, you greatly increase your chances of a false positive. You don’t want to base future decisions on faulty data because you stopped an experiment early.
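The coin example can be put into numbers. The sketch below assumes a fair coin and uses two helper functions that are named here purely for illustration: one gives the chance of an all-heads streak, the other the standard error of the observed heads rate, which is a common measure of how far an estimate tends to sit from the true value.

```python
import math


def prob_all_heads(flips: int) -> float:
    """Chance that a fair coin lands heads on every one of `flips` flips."""
    return 0.5 ** flips


def standard_error(p: float, flips: int) -> float:
    """Standard error of the observed heads rate after `flips` flips."""
    return math.sqrt(p * (1 - p) / flips)


for flips in (5, 1000):
    print(
        f"{flips} flips: P(all heads) = {prob_all_heads(flips):.3g}, "
        f"standard error = {standard_error(0.5, flips):.3f}"
    )
```

Five heads in a row happens about 3% of the time (1 in 32), so it is rare but far from impossible. At 1,000 flips an all-heads run is effectively impossible, and the standard error of the estimate is much smaller than at five flips.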
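The effect of stopping at the first promising result can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not a rigorous analysis: it runs an A/A test (both arms share the same true conversion rate, so any "significant" difference is a false positive) and uses a standard two-proportion z-test at a 5% threshold. The parameter values and function names are hypothetical.

```python
import math
import random


def z_stat(conv_a, conv_b, n):
    """Two-proportion z statistic for two arms with n users each."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return 0.0  # no variation yet, so no evidence of a difference
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    return (conv_a - conv_b) / se


def simulate(runs=300, looks=10, users_per_look=100, rate=0.10, seed=7):
    """Return (false positive rate with peeking, rate with one final check)."""
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% significance threshold
    stopped_early = fixed_horizon = 0
    for _ in range(runs):
        conv_a = conv_b = n = 0
        peeked_significant = False
        for _ in range(looks):
            conv_a += sum(rng.random() < rate for _ in range(users_per_look))
            conv_b += sum(rng.random() < rate for _ in range(users_per_look))
            n += users_per_look
            significant = abs(z_stat(conv_a, conv_b, n)) > z_crit
            # A peeking team would have stopped at the first significant look.
            peeked_significant = peeked_significant or significant
        stopped_early += peeked_significant
        fixed_horizon += significant  # only the final look counts here
    return stopped_early / runs, fixed_horizon / runs


peeking, fixed = simulate()
print(f"false positives when peeking: {peeking:.1%}")
print(f"false positives with one final check: {fixed:.1%}")
```

Because the final look is also one of the peeks, the peeking rate can never be lower than the fixed-horizon rate, and in practice it is usually well above it.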
So how long should you run the experiment? It depends. Airbnb explains below:
So how long should experiments run? To prevent a false negative (a Type II error), the best practice is to determine the minimum effect size you care about and calculate, based on the sample size (the number of new samples coming in each day) and the certainty you want, how long to run the experiment, before you start it. Setting the time in advance also minimizes the likelihood of finding a result where none exists.
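The calculation described in the quote can be sketched as follows. This is one common textbook approximation for comparing two conversion rates, not necessarily the method Airbnb uses; the baseline rate, minimum effect, significance level, power, and daily traffic are all hypothetical inputs.

```python
import math
from statistics import NormalDist


def users_per_variant(baseline, min_effect, alpha=0.05, power=0.80):
    """Approximate users needed per variant to detect an absolute lift.

    baseline:   current conversion rate, e.g. 0.10
    min_effect: smallest absolute change worth detecting, e.g. 0.02
    alpha:      accepted false positive rate (two-sided)
    power:      chance of detecting the effect if it is real
    """
    p1, p2 = baseline, baseline + min_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_run(n_per_variant, daily_new_users, variants=2):
    """Days needed to fill every variant at the given daily traffic."""
    return math.ceil(n_per_variant * variants / daily_new_users)


n = users_per_variant(baseline=0.10, min_effect=0.02)
print(f"users per variant: {n}")
print(f"days at 500 new users per day: {days_to_run(n, 500)}")
```

With these example inputs the formula asks for roughly 3,800 users per variant. Halving the minimum effect you care about roughly quadruples that number, which is why the effect size should be decided before the test starts.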
3. Not creating a test hypothesis
A/B testing is most effective when it is done scientifically. Remember the scientific method you learned in elementary school? You want to control extraneous variables and isolate the changes between variants as much as possible. Most importantly, you want to create a hypothesis.
Our goal with A/B testing is to form a hypothesis about how a change will affect user behavior, then test it in a controlled environment to establish cause and effect. That is why creating a hypothesis is so important. A hypothesis lets you decide which metrics to track and which indicators to look for as signs of a change in user behavior. Without one, you are just throwing spaghetti at the wall to see what sticks, instead of gaining a deeper understanding of your users.
Create a good hypothesis by writing down which metrics you think will change and why. If you are adding an onboarding tutorial to a social app, you might hypothesize that it will reduce the bounce rate and increase engagement metrics such as messages sent. Don’t skip this step!
4. Implementing changes based on other apps’ test results
When reading about A/B tests run by other apps, it is best to take the results with a grain of salt. What works for a competitor or a similar app may not work for you. Every app’s audience and feature set is unique, so assuming your users will respond the same way is an understandable but critical mistake.
One of our customers wanted to test a change similar to one a competitor had made, to see its impact on users. Their product is a simple, easy-to-use dating app that lets users browse other users’ “cards” and like or dislike them. If two users like each other, they are connected.
The default version of the app had thumbs-up and thumbs-down icons for like and dislike. The team wanted to test a change they believed would increase engagement by making the like and dislike buttons more emotionally expressive. They had seen a similar app use heart and x icons, so they believed similar icons would improve clicks, and they set up an A/B test to find out.
Surprisingly, the heart and x icons reduced like-button clicks by 6.0% and dislike-button clicks by 4.3%. These results came as a complete surprise to the team, which had expected the A/B test to confirm its hypothesis. It seemed to make sense that a heart, rather than a thumb, would better represent the idea of finding love.
The customer’s team believes the heart actually signaled a level of commitment to a potential match that Asian users reacted negatively to. Tapping a heart symbolizes love for a stranger, while a thumbs-up icon simply means you approve of the match.
Instead of copying other apps, use them as a source of test ideas. Borrow ideas and gather customer feedback to tailor tests to your own app. Then use A/B testing to validate those ideas and implement the winners.
5. Testing too many variables at once
A very common temptation is for teams to test multiple variables at once to speed up the testing process. Unfortunately, this almost always has the opposite effect.
The problem lies in how users are split. In an A/B test, you need enough participants in each group to get a statistically significant result. If you test more than one variable at a time, the number of groups grows exponentially, because you need one for every possible combination. Such tests are likely to take much longer to reach statistical significance, so it will take much longer to learn anything useful from them.
Instead of testing multiple variables at once, make only one change per test. The test will take much less time and give you valuable insight into how that change affects user behavior. This has a huge advantage: you can take what you learn from one test and apply it to all future tests. By making small, iterative changes through testing, you learn more about your customers and can use that knowledge to compound your results.
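To see how quickly the number of groups grows, the sketch below enumerates every combination for a hypothetical test. The variable names, their options, and the figure of 1,000 users per group are made up for illustration.

```python
from itertools import product
from math import prod

# Hypothetical variables a team might be tempted to test together.
VARIABLES = {
    "button_color": ["blue", "green"],
    "headline": ["short", "long"],
    "signup_flow": ["email", "social"],
}


def all_groups(variables):
    """List every combination of options, one dict per test group."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


def users_needed(variables, users_per_group):
    """Total users required if every group needs the same sample size."""
    groups = prod(len(options) for options in variables.values())
    return groups * users_per_group


print(len(all_groups(VARIABLES)), "groups")
print(users_needed(VARIABLES, 1000), "users at 1,000 per group")
```

Three variables with two options each already produce eight groups, so a test that needs 1,000 users per group needs 8,000 users in total, compared with 2,000 when a single variable is tested.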
6. Giving up after a failed mobile A/B test
Not every test will give you great results to brag about. Mobile A/B testing is not a magic solution that spits out amazing statistics every time it is run. Sometimes you will see only marginal gains. Other times, your key metrics will drop. That doesn’t mean you’ve failed; it just means you need to take what you’ve learned and rethink your hypothesis.
If a change does not produce the expected results, ask yourself and your team why, and then proceed accordingly. Even more important is learning from your mistakes. Our failures often teach us much more than our successes. If a test hypothesis does not play out as expected, it may reveal underlying assumptions that you or your team are making.
One of our customers, a restaurant reservation app, wanted to display restaurants’ deals more prominently. They tested showing discounts next to search results and found that the change actually reduced the number of bookings and lowered user retention.
Through testing, they discovered something very important: users trusted the app to be impartial when returning results. Once offers and discounts were added, users felt the app had lost its editorial integrity. The team took this insight back to the drawing board and used it to design another test, which improved results by 28%.
While not every test will give you great results, a major benefit of running tests is that they teach you what works and what doesn’t, and help you understand your users better.
While mobile A/B testing can be a powerful tool for optimizing apps, you want to make sure you and your team don’t fall victim to these common mistakes. Now that you know what to watch for, you can move forward with confidence and use A/B testing to optimize your app and delight your customers.