Datatron Blog

Stay Current with AI/ML

Blog Category

A Simple Guide to A/B Testing for Data Science

ab testing
Picture created by myself, Terence Shin, and Freepik

Check out my article ‘Hypothesis Testing Explained as Simply as Possible’ if you don’t already know what hypothesis testing is first!

A/B testing is one of the most important concepts in data science and in the tech world in general because it is one of the most effective methods in making conclusions about any hypothesis one may have. It’s important that you understand what A/B testing is and how it generally works.

Table of Contents

  1. What is A/B testing?
  2. Why is it important to know?
  3. How to conduct a standard A/B test

What is A/B testing?

A/B testing in its simplest sense is an experiment on two variants to see which performs better based on a given metric. Typically, two consumer groups are exposed to two different versions of the same thing to see if there is a significant difference in metrics like sessions, click-through rate, and/or conversions.

Using the visual above as an example, we could randomly split our customer base into two groups, a control group and a variant group. Then, we can expose our variant group with a red website banner and see if we get a significant increase in conversions. It’s important to note that all other variables need to be held constant when performing an A/B test.

Getting more technical, A/B testing is a form of statistical and two-sample hypothesis testing. Statistical hypothesis testing is a method in which a sample dataset is compared against the population data. Two-sample hypothesis testing is a method in determining whether the differences between the two samples are statistically significant or not.

Why is it important to know?

It’s important to know what A/B testing is and how it works because it’s the best method in quantifying changes in a product or changes in a marketing strategy. And this is becoming increasingly important in a data-driven world where business decisions need to be back by facts and numbers.

How to conduct a standard A/B test

1. Formulate your hypothesis

Before conducting an A/B testing, you want to state your null hypothesis and alternative hypothesis:

The null hypothesis is one that states that sample observations result purely from chance. From an A/B test perspective, the null hypothesis states that there is no difference between the control and variant group.

The alternative hypothesis is one that states that sample observations are influenced by some non-random cause. From an A/B test perspective, the alternative hypothesis states that there is a difference between the control and variant group.

When developing your null and alternative hypotheses, it’s recommended that you follow a PICOT format. Picot stands for:

  • Population: the group of people that participate in the experiment
  • Intervention: refers to the new variant in the study
  • Comparison: refers to what you plan on using as a reference group to compare against your intervention
  • Outcome: represents what result you plan on measuring
  • Time: refers to the duration of the experience (when and how long the data is collected)

Example: “Intervention A will improve anxiety (as measured by the mean change from baseline in the HADS anxiety subscale) in cancer patients with clinical levels of anxiety at 3 months compared to the control intervention.”

Does it follow the PICOT criteria?

  • Population: Cancer patients with clinical levels of anxiety
  • Intervention: Intervention A
  • Comparison: the control intervention
  • Outcome: improve anxiety as measured by the mean change from baseline in the HADS anxiety subscale
  • Time: at 3 months compared to the control intervention.

Yes it does — therefore, this is an example of a strong hypothesis test.

2. Create your control group and test group

Once you determine your null and alternative hypothesis, the next step is to create your control and test (variant) group. There are two important concepts to consider in this step, random samplings and sample size.

Random Sampling
Random sampling is a technique where each sample in a population has an equal chance of being chosen. Random sampling is important in hypothesis testing because it eliminates sampling bias, and it’s important to eliminate bias because you want the results of your A/B test to be representative of the entire population rather than the sample itself.

Sample Size
It’s essential that you determine the minimum sample size for your A/B test prior to conducting it so that you can eliminate under coverage bias, bias from sampling too few observations. There are plenty of online calculators that you can use to calculate the sample size given these three inputs.

3. Conduct the test, compare the results, and reject or do not reject the null hypothesis


Once you conduct your experiment and collect your data, you want to determine if the difference between your control group and variant group is statistically significant. There are a few steps in determining this:

  • First, you want to set your alpha, the probability of making a type 1 error. Typically the alpha is set at 5% or 0.05
  • Next, you want to determine the probability value (p-value) by first calculating the t-statistic using the formula above.
  • Lastly, compare the p-value to the alpha. If the p-value is greater than the alpha, do not reject the null!

If this doesn’t make sense to you, I would take the time to learn more about hypothesis testing here!

For more articles like this one, check out

Thanks for reading!

Here at Datatron, we offer a platform to govern and manage all of your Machine Learning, Artificial Intelligence, and Data Science Models in Production. Additionally, we help you automate, optimize, and accelerate your ML models to ensure they are running smoothly and efficiently in production — To learn more about our services be sure to Book a Demo.


MLOps Maturity Model [M3]

MLOps Maturity Model Infographic Thumbnail

In this Infographic, you’ll learn:

  • The FIVE stages of maturity in Machine Learning Operations, i.e., MLOps
  • Why DevOps is not the same for ML as it is for software, and why MLOps is needed
  • The ideal teams, stacks, and features to look for to reach Maturity in your ML program

Learn why some companies succeed, while others struggle in AI/ML by seeing the signatures of success across Ideation, Team, Stack, Process, & Outcome in this informative (Hi-res) Infographic.

Infographic: MLOps Maturity Model [M3]


Datatron 3.0 Product Release – Enterprise Feature Enhancements

Streamlined features that improve operational workflows, enforce enterprise-grade security, and simplify troubleshooting.

Get Whitepaper


Datatron 3.0 Product Release – Simplified Kubernetes Management

Eliminate the complexities of Kubernetes management and deploy new virtual private cloud environments in just a few clicks.

Get Whitepaper


Datatron 3.0 Product Release – JupyterHub Integration

Datatron continues to lead the way with simplifying data scientist workflows and delivering value from AI/ML with the new JupyterHub integration as part of the “Datatron 3.0” product release.

Get Whitepaper


Success Story: Global Bank Monitors 1,000’s of Models On Datatron

A top global bank was looking for an AI Governance platform and discovered so much more. With Datatron, executives can now easily monitor the “Health” of thousands of models, data scientists decreased the time required to identify issues with models and uncover the root cause by 65%, and each BU decreased their audit reporting time by 65%.

Get Whitepaper


Success Story: Domino’s 10x Model Deployment Velocity

Domino’s was looking for an AI Governance platform and discovered so much more. With Datatron, Domino’s accelerated model deployment 10x, and achieved 80% more risk-free model deployments, all while giving executives a global view of models and helping them to understand the KPI metrics achieved to increase ROI.

Get Whitepaper


5 Reasons Your AI/ML Models are Stuck in the Lab

AI/ML Executive need more ROI from AI/ML? Data Scientist want to get more models into production? ML DevOps Engineer/IT want an easier way to manage multiple models. Learn how enterprises with mature AI/ML programs overcome obstacles to operationalize more models with greater ease and less manpower.

Get Whitepaper