Testing is how you determine effectiveness. Whether you work as a data scientist, statistician, or software developer, you must measure performance to ensure quality. Without tests, you could deploy flawed code, features, or analyses.

With that in mind, the use cases of testing are endless. Machine learning models need statistical tests. Data analysis involves statistical tests to validate assumptions. Optimization of any kind requires evaluation. You even need to test the strength of your hypothesis before you begin an inquiry.

Let's explore two inferential statistical tests: the Z-test and the T-test. That way you can understand their differences, their distinct purposes, and when to use a Z-test vs a T-test.

## What is hypothesis testing?

To start, imagine you have a good idea. At the moment of inception, you have no data to back up your idea. It is an unformed thought. But the idea is an excellent starting point that can launch a full investigation. We consider this starting point a hypothesis.

But what if your hypothesis is off-base? You don’t want to dive into a full-scale search if it is a pointless chase with no reward. That is a waste of resources. You need to determine if you have a workable hypothesis.

Enter hypothesis testing: a statistical method used to assess the viability of a hypothesis. It determines whether there is sufficient evidence to support your idea. If the result shows little to no significance, your hypothesis is not very plausible.

To confirm the validity of a hypothesis, you compare it against the status quo, known as the null hypothesis. Your idea, the claim that something differs from normal conditions, is known as the alternative hypothesis. The two are mutually exclusive: under a given set of parameters, only one of the null and alternative hypotheses can be true.

Such a comparison lets you determine validity. You can compare and contrast conditions to draw meaningful conclusions. Whichever hypothesis the evidence statistically supports is the one you treat as plausible.

## What is a Z-test?

A Z-test is a statistical test built around the Z statistic. It works with two quantities: the sample mean and the population mean. It tests whether these two are significantly different.

With a Z-test, you know the population standard deviation. That known value ensures statistical accuracy as you compare one group (the sample mean) against the second group (the population mean), because the sampling distribution of the mean can be treated as a standard normal distribution. In addition, a defining characteristic of a Z-test is that it works with large sample sizes (typically more than 30, so the sampling distribution approaches normality as described by the central limit theorem). These are two crucial criteria for using a Z-test.

Within hypothesis testing, your null hypothesis states there is no difference between the two groups your Z-test will compare. Your alternative hypothesis will state there is a difference that your Z-test will expose.

### How to perform a Z-test

A Z-test occurs in the following standard format:

- **Formulate your hypothesis:** First, define the parameters of your null and alternative hypothesis.
- **Choose a critical value:** Second, determine what you consider a viable difference between your two groups. This threshold determines when you can reject the null hypothesis. Common significance levels are 0.01 (1%) and 0.05 (5%), values found to best balance Type I and Type II errors.
- **Collect samples:** Obtain the needed data. The sample must be large enough and random.
- **Calculate your Z-score:** Input your data into the one-sample Z-test formula, *Z = (x̄ − μ) / (σ / √n)*, where *Z* is the standard score, *x̄* is the sample mean, *μ* is the population mean, *σ* is the population standard deviation, and *n* is the sample size.

- **Compare:** If the test statistic is greater than the critical value, you have achieved statistical significance. The sample mean is different enough that you can reject the null hypothesis. Your alternative hypothesis (something other than the status quo) is at work, and that's worth investigating.

## Examples of a Z-test

There are different variations of a Z-test. Let's explore examples of one-sample and two-sample Z-tests.

### One-sample Z-test

A one-sample Z-test looks for either an increase or a decrease. There is one sample group involved, taken from a population. We want to see if there is a difference between those two means.

For example, consider a school principal who believes their students' IQ is higher than the state average. The state average is 100 points (population mean), give or take 20 (the population standard deviation). To prove this hypothesis, the principal takes 50 students (the sample size) and finds their IQ scores. To their delight, they earn an average of 110.

But does the difference offer any statistical value? The principal then plugs the numbers into a Z-test. Any Z-score greater than the critical value indicates sufficient significance, supporting the claim that the students have an above-average IQ.
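As a sketch of the arithmetic, here is a minimal one-sample Z-test in Python using only the numbers from the example above:

```python
import math

# Figures from the scenario above
pop_mean = 100      # state average IQ (population mean)
pop_std = 20        # population standard deviation
n = 50              # sample size
sample_mean = 110   # observed sample mean

# One-sample Z statistic: (x-bar − mu) / (sigma / sqrt(n))
z = (sample_mean - pop_mean) / (pop_std / math.sqrt(n))

# One-tailed critical value at the 5% significance level
z_critical = 1.645

print(round(z, 2))     # 3.54
print(z > z_critical)  # True: reject the null hypothesis
```

Since 3.54 clears the 1.645 threshold comfortably, the principal's claim is statistically supported at the 5% level.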

### Two-sample Z-test

A two-sample Z-test compares the means of two independent samples. It determines whether there is a significant difference between the two groups.

For example, our principal wants to compare their students' IQ scores to those of the school across the street. They believe their students' average IQ is higher. They don't need to know the exact numerical increase or decrease. All they want is proof that their students' average scores are higher than the other group's.

To confirm the validity of this hypothesis, the principal will search for statistical significance. They can take a 50-student sample from their school and a 50-student sample from the rival school. Now in possession of both sample groups' average IQs (and the population standard deviation), they hope to find values that are not equal, and unequal by a significant amount.

If the test statistic comes in below the critical value, the differences are negligible. There is not enough evidence to say the hypothesis is worth exploring, so the null hypothesis is maintained. The principal would not have enough proof that the IQ levels at the two schools differ.
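To make that concrete, here is a minimal two-sample sketch. The sample means (110 and 104) are hypothetical figures for illustration, not values from the example:

```python
import math

# Hypothetical inputs: two independent 50-student samples,
# with a known population standard deviation of 20 at both schools
n1, n2 = 50, 50
mean1, mean2 = 110.0, 104.0   # assumed sample means for illustration
sigma1, sigma2 = 20.0, 20.0

# Two-sample Z statistic: (x-bar1 − x-bar2) / sqrt(sigma1^2/n1 + sigma2^2/n2)
z = (mean1 - mean2) / math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

z_critical = 1.96  # two-tailed critical value at the 5% level

print(round(z, 2))     # 1.5
print(z > z_critical)  # False: the difference is not significant
```

With these assumed numbers, a 6-point gap between schools is not enough to reject the null hypothesis at the 5% level.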

## What is a T-test?

A T-test performs the same crucial function as a Z-test: determining whether there is a difference between the means of two groups. If there is a significant difference, you have achieved statistical validity for your hypothesis.

However, a T-test involves a different set of conditions. Most importantly, a T-test applies when you do not know the population standard deviation and must estimate it from the sample. Instead of the standard normal distribution, you work with the T-distribution, which has heavier tails to account for that extra uncertainty. There is also an expectation that you do not possess all the data in a given scenario.

These conditions better match reality, as it is often hard to collect data from entire populations or always obtain a standard normal distribution. That is why T-tests are more widely applicable than Z-tests, though they operate with less precision.

### How to perform a T-test

A T-test occurs in the following standard format:

- **Formulate your hypothesis:** First, define the parameters of your null and alternative hypothesis.
- **Choose a critical value:** Like a Z-test, determine what you consider a viable difference between your two groups.
- **Collect data:** Obtain the needed data. One key difference is that a T-test depends on the degrees of freedom of the samples, so try to define the typical values and range of values in each group.
- **Calculate your T-score:** Input your data into the T-test formula you chose. The one-sample formula is *t = (x̄ − μ) / (s / √n)*, where *x̄* is the sample mean, *μ* is the population mean, *s* is the sample standard deviation, and *n* is the sample size.

- **Compare:** If the test statistic is greater than the critical value, you have achieved statistical significance. The sample mean is far enough from the population mean that you likely have a useful hypothesis.

## Examples of a T-test

There are several different kinds of T-tests as well. Let's go through the standard one-sample and two-sample T-tests.

### One-sample T-test

A one-sample T-test looks for an increase or decrease compared to a population mean.

For example, your company just went through sales training. Now, the manager wants to know if the training helped improve sales.

Previous company sales data shows an average of $100 on each transaction from all workers. The null hypothesis would be no change. The alternative hypothesis (which you hope is significant) is that there is an improvement.

To test for significance, you take the sales average of 20 salespeople. That is the only available data; you have no figures from nationwide stores. The average of that sample over the past month is $130. Since the population standard deviation is unknown, you estimate it from the sample standard deviation.

With this set of factors, you can calculate your T-score with a T-test. You compare the sample result to the critical value. In addition, you assess it against the number of degrees of freedom. Since we know with smaller sample data sizes there is greater uncertainty, we allow more room for our data to vary.

After comparing, we may find a significant result. That means the data is strong enough to support our hypothesis that the sales training likely impacted sales. Of course, this is an estimate, as we only assessed one factor with a small group. Sales could have risen for numerous other reasons. But under our set of assumptions, the hypothesis holds.
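Here is a minimal sketch of this calculation, assuming a sample standard deviation of $45 (a hypothetical figure, since the example does not state one):

```python
import math

pop_mean = 100.0     # historical average transaction ($)
sample_mean = 130.0  # average across the 20-person sample ($)
sample_std = 45.0    # assumed sample standard deviation ($)
n = 20

# One-sample T statistic: (x-bar − mu) / (s / sqrt(n))
t = (sample_mean - pop_mean) / (sample_std / math.sqrt(n))

# One-tailed critical value for df = n − 1 = 19 at the 5% level (t-table)
t_critical = 1.729

print(round(t, 2))     # 2.98
print(t > t_critical)  # True: the improvement looks significant
```

Note that the critical value comes from a t-table at 19 degrees of freedom rather than from the standard normal distribution, which is exactly the extra allowance for small-sample uncertainty described above.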

### Two-sample T-test

A two-sample T-test parallels a two-sample Z-test: it compares the means of two independent samples to determine whether they differ significantly. The difference is that the population standard deviations are unknown and must be estimated from the samples.

For example, consider English and non-native speakers. We want to see the effect of maternal language on test scores inside a country. To do that, we will offer both groups a reading test and compare those scores to the average.

Of course, the mean of an entire population of language speakers is impossible to obtain. Still, we can make some assumptions and work with smaller samples. We take 15 native English speakers and 15 non-native speakers and collect their results. We also decide on a critical value for the test. If the difference between the groups' average scores does not exceed that critical threshold, our assumption fails: there is no significant difference between the groups, so the impact of maternal language is not worth investigating.
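A minimal sketch of a pooled two-sample T-test for this scenario, with hypothetical score means and standard deviations (the example does not provide actual figures):

```python
import math

# Hypothetical summary statistics for the two 15-student samples
n1, n2 = 15, 15
mean1, std1 = 72.0, 8.0    # assumed native-speaker scores
mean2, std2 = 66.0, 10.0   # assumed non-native-speaker scores

# Pooled variance for two independent samples with similar spread
pooled_var = ((n1 - 1) * std1**2 + (n2 - 1) * std2**2) / (n1 + n2 - 2)
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Two-sample T statistic, with df = n1 + n2 - 2 = 28
t = (mean1 - mean2) / std_err

# Two-tailed critical value for df = 28 at the 5% level (t-table)
t_critical = 2.048

print(round(t, 2))     # 1.81
print(t > t_critical)  # False: no significant difference
```

With these assumed numbers, a 6-point gap across two small samples falls short of significance, matching the "assumption failed" outcome described above.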

## How to know when to use Z-test vs T-test

Both a Z-test and a T-test validate a hypothesis. Both are parametric tests that rely on assumptions. The key difference between a Z-test and a T-test lies in those assumptions, chiefly whether the population variance is known.

Key differences about the data used result in different applications. You want to use the appropriate tool, otherwise you won’t draw valid conclusions from your data.

So when should you use a Z-test vs a T-test? Here are some factors to consider:

- **Sample size:** If the available sample size is small, opt for a T-test. Small samples are more variable, so the greater spread and error estimation built into T-tests is ideal.
- **Knowledge of the population standard deviation:** Z-tests are more precise and often simpler to execute. So if you know the population standard deviation, use a Z-test.
- **Test purpose:** If you are assessing the validity of a mean, a T-test is the best choice. If you are working with a hypothesized population proportion, go for a Z-test.
- **Assumption of normality:** A Z-test assumes a normal distribution, which does not apply to all real-world scenarios. If you hope to validate a hypothesis that is not well-defined, opt for a T-test instead.
- **Type of data:** You can only work within the constraints of the available data. The more information the better, but that is often not possible given testing and collecting conditions. If you have limited data describing means between groups, opt for a T-test. If you have large data sets comparing means between populations, you can use a Z-test.
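The first two factors can be condensed into a rough rule of thumb. Here is an illustrative helper (a simplification for intuition, not a substitute for checking all of the conditions above):

```python
def choose_test(sample_size: int, population_std_known: bool) -> str:
    """Rough rule of thumb for picking between a Z-test and a T-test."""
    # A Z-test needs a known population standard deviation AND a
    # large sample (n >= 30, via the central limit theorem).
    if population_std_known and sample_size >= 30:
        return "z-test"
    # Otherwise the T-distribution's wider tails account for the
    # extra uncertainty of estimating the spread from the sample.
    return "t-test"

print(choose_test(50, population_std_known=True))   # z-test
print(choose_test(20, population_std_known=False))  # t-test
```

Note that either failing condition, a small sample or an unknown population standard deviation, is enough to push you toward the T-test.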

## Difference between Z-test and T-test: a comparative table

Knowing the key differences with each statistical test makes selecting the right tool far easier. Here is a table that can help you compare:

| | T-test | Z-test |
| --- | --- | --- |
| Purpose | Compare means of small samples (n < 30) | Compare means of large samples (n ≥ 30) |
| Assumptions | Approximately normally distributed data | Normally distributed data, known population standard deviation |
| Population standard deviation | Unknown | Known |
| Sample size | Small (n < 30) | Large (n ≥ 30) |
| Test statistic | T-distribution | Standard normal distribution (Z-distribution) |
| Degrees of freedom | n − 1 (one-sample) or n1 + n2 − 2 (two-sample) | Not applicable |
| Use case | Small sample analysis, comparing means between groups | Large sample analysis, population mean comparisons |
| One-sample vs. two-sample | Both | Both |
| Data requirement | Raw data | Raw data |
| Complexity | Relatively more complex | Relatively simpler |

## Conclusion

Statistical testing lets you determine the validity of a hypothesis. You discover validity by determining if there is a significant difference between your hypothesis and the status quo. If there is, you have a possible idea worth exploring.

That process has numerous applications in computer science and data analysis. You might want to determine the performance of an app with an A/B test. Or you might need to check whether an application stays within defined limits and compare performance metrics. Z-tests and T-tests can show whether there is significant evidence in each of these scenarios. With that information, you can take the appropriate measures to fix bugs or optimize processes.

Z-tests and T-tests are helpful tools, especially for hypothesis testing. For data engineers of the future, knowledge of statistical testing will only help your work and overall career trajectory.

**Are you a data scientist looking for a job? Check out our remote data scientist jobs available.**

The EPAM Anywhere Editorial Team is an international collective of senior software engineers, managers and communications professionals who create, review and share their insights on technology, career, remote work, and the daily life here at Anywhere.
