What is a p-value? (With definition and examples)

By Indeed Editorial Team

Published 14 June 2022

The Indeed Editorial Team comprises a diverse and talented team of writers, researchers and subject matter experts equipped with Indeed's data and insights to deliver useful tips to help guide your career journey.

Statisticians, data analysts and other data scientists use the concept of statistical significance to decide whether a result in a sample of data reflects a real effect or could plausibly be due to chance. The probability value, also known as the p-value, is an indicator of statistical significance that many researchers report when publishing their studies because it enables others to draw inferences from the findings. Understanding what a probability value is, how to calculate it and how to interpret it can help you test hypotheses. In this article, we define probability value, discuss its uses, describe how to calculate it and present an example.

What's a p-value?

The p-value is the probability of obtaining a result at least as extreme as the one you observed, assuming that the null hypothesis is correct. In other words, it measures how likely it is that chance alone would produce an outcome this far from what the null hypothesis predicts. Hypothesis testing is the statistical procedure for assessing whether a result from a sample of data is significant.

It's important to note that the probability value is calculated on the assumption that the null hypothesis is true. If the value is small, the observed data would be unlikely under the null hypothesis, so you can reject the null hypothesis in favour of the alternative hypothesis. A large probability value indicates that the data you examined is consistent with the null hypothesis, although it doesn't prove that the null hypothesis is true. When you publish research, you may include a probability value as a measurement to assist readers in evaluating the results for themselves.
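As a rough illustration of "at least as extreme, assuming the null hypothesis is true", the sketch below computes an exact two-sided probability value for a coin-flip experiment. The scenario (a fair coin as the null hypothesis, nine heads in ten flips) and the function name are assumptions made for this example only.

```python
from math import comb


def binomial_two_sided_p(heads: int, flips: int) -> float:
    """Exact two-sided p-value when the null hypothesis is a fair coin."""
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Count every outcome that lies at least as far from the expected
    # number of heads as the observed outcome does.
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    # Under a fair coin, each of the 2**flips sequences is equally likely.
    return extreme / 2 ** flips


# Hypothetical observation: 9 heads in 10 flips.
p = binomial_two_sided_p(heads=9, flips=10)
print(round(p, 4))  # 22 of the 1,024 equally likely sequences, about 0.0215
```

A value this small says that nine or more heads (or one or fewer) would be unusual for a fair coin; it doesn't say how likely it is that the coin is fair.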

Formulating a null hypothesis

To formulate a null hypothesis, you start from the assumption that there is no statistically significant effect or difference in the data you're studying. For example, to test whether the average body temperature of groups A and B is the same, you could develop a null hypothesis stating that there is no difference between the mean body temperatures of groups A and B. A finding is considered significant when it's very improbable that it would have happened if the null hypothesis were true.

When a test finds a statistically significant difference, this is referred to as rejecting the null hypothesis. The data then favours the alternative hypothesis, which states the opposite of the null hypothesis. In the body temperature example, the alternative hypothesis claims that there is a difference between the average temperatures of groups A and B, while the null hypothesis indicates that there is no difference between them.
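One way to see what a null hypothesis of "no difference between groups A and B" implies is an exact permutation test, sketched below. This is a different technique from the t-test this article describes, and the temperature readings are invented for illustration. The idea is that if group labels don't matter, every way of splitting the pooled readings into two groups is equally plausible, so you can count how many splits produce a difference in means at least as large as the observed one.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    as_extreme = 0
    n_splits = 0
    # Enumerate every way to relabel n_a of the pooled readings as "group A".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        n_splits += 1
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / n_splits


# Made-up body temperatures in degrees Celsius.
group_a = [36.5, 36.7, 36.6, 36.8, 36.4]
group_b = [37.1, 37.3, 37.0, 37.2, 37.4]
print(permutation_p_value(group_a, group_b))
```

Exhaustive enumeration is only practical for small groups; with five readings per group there are 252 possible splits.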

What are the uses of a probability value?

Statisticians, data analysts and companies use the probability value to assess whether an observed result differs from what they'd expect by chance. This information can be helpful when a company is deciding whether a change in a metric, such as output or profit, reflects a real effect. It can also help data analysts decide whether a result is statistically significant and whether it warrants further statistical analysis. You may calculate the probability value with two sorts of tests: the one-sided test and the two-sided test.

A one-sided test is used in situations where only a change in one direction matters and a change in the opposite direction is of little or no significance to your study. This is rare, and a two-sided test is usually used in its place. Using the two-sided approach, you can detect a significant change in either direction, which suits an alternative hypothesis that is broad rather than specific. The majority of professionals use this strategy to make sure that they account for significant changes in both directions.
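The difference between one-sided and two-sided probability values can be sketched with a few lines of Python. For simplicity, this example assumes the test statistic follows a standard normal distribution (a z-test) rather than a t-distribution; the function name and the value 1.96 are illustrative choices.

```python
from statistics import NormalDist


def p_values_from_z(z: float) -> dict:
    """One-sided and two-sided p-values for a z statistic."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    less = std_normal.cdf(z)   # chance of a statistic this low or lower
    greater = 1 - less         # chance of a statistic this high or higher
    # A two-sided test counts extreme results in both directions.
    two_sided = 2 * min(less, greater)
    return {"less": less, "greater": greater, "two_sided": two_sided}


result = p_values_from_z(1.96)
print(result)  # "greater" is roughly 0.025 and "two_sided" roughly 0.05
```

For the same data, the two-sided value is double the smaller one-sided value, which is why a two-sided test needs stronger evidence to reach the same significance level.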

How to calculate the probability value

Learning how to calculate the probability value can help you determine whether to reject the null hypothesis. The following steps show how to calculate the probability value for a data sample:

1. State the null and alternative hypotheses in detail

The first step in determining the probability value of a sample is to examine your data and develop both a null and an alternative hypothesis for the situation. For example, the null hypothesis can claim that the population mean is equal to 10, and the alternative hypothesis is that the population mean does not equal 10.

2. Use the formula to calculate the test statistic

You may compute the value of your test statistic ('t') from your data set after you've stated both of your hypotheses. This statistic may be calculated using the following formula:

Test statistic = (sample mean - hypothesised mean) / (sample standard deviation / square root of sample size)

In mathematics, the standard deviation is a measure of the variability in a collection of data points. It may also assist you in determining how near a data point is to the mean when compared to other data points in the sample.
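The calculation in this step can be sketched as follows, using the standard one-sample form of the statistic: the gap between the sample mean and the hypothesised mean, divided by the standard deviation over the square root of the sample size. The sample values and the hypothesised mean of 10 are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, hypothesised_mean):
    """t statistic for testing whether a sample's mean differs from a target."""
    n = len(sample)
    # stdev() is the sample standard deviation (divides by n - 1).
    standard_error = stdev(sample) / sqrt(n)
    return (mean(sample) - hypothesised_mean) / standard_error


# Hypothetical measurements; the null hypothesis is that the true mean is 10.
sample = [12, 9, 11, 13, 10, 12, 11, 14]
t = one_sample_t(sample, hypothesised_mean=10)
print(round(t, 4))  # 2.6458
```

Here the sample mean is 11.5, so the statistic says the sample mean sits about 2.6 standard errors above the hypothesised value.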

3. Use a t-distribution table to get the probability value for the test statistic

Once you have computed the value of the test statistic, you can find the corresponding probability value by consulting a t-distribution table. These tables are typically organised around three main significance levels: 0.01, 0.05 and 0.1. A significance level is the threshold below which you treat a probability value as evidence against the null hypothesis.

To make use of the t-distribution table, first work out the degrees of freedom by taking your sample size 'n' and subtracting one from it. In the row for that number of degrees of freedom, find the critical values that your test statistic falls between; the significance levels heading those columns give the range of the probability value. If your test is one-tailed, this is the probability value of your data. If your test is two-tailed, which is more common, you get your probability value by multiplying the one-tailed value by two.
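A table only brackets the probability value between two significance levels. As a sketch of how the underlying number could be computed directly, the code below integrates the t-distribution's density numerically with Simpson's rule, using only Python's standard library. The function names, the step count and the example statistic (2.6458 with 7 degrees of freedom) are assumptions for illustration; statistical packages provide more robust routines.

```python
from math import gamma, pi, sqrt


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_test_p_value(t_stat: float, df: int, two_tailed: bool = True,
                   steps: int = 2000) -> float:
    """p-value for a t statistic; steps must be even for Simpson's rule."""
    upper = abs(t_stat)
    h = upper / steps
    # Simpson's rule for the area under the density between 0 and |t|.
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # The density is symmetric, so the tail beyond |t| is 0.5 minus that area.
    tail = 0.5 - area
    return 2 * tail if two_tailed else tail


# Hypothetical result: t = 2.6458 from a sample of 8, so df = 8 - 1 = 7.
p = t_test_p_value(2.6458, df=7)
print(p)  # falls between 0.02 and 0.05, matching the table's bracket
```

The one-tailed result here is the tail in the direction of the observed statistic. Because `math.gamma` overflows for large arguments, this sketch only suits small to moderate sample sizes.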

Is a probability value of 0.05 significant?

By convention, statisticians interpret a probability value of less than 0.05 as statistically significant, so the null hypothesis can be rejected at the 0.05 significance level. Conversely, if the probability value is greater than 0.05, you don't reject the null hypothesis.
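The decision rule can be written as a small helper. This is a minimal sketch: the function name and return strings are illustrative, and treating a value of exactly 0.05 as not significant is an assumption, since conventions differ at the boundary.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the conventional significance rule to a p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # Assumption: significance requires the p-value to be strictly below alpha.
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


print(decide(0.03))  # reject the null hypothesis
print(decide(0.20))  # fail to reject the null hypothesis
```

Choosing the significance level before looking at the data keeps the rule from being adjusted to fit the result.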

What's the meaning of a 0.001 probability value?

Assuming the null hypothesis is true, a probability value of 0.001 means there is a one in 1,000 chance of obtaining outcomes at least as extreme as the ones observed. A statistician who sees this value concludes one of two things: either they observed an extremely unusual result or the null hypothesis is false. Because the first is so unlikely, they typically reject the null hypothesis.

Example of a probability value

Here's an example of a probability value being used in a business scenario:

An investor asserts that the performance of their investment portfolio is comparable to that of the S&P 500 Index (a stock market index). The investor uses a two-tailed test to assess whether this is the case. The null hypothesis states that the portfolio's returns are equal to the returns of the S&P 500 over a specified period. The alternative hypothesis states that the portfolio's returns and the S&P 500's returns are not equal.

If the investor conducted a one-tailed test instead, the alternative hypothesis would state that the portfolio's returns are either less than or greater than the S&P 500's returns, depending on the direction being tested. The hypothesis test uses the probability value to measure the amount of evidence available to support rejecting the null hypothesis. The lower the probability value, the more evidence there is against the null hypothesis. As a result, if the investor discovers that the probability value is less than 0.001, there is compelling evidence against the null hypothesis. The investor may reasonably conclude that the portfolio's returns and the S&P 500's returns are not equal.
