What is Bayes' formula and what do you use it for?

By Indeed Editorial Team

Updated 30 September 2022

Published 9 May 2022

The Indeed Editorial Team comprises a diverse and talented team of writers, researchers and subject matter experts equipped with Indeed's data and insights to deliver useful tips to help guide your career journey.

In some professions and processes, such as machine learning, it's necessary to predict the probability of something happening to make optimal decisions. Bayes' formula is a method that professionals and AI can use to calculate probability. Understanding more about this formula can be beneficial, especially if you're interested in machine learning or building a career in statistics. In this article, we discuss what Bayes' formula is, why it's important, which areas use it and how to use it to calculate probability.

What is Bayes' formula?

Bayes' formula is a mathematical equation for predicting the probability of something occurring. The formula is significant within data science and machine learning and sometimes considered to be the most important formula for data scientists. The formula is a mathematical representation of Bayes' theorem. Bayes' theorem states that the conditional probability of one event, based on the occurrence of another event, is equal to the probability of the second event, given that the second event multiplies the probability of the first event. This is conditional probability.

The formula focuses on describing probability using existing knowledge about relevant conditions as a basis. It's expressed as P(A|B) = P(B|A) x P(A) / P(B). You express the final result as a percentage, which indicates the probability of an event happening.

Why is this formula significant?

Bayes' theorem can play an important part in developing more sophisticated machine learning and artificial intelligence. This type of technology is becoming increasingly prevalent in various aspects of life and is therefore of growing interest to professionals. Conditional probabilities are important for obtaining correct estimations and predictions in machine learning, and the formula can produce more accurate outcomes when it accounts for extra conditions. In the context of machine learning, this means adding more data.

A practical use of Bayes' theorem regarding machine learning involves the development of spam filters for an email client. The system can identify language patterns in emails that users mark as spam, for example, by identifying words like 'free'. By calculating the percentage of spam emails and the number of non-spam emails that include specific language, the system can use the formula to calculate the probability of emails containing certain words being spam.

Related: What does a machine learning engineer do? (With skills)

Which areas of work use this formula?

The formula's significance within machine learning means it can be relevant in a variety of industries and sectors. Artificial intelligence might use the formula, but sometimes professionals use it themselves to support their work. They might do this through the use of AI tools or by performing manual calculations. Industries frequently have different uses for the formula, such as:

Medicine and public health

Medical professionals and researchers managing clinical trials might use the formula to understand the likelihood of a disease or condition occurring. This can be useful for understanding the risk of someone developing a genetic condition, for example. It's also useful for predicting the spread of infectious diseases and public health outbreaks. This helps public health officials provide information on how to reduce the spread of disease. Another use of the formula within medicine is for pharmaceutical experts to predict how effective a treatment is likely to be for a particular condition. This helps to make medicines more effective.

Engineering

Structural engineers might use the formula to test ideas for structures or new processes. They can also use it to assess the probability of achieving the right results, given all of the conditions that influence the outcome. It can also be a useful tool for engineers to explore the implications of costs, safety requirements and other influences.

Finance

Financial specialists use the formula for making a variety of different predictions. These include predicting how the price of shares is likely to change, predicting which investments are likely to be the most beneficial and determining how costs might change. They can then give recommendations to clients or their employers about how to make the most effective use of financial resources and investments. Insurance specialists can also use the formula to predict the premiums and deductibles a client is subject to and give more accurate quotes.

Related: UK finance sector overview and industry guide (with FAQs)

Why are calculating probabilities useful?

Calculating probabilities can be useful in various areas of work and can influence how professionals proceed. For example, medical professionals might use probability calculations to make decisions about the most effective treatment plans for patients. Probabilities can also be useful in various areas of business, as they can help businesses and individual investors make decisions about how to invest. Probabilities can also help with developing strategies and product development. In the context of product development, developers can calculate the probability of a defect or issue occurring with a product and adjust the design to minimise this risk while maintaining quality.

Related: What is a business strategy (and business strategy examples)

Advantages of the formula

Using this formula to calculate probabilities has some useful advantages in business and data science. For example, the results are easy to interpret. The result is a percentage indicating how likely the event is to occur, so it's usually easy to assess whether there's a large or small chance of it happening. When you use the formula correctly, the results are usually an accurate indication.

Another advantage of the formula is that it has flexibility, especially in machine learning. If information changes, such as things like share prices often do, it's possible to adjust the formula accordingly. Incorporating more data as it becomes available can also make new calculations more accurate.

How to use the formula

When you use this formula, try to gather as much statistical information as possible. Doing this at the start of the process usually helps increase the accuracy of your predictions. Gathering the right information and following these steps can help you make useful probability predictions using the formula:

1. Write the formula with its variables

Beginning the process by writing out the formula with its variables gives you a structure for making your calculations. Here is the formula you'd use for your calculations:

P(A|B) = P(B|A) x P(A) / P(B)

Seeing the formula written down can also simplify the meaning of each variable, which makes it easier to complete your calculations. The meanings of each variable are:

  • P(A|B): This is the probability that event A occurs if event B occurs.

  • P(B|A): This is the probability that event B occurs if event A occurs.

  • P(A): This is the probability of event A occurring.

  • P(B): This is the probability of event B occurring.

2. Collect relevant information

You can then collect relevant information that might inform your calculations. Gathering as much relevant statistical information as possible can increase the accuracy of your conclusion. For example, this might be existing probability calculations for similar events. If you want to understand the probability of a company taking a particular action after its share price increases by 10%, you might choose to gather statistics about what other companies have done previously in the same situation. In this example:

  • P(A): This is the chance of share prices increasing by 10%.

  • P(B): This is the chance of the company taking a particular action.

  • P(A|B): This is the probability of share prices rising because the company takes the action.

  • P(B|A): This is the probability of the company taking the action because share prices rise.

Related: 19 jobs that use statistics (plus duties and salaries)

3. Change the variables in the formula

Once you have the right information, you can use this to change the variables in the formula. This gives you a final equation that you can calculate to obtain your result. It's essential that you carefully check the values as you add them to the formula. Mistakes at this stage mean your result is likely to be inaccurate. Having a clear understanding of what each variable in your formula represents can also improve accuracy.

4. Calculate the equation

Once you have a complete equation, you can then use it to calculate probability. You express the final as a percentage that shows the likelihood of an event occurring. You can then use the outcome to make recommendations. Companies might use these findings to adjust business strategies or make financial decisions.

Related:

  • What is conditional probability? (With uses and formula)

  • 9 machine learning interview questions and sample answers

  • What is non-probability sampling? (Uses, types and benefits)

  • The margin of safety formula (definition and how to use)

  • When to use the cost per lead formula (with definition)


Explore more articles