Bayes’ Theorem (Part 1)

Bayes’ theorem is a formula used to calculate the probability of a hypothesis, based on prior knowledge and new evidence.

It states that when data are assumed to be generated by a given model, the posterior probability of the model given the data is equal to the product of the prior probability of the model and the conditional probability of the data given the model, divided by the overall probability of the data.

In other words, if we start with some knowledge about a hypothesis (the prior) and then observe new evidence, we can update how likely that hypothesis is (the posterior).

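Now we come to one of the most famous and powerful results in probability theory, Bayes' Theorem. The theorem starts with the familiar product rule: the joint probability of A and B equals the conditional probability of A given B times the marginal probability of B.

P(A, B) = P(A|B) P(B)

If we divide both sides of this equation by the probability of B, we get something that's equivalent but looks like this: the conditional probability of A given B equals the joint probability of A and B divided by the marginal probability of B.

P(A|B) = P(A, B) / P(B)

Now, we can substitute the probability of B and A for the probability of A and B because they are equivalent. Then, using the product rule again, we apply it to B and A in the same form that we've written it above.

P(B, A) = P(B|A) P(A)

The probability of A given B is therefore equal to the probability of B given A times the probability of A, divided by the probability of B.

Bayes' Theorem is defined by the equation

P(A|B) = P(B|A) P(A) / P(B)

where:

P(A|B) is the posterior probability of A given B,
P(B|A) is the likelihood, the probability of B given A,
P(A) is the prior probability of A,
P(B) is the marginal probability of B.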
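As a quick numerical check of the derivation above, here is a minimal Python sketch. The joint distribution over A and B is made up purely for illustration, and the helper names (`marginal_a`, `marginal_b`) are assumptions of this sketch rather than anything from the essay. It computes the probability of A given B once directly from the joint probability and once through Bayes' theorem; the two routes should agree.

```python
from fractions import Fraction

# Hypothetical joint probabilities P(A=a, B=b); values chosen only for illustration.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}


def marginal_a(a):
    """P(A=a), obtained by summing the joint over both values of B."""
    return sum(p for (x, _), p in joint.items() if x == a)


def marginal_b(b):
    """P(B=b), obtained by summing the joint over both values of A."""
    return sum(p for (_, y), p in joint.items() if y == b)


# Route 1: product rule rearranged, P(A|B) = P(A, B) / P(B).
direct = joint[(True, True)] / marginal_b(True)

# Route 2: Bayes' theorem, P(A|B) = P(B|A) P(A) / P(B).
p_b_given_a = joint[(True, True)] / marginal_a(True)
via_bayes = p_b_given_a * marginal_a(True) / marginal_b(True)

print(direct, via_bayes)  # 3/5 3/5
```

Exact fractions are used instead of floats so that the equality of the two routes is not blurred by rounding.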

One of the most powerful uses of Bayes' theorem is for inverse probability problems.

Inverse probability problems are those where the answer is the probability that a particular process generated the observed data.

Sometimes we write Ai for the i-th process that might have generated the data we observed. Other times we use the Greek letter theta to represent the different parameters that might be causing the data. Both notations indicate one of the possible processes that could have generated the observed data.

Here is an example.

Suppose we have two urns: one with 20 percent white marbles and one with 10 percent white marbles. Suppose we observe three white marbles in a row being drawn, with replacement, from one or the other of the urns.

We do not know which urn we are observing, but we can calculate the probability that it is Urn 1 and the probability that it is Urn 2. Urn 1 is the process, or equivalently the parameter, for which the probability of white on each individual draw is 0.2. Urn 2 is the process, or the parameter, for which the probability of white on each individual draw is 0.1.

In a standard forward probability problem, we are interested in the probability of a certain observed outcome given a known process. For example, if we know that we are drawing from Urn 1, that's a known process. Then the probability of observing three white marbles in a row is 0.2 times 0.2 times 0.2, or 8 one-thousandths. This would be a conventional probability problem.
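The forward calculation can be sketched in a few lines of Python. The function name `prob_all_white` is an assumption made for this illustration; it simply multiplies the per-draw probability of white by itself once per draw, which is valid because the draws are made with replacement and are therefore independent.

```python
from fractions import Fraction


def prob_all_white(p_white, n_draws):
    """Probability of n_draws white marbles in a row when drawing with replacement."""
    return p_white ** n_draws


# Known process: Urn 1, where each draw is white with probability 0.2.
urn1 = prob_all_white(Fraction(2, 10), 3)

print(urn1, float(urn1))  # 1/125 0.008
```

The fraction 1/125 is the reduced form of 8/1,000.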

In this problem, by contrast, we are given the outcome of an experiment and the process is unknown: we are interested in how likely it is that we observed Urn 1 or Urn 2. This can be written in the terminology of conditional probability.

We are interested in the probability of the process parameter being 0.2 or 0.1, that is, Urn 1 or Urn 2, given that three white marbles have been observed in a row. Naming the urn and naming its parameter value are equivalent.

Bayes' theorem tells us that the probability of Ai given B is equal to the probability of B given Ai times the probability of Ai, divided by the probability of B.

We can use the sum rule and the product rule to break up the probability of B into contributions from the two possible processes. To do so, we multiply the probability of B given each process by the prior probability of that process, and then add up the results for parameter 1 and parameter 2.

Part of Bayes' theorem is the likelihood, which answers the question: what is the probability of the observed data given a particular value of the parameter?

Here B is the observation, and A1 and A2 are the two processes. The likelihood portion of the solution to Bayes' theorem is calculated as follows.

For Urn 1, where each draw has a 20 percent probability of being white, the probability of observing three whites in a row is 8 in 1,000.

In the same way, for Urn 2, where each draw has a 10 percent probability of being white, the probability of three whites in a row is 1 in 1,000.

So that is the likelihood, the probability of the data given a parameter. What do we take for the prior?

The principle of indifference tells us that when we have no basis for choosing between the two urns, we should assign them equal probabilities. Therefore, P(A1) = .5, and P(A2) = .5.

Before any data are observed, we assume that the different parameters are equally probable. This is called the prior probability of the parameters, or of the process. So, we'll examine P(A1|B) = P(B|A1)P(A1)/P(B). The denominator P(B) is equal to P(B, A1) + P(B, A2) by the sum rule, and by the product rule it's equal to P(B|A1)P(A1) + P(B|A2)P(A2).
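The expansion of the probability of B described above can be sketched as follows. The dictionary keys `urn1` and `urn2` and the variable names are assumptions of this sketch; the numbers are the likelihoods and the equal priors stated in the text.

```python
from fractions import Fraction

# Likelihoods: probability of three whites in a row under each urn.
likelihood = {"urn1": Fraction(8, 1000), "urn2": Fraction(1, 1000)}

# Equal priors, following the principle of indifference.
prior = {"urn1": Fraction(1, 2), "urn2": Fraction(1, 2)}

# Product rule gives each joint term P(B|Ai) P(Ai); sum rule adds them to get P(B).
p_b = sum(likelihood[urn] * prior[urn] for urn in prior)

print(p_b)  # 9/2000
```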

So all of this is equal to the probability of B, and we can see that Bayes' Theorem is applicable here. We could calculate the probability of A2 given B with the same equation. However, because there are only two possibilities, the two posterior probabilities must sum to 1.

If we solve one equation, we've also solved the other.

So P(B|A1), as we said a minute ago, is 8/1,000. That's times 1/2, divided by the sum of 8/1,000 times 1/2 and 1/1,000 times 1/2, where 1/1,000 is the probability of observing three white marbles given a 10% chance of drawing a white marble from Urn 2:

P(A1|B) = (8/1,000 × 1/2) / (8/1,000 × 1/2 + 1/1,000 × 1/2)

We cancel out the 1/2s and the 1/1,000s, and we have 8/(8 + 1) = 8/9. That is the probability that we observed Urn 1.

The probability that we observed Urn 2 is then 1 minus 8/9, or 1/9. This is an application of Bayes' theorem using inverse probability.
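The whole urn calculation can be reproduced with a short Python sketch. The `posterior` function and the dictionary keys are hypothetical names introduced for this illustration; the function applies Bayes' theorem to any finite set of candidate processes, assuming the supplied priors sum to 1.

```python
from fractions import Fraction


def posterior(likelihoods, priors):
    """Return P(process | data) for every process, using Bayes' theorem."""
    # Denominator P(B): sum over processes of P(B | process) * P(process).
    evidence = sum(likelihoods[k] * priors[k] for k in priors)
    return {k: likelihoods[k] * priors[k] / evidence for k in priors}


# Per-draw probability of white for each urn, as given in the example.
p_white = {"urn1": Fraction(2, 10), "urn2": Fraction(1, 10)}

# Likelihood of three whites in a row, drawing with replacement.
likelihoods = {k: p ** 3 for k, p in p_white.items()}

# Equal priors from the principle of indifference.
priors = {k: Fraction(1, 2) for k in p_white}

result = posterior(likelihoods, priors)
print(result["urn1"], result["urn2"])  # 8/9 1/9
```

Passing different priors to `posterior` shows how much the conclusion depends on the assumption that both urns were equally probable before any marbles were drawn.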