
BAYES THEOREM

Bayes theorem provides a way to calculate the probability of a hypothesis based on its prior probability, the probabilities of observing various data given the hypothesis, and the observed data itself.

Notations

  • P(h): the prior probability of h; reflects any background knowledge about the chance that h is correct
  • P(D): the prior probability of D; the probability that D will be observed
  • P(D|h): the probability of observing D given a world in which h holds
  • P(h|D): the posterior probability of h; reflects confidence that h holds after D has been observed

Bayes theorem is the cornerstone of Bayesian learning methods because it provides a way to calculate the posterior probability P(h|D) from the prior probability P(h), together with P(D) and P(D|h):

  P(h|D) = P(D|h) P(h) / P(D)

  • P(h|D) increases with P(h) and with P(D|h) according to Bayes theorem.
  • P(h|D) decreases as P(D) increases, because the more probable it is that D will be observed independent of h, the less evidence D provides in support of h.
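These two relationships can be sketched directly in Python; the probability values below are arbitrary placeholders chosen for illustration, not taken from the text:

```python
def posterior(prior_h, p_d_given_h, p_d):
    """Bayes theorem: P(h|D) = P(D|h) * P(h) / P(D)."""
    return p_d_given_h * prior_h / p_d

# The posterior grows with the prior P(h) and the likelihood P(D|h) ...
print(round(posterior(prior_h=0.3, p_d_given_h=0.9, p_d=0.5), 2))  # 0.54
# ... and shrinks as P(D) grows (D likely to be observed regardless of h)
print(round(posterior(prior_h=0.3, p_d_given_h=0.9, p_d=0.9), 2))  # 0.3
```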

Maximum a Posteriori (MAP) Hypothesis

  • In many learning scenarios, the learner considers some set of candidate hypotheses H and is interested in finding the most probable hypothesis h ∈ H given the observed data D. Any such maximally probable hypothesis is called a maximum a posteriori (MAP) hypothesis.
  • Using Bayes theorem to calculate the posterior probability of each candidate hypothesis, hMAP is a MAP hypothesis provided

      hMAP = argmax_{h ∈ H} P(h|D)
           = argmax_{h ∈ H} P(D|h) P(h) / P(D)
           = argmax_{h ∈ H} P(D|h) P(h)

  • P(D) can be dropped, because it is a constant independent of h
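The MAP selection rule can be sketched as below; the hypothesis names and probability values are made up purely for illustration:

```python
def map_hypothesis(hypotheses, prior, likelihood):
    """Return the h maximizing P(D|h) * P(h); P(D) is a constant and is dropped."""
    return max(hypotheses, key=lambda h: likelihood[h] * prior[h])

hypotheses = ["h1", "h2"]
prior = {"h1": 0.6, "h2": 0.4}        # P(h)
likelihood = {"h1": 0.2, "h2": 0.7}   # P(D|h)
print(map_hypothesis(hypotheses, prior, likelihood))  # h2, since 0.7*0.4 > 0.2*0.6
```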

Maximum Likelihood (ML) Hypothesis

  • In some cases, it is assumed that every hypothesis in H is equally probable a priori (P(hi) = P(hj) for all hi and hj in H).
  • In this case the expression to be maximized, P(D|h) P(h), simplifies further: one need only consider the term P(D|h) to find the most probable hypothesis:

      hML = argmax_{h ∈ H} P(D|h)

P(D|h) is often called the likelihood of the data D given h, and any hypothesis that maximizes P(D|h) is called a maximum likelihood (ML) hypothesis.
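Under equal priors the MAP selection above reduces to maximizing the likelihood alone, as this sketch shows (again with made-up numbers):

```python
def ml_hypothesis(hypotheses, likelihood):
    """Return the h maximizing P(D|h); equivalent to MAP when all P(h) are equal."""
    return max(hypotheses, key=lambda h: likelihood[h])

likelihood = {"h1": 0.2, "h2": 0.7}   # P(D|h)
print(ml_hypothesis(["h1", "h2"], likelihood))  # h2
```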

Example

  • Consider a medical diagnosis problem in which there are two alternative hypotheses: (1) that the patient has a particular form of cancer, and (2) that the patient does not. The available data is from a particular laboratory test with two possible outcomes: + (positive) and - (negative).
  • We have prior knowledge that over the entire population of people only a fraction 0.008 have this disease. Furthermore, the lab test is only an imperfect indicator of the disease.
  • The test returns a correct positive result in only 98% of the cases in which the disease is actually present and a correct negative result in only 97% of the cases in which the disease is not present. In other cases, the test returns the opposite result.
  • The above situation can be summarized by the following probabilities:

      P(cancer) = 0.008        P(¬cancer) = 0.992
      P(+|cancer) = 0.98       P(-|cancer) = 0.02
      P(+|¬cancer) = 0.03      P(-|¬cancer) = 0.97

Suppose a new patient is observed for whom the lab test returns a positive (+) result. Should we diagnose the patient as having cancer or not? Applying the MAP rule:

    P(+|cancer) P(cancer) = 0.98 × 0.008 ≈ 0.0078
    P(+|¬cancer) P(¬cancer) = 0.03 × 0.992 ≈ 0.0298

Since 0.0298 > 0.0078, hMAP = ¬cancer: the MAP hypothesis is that the patient does not have cancer.

The exact posterior probabilities can also be determined by normalizing the above quantities so that they sum to 1: for example, P(cancer|+) = 0.0078 / (0.0078 + 0.0298) ≈ 0.21.
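The diagnosis can be checked numerically; this short script simply replays the probabilities given in the example:

```python
# Probabilities stated in the example
p_cancer, p_no_cancer = 0.008, 0.992
p_pos_given_cancer, p_pos_given_no_cancer = 0.98, 0.03

# Unnormalized posteriors P(+|h) * P(h) for each hypothesis
score_cancer = p_pos_given_cancer * p_cancer            # 0.98 * 0.008 = 0.00784
score_no_cancer = p_pos_given_no_cancer * p_no_cancer   # 0.03 * 0.992 = 0.02976

# MAP decision: the larger product wins
print("hMAP:", "cancer" if score_cancer > score_no_cancer else "no cancer")

# Exact posterior by normalizing so the two quantities sum to 1
p_cancer_given_pos = score_cancer / (score_cancer + score_no_cancer)
print(round(p_cancer_given_pos, 2))  # 0.21
```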


Basic formulas for calculating probabilities are summarized in the table below:

    Product rule:                  P(A ∧ B) = P(A|B) P(B) = P(B|A) P(A)
    Sum rule:                      P(A ∨ B) = P(A) + P(B) - P(A ∧ B)
    Bayes theorem:                 P(h|D) = P(D|h) P(h) / P(D)
    Theorem of total probability:  if events A1, ..., An are mutually exclusive with Σ P(Ai) = 1, then P(B) = Σ P(B|Ai) P(Ai)



