
The Evolution of Probabilities Associated with Hypotheses

  • In figure (a), all hypotheses start with the same prior probability.
  • In figures (b) and (c), as training data accumulates, the posterior probability of every inconsistent hypothesis drops to zero, while the total probability, still summing to 1, is shared equally among the remaining consistent hypotheses; a toy sketch of this redistribution follows the list.
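
A minimal sketch of this redistribution (the 2-bit hypothesis space, attribute values, and training examples below are illustrative assumptions, not taken from the figure): the prior is uniform, and P(D|h) is 1 if h is consistent with the data D and 0 otherwise, so each new example simply zeroes out the inconsistent hypotheses and renormalises the rest.

from itertools import combinations, product

inputs = list(product([0, 1], repeat=2))                  # the 4 possible 2-bit inputs
hypotheses = [frozenset(c) for r in range(len(inputs) + 1)
              for c in combinations(inputs, r)]           # every subset of inputs: 16 hypotheses

prior = {h: 1.0 / len(hypotheses) for h in hypotheses}    # figure (a): uniform prior over H

def posterior(prior, data):
    # P(D|h) = 1 iff h labels every (x, y) in data correctly, so the posterior
    # drops inconsistent hypotheses and renormalises the rest (figures (b), (c)).
    consistent = {h: p for h, p in prior.items()
                  if all((x in h) == y for x, y in data)}
    total = sum(consistent.values())
    return {h: p / total for h, p in consistent.items()}

data = [((0, 1), True), ((1, 1), True), ((0, 0), False)]  # noise-free training examples
for k in range(1, len(data) + 1):
    post = posterior(prior, data[:k])
    print(k, "examples:", len(post), "consistent hypotheses, each with P(h|D) =", 1 / len(post))

Running it shows the posterior mass concentrating on fewer and fewer hypotheses as examples arrive, exactly as the figure panels describe.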

MAP Hypotheses and Consistent Learners

  • A learning algorithm is a consistent learner if it outputs a hypothesis that commits zero errors over the training examples.
  • Every consistent learner outputs a MAP hypothesis if we assume a uniform prior probability distribution over H (P(hi) = P(hj) for all i, j) and deterministic, noise-free training data (P(D|h) = 1 if D and h are consistent, and 0 otherwise); a short derivation follows.
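
A brief sketch of why this holds under the two assumptions above, writing VS for the version space (the set of hypotheses in H consistent with D):

  P(h|D) = P(D|h) P(h) / P(D) = (1 × 1/|H|) / (|VS|/|H|) = 1/|VS|   for every consistent h
  P(h|D) = 0                                                        for every inconsistent h, since P(D|h) = 0

where P(D) = Σh P(D|h) P(h) = |VS|/|H|, because only consistent hypotheses contribute to the sum. All consistent hypotheses share the same, maximal posterior, so whichever one a consistent learner outputs is a MAP hypothesis.
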
Example:
  • Because FIND-S outputs a consistent hypothesis, it will output a MAP hypothesis under the probability distributions P(h) and P(D|h) defined above (a minimal sketch of FIND-S appears after this list).
  • Are there other probability distributions for P(h) and P(D|h) under which FIND-S outputs MAP hypotheses? Yes.
  • Because FIND-S outputs a maximally specific hypothesis from the version space, its output hypothesis will be a MAP hypothesis relative to any prior probability distribution that favours more specific hypotheses.
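
A minimal sketch of FIND-S (assuming the usual conjunctive attribute-value hypothesis representation; the attribute values and training data below are illustrative, not from the post): it starts with the most specific hypothesis and generalises an attribute only when a positive example forces it to, ignoring negative examples entirely.

def find_s(examples):
    """examples: list of (attribute_tuple, label) pairs with boolean labels."""
    n = len(examples[0][0])
    h = ["0"] * n                      # most specific hypothesis: matches no instance
    for x, label in examples:
        if not label:
            continue                   # FIND-S ignores negative examples
        for i, value in enumerate(x):
            if h[i] == "0":
                h[i] = value           # first positive example: copy its attribute values
            elif h[i] != value:
                h[i] = "?"             # generalise an attribute that no longer fits
    return h

training = [
    (("Sunny", "Warm", "Normal", "Strong"), True),
    (("Sunny", "Warm", "High",   "Strong"), True),
    (("Rainy", "Cold", "High",   "Strong"), False),
]
print(find_s(training))   # ['Sunny', 'Warm', '?', 'Strong']

Because the returned hypothesis is generalised no further than the positive examples require, it is a maximally specific member of the version space, which is why it remains a MAP hypothesis under any prior that favours more specific hypotheses.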

Note

  • The Bayesian framework is a way to characterize the behaviour of learning algorithms.
  • By identifying the probability distributions P(h) and P(D|h) under which the output is an optimal hypothesis, the implicit assumptions of the algorithm can be characterized (its inductive bias).
  • Inductive inference is modelled by an equivalent probabilistic reasoning system based on Bayes theorem.

