
MINIMUM DESCRIPTION LENGTH PRINCIPLE

  • A Bayesian perspective on Occam’s razor
  • Motivated by interpreting the definition of hMAP in the light of basic concepts from information theory.


Recall the definition of the MAP hypothesis,

hMAP = argmax_{h ∈ H} P(D|h) P(h)

which can be equivalently expressed in terms of maximizing the log2,

hMAP = argmax_{h ∈ H} [ log2 P(D|h) + log2 P(h) ]

or, alternatively, minimizing the negative of this quantity:

hMAP = argmin_{h ∈ H} [ −log2 P(D|h) − log2 P(h) ]        (1)
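To make the equivalence concrete, here is a minimal sketch in Python, using hypothetical priors P(h) and likelihoods P(D|h) chosen purely for illustration: maximizing P(D|h)P(h) and minimizing −log2 P(D|h) − log2 P(h) pick out the same hypothesis.

```python
import math

# Hypothetical hypothesis space with made-up priors P(h) and likelihoods P(D|h).
priors = {"h1": 0.60, "h2": 0.30, "h3": 0.10}
likelihoods = {"h1": 0.02, "h2": 0.10, "h3": 0.20}

# MAP hypothesis: maximize the posterior numerator P(D|h) * P(h).
h_map = max(priors, key=lambda h: likelihoods[h] * priors[h])

# Equation (1): minimize -log2 P(D|h) - log2 P(h); the monotone log2
# transform does not change which hypothesis wins.
h_min = min(priors, key=lambda h: -math.log2(likelihoods[h]) - math.log2(priors[h]))

assert h_map == h_min  # both criteria select the same hypothesis (h2 here)
```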

Equation (1) can be interpreted as a statement that short hypotheses are preferred, assuming a particular representation scheme for encoding hypotheses and data:

  • −log2 P(h): the description length of hypothesis h under the optimal encoding for the hypothesis space H, i.e., LCH(h) = −log2 P(h), where CH is the optimal code for hypothesis space H.
  • −log2 P(D|h): the description length of the training data D given hypothesis h, under its optimal encoding, i.e., LCD|h(D|h) = −log2 P(D|h), where CD|h is the optimal code for describing data D assuming that both the sender and the receiver know the hypothesis h.
  • Rewriting Equation (1) in these terms shows that hMAP is the hypothesis h that minimizes the sum of the description length of the hypothesis and the description length of the data given the hypothesis:

hMAP = argmin_{h ∈ H} [ LCH(h) + LCD|h(D|h) ]

where CH and CD|h are the optimal encodings for H and for D given h, respectively.
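The two terms of this sum can be read as code lengths in bits. The sketch below, reusing the hypothetical probabilities from the earlier example, prints the "hypothesis bits" and "data-given-hypothesis bits" for each hypothesis: a hypothesis with a higher prior gets a shorter code, while a hypothesis that fits the data better needs fewer bits to describe D, and hMAP is the one minimizing the total.

```python
import math

# Shannon-optimal code lengths (in bits) under the hypothetical probabilities
# used above: LCH(h) = -log2 P(h) and LCD|h(D|h) = -log2 P(D|h).
priors = {"h1": 0.60, "h2": 0.30, "h3": 0.10}
likelihoods = {"h1": 0.02, "h2": 0.10, "h3": 0.20}

for h in priors:
    hyp_bits = -math.log2(priors[h])        # bits to transmit the hypothesis
    data_bits = -math.log2(likelihoods[h])  # bits to transmit D given h
    print(h, round(hyp_bits, 2), round(data_bits, 2), round(hyp_bits + data_bits, 2))

# h1 is cheapest to describe but fits D poorly; h3 fits D best but is costly
# to describe; h2 minimizes the total and is therefore hMAP.
```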


The Minimum Description Length (MDL) principle recommends choosing the hypothesis that minimizes the sum of these two description lengths.


Minimum Description Length principle: choose hMDL where

hMDL = argmin_{h ∈ H} [ LC1(h) + LC2(D|h) ]

and codes C1 and C2 are used to represent the hypothesis and the data given the hypothesis, respectively.

The above analysis shows that if we choose C1 to be the optimal encoding of hypotheses, CH, and C2 to be the optimal encoding of the data given the hypothesis, CD|h, then hMDL = hMAP.
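As a minimal sketch of this last point, again with the hypothetical probabilities from above: when C1 and C2 are the optimal codes CH and CD|h, the MDL choice coincides with hMAP; with some other code C2 (here an assumed fixed-length description of D, used only for illustration), the two choices can differ.

```python
import math

priors = {"h1": 0.60, "h2": 0.30, "h3": 0.10}
likelihoods = {"h1": 0.02, "h2": 0.10, "h3": 0.20}

def mdl_choice(hyp_len, data_len):
    """Hypothesis minimizing description length of h plus description of D given h."""
    return min(priors, key=lambda h: hyp_len(h) + data_len(h))

# C1 = CH and C2 = CD|h are the optimal codes: hMDL equals hMAP.
h_mdl = mdl_choice(lambda h: -math.log2(priors[h]),
                   lambda h: -math.log2(likelihoods[h]))
h_map = max(priors, key=lambda h: likelihoods[h] * priors[h])
assert h_mdl == h_map

# With a non-optimal C2 (say, an assumed fixed 10-bit description of D for
# every hypothesis), the data term no longer discriminates between
# hypotheses, and the MDL choice need not equal hMAP.
h_other = mdl_choice(lambda h: -math.log2(priors[h]), lambda h: 10.0)
print(h_mdl, h_other)  # with these numbers: h2 vs h1
```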
