
Derivation of the Gradient Descent Rule

How do we calculate the direction of steepest descent along the error surface?

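The direction of steepest descent can be found by computing the derivative of E with respect to each component of the weight vector $\vec{w}$. This vector derivative is called the gradient of E with respect to $\vec{w}$, written as

$$\nabla E(\vec{w}) \equiv \left[ \frac{\partial E}{\partial w_0}, \frac{\partial E}{\partial w_1}, \ldots, \frac{\partial E}{\partial w_n} \right]$$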
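
As a rough numerical illustration of the gradient as a vector of partial derivatives, the sketch below approximates each 𝜕E/𝜕w_i with a central difference. The function name `numerical_gradient`, the step size `h`, and the example error surface are assumptions chosen for illustration; none of them come from the derivation itself.

```python
import numpy as np

def numerical_gradient(E, w, h=1e-6):
    """Approximate the gradient of E at w using central differences."""
    w = np.asarray(w, dtype=float)
    grad = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = h
        # Partial derivative of E with respect to the single component w_i.
        grad[i] = (E(w + step) - E(w - step)) / (2 * h)
    return grad

# Hypothetical error surface, used only as an example: E(w) = w_0^2 + 3 * w_1^2.
def example_error(w):
    return w[0] ** 2 + 3 * w[1] ** 2

grad = numerical_gradient(example_error, [1.0, 2.0])
```

For this example surface the analytic gradient is [2 w_0, 6 w_1], so the approximation can be checked by hand at any point.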

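Since the gradient specifies the direction of steepest increase of E, the training rule for gradient descent is

$$\vec{w} \leftarrow \vec{w} + \Delta\vec{w}, \qquad \Delta\vec{w} = -\eta \, \nabla E(\vec{w})$$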

  • Here η is a positive constant called the learning rate, which determines the step size in the gradient descent search.
  • The negative sign is present because we want to move the weight vector in the direction that decreases E.

This training rule can also be written in its component form:

$$w_i \leftarrow w_i + \Delta w_i, \qquad \Delta w_i = -\eta \, \frac{\partial E}{\partial w_i}$$
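
The training rule can be sketched as a short loop. This is a minimal illustration, not a definitive implementation: the function `gradient_descent`, the defaults `eta=0.1` and `n_steps=100`, and the quadratic error surface are all assumptions introduced here.

```python
import numpy as np

def gradient_descent(grad_E, w_init, eta=0.1, n_steps=100):
    """Repeatedly step the weight vector against the gradient of E."""
    w = np.asarray(w_init, dtype=float)
    for _ in range(n_steps):
        delta_w = -eta * grad_E(w)  # step against the gradient, scaled by eta
        w = w + delta_w             # apply the step to every component w_i at once
    return w

# Hypothetical error surface: E(w) = (w_0 - 1)^2 + (w_1 + 2)^2, minimized at (1, -2).
def example_grad(w):
    return np.array([2 * (w[0] - 1), 2 * (w[1] + 2)])

w_final = gradient_descent(example_grad, [0.0, 0.0])
```

Because NumPy applies the update elementwise, the vector form and the component form are the same computation. A learning rate that is too large for the surface would make the iterates diverge instead of settling at the minimum.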

To apply this rule, the gradient must be calculated at each step. The vector of $\partial E / \partial w_i$ derivatives that form the gradient can be obtained by differentiating E from Equation (2) with respect to each weight $w_i$.
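
Equation (2) is not restated in this section, so the sketch below rests on an assumption: that it is the usual sum-of-squared-errors measure for a linear unit, E = ½ Σ_d (t_d − o_d)² with output o_d = w · x_d. Under that assumption, differentiating gives 𝜕E/𝜕w_i = Σ_d (t_d − o_d)(−x_id), which the code computes for all i at once. The names `squared_error`, `squared_error_gradient`, and the tiny data set are hypothetical.

```python
import numpy as np

def squared_error(w, X, t):
    """Assumed form of E: half the sum of squared errors of a linear unit."""
    o = X @ w  # outputs o_d = w . x_d, one per training example
    return 0.5 * np.sum((t - o) ** 2)

def squared_error_gradient(w, X, t):
    """Gradient of the assumed E: dE/dw_i = sum_d (t_d - o_d) * (-x_id)."""
    o = X @ w
    return -X.T @ (t - o)

# Hypothetical training data generated from t = 1 + 2x.
# The first column is a constant input x_0 = 1, so w_0 acts as a bias weight.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
t = np.array([1.0, 3.0, 5.0])

initial_gradient = squared_error_gradient(np.zeros(2), X, t)

w = np.zeros(2)
eta = 0.1  # learning rate, assumed small enough for this data
for _ in range(2000):
    w = w - eta * squared_error_gradient(w, X, t)
```

With these three noise-free examples the weights should approach the generating values (1, 2), and the error should approach zero.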
