
REMARKS ON THE BACKPROPAGATION ALGORITHM

1.  Convergence and Local Minima

  • When applied to multilayer networks, BACKPROPAGATION is only guaranteed to converge toward some local minimum in E, and not necessarily to the global minimum error.
  • Despite the lack of assured convergence to the global minimum error, BACKPROPAGATION is a highly effective function approximation method in practice.
  • Some intuition about the problem of local minima can be gained by considering the manner in which network weights evolve as the number of training iterations increases.

Common heuristics to attempt to alleviate the problem of local minima include:

  1. Add a momentum term to the weight-update rule. Momentum can sometimes carry the gradient descent procedure through narrow local minima (a minimal momentum update is sketched after this list).
  2. Use stochastic gradient descent rather than true gradient descent.
  3. Train multiple networks using the same data, but initialize each network with different random weights.
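
The first heuristic can be made concrete with a few lines of code. The sketch below is a minimal, illustrative momentum-augmented gradient-descent update; the function name `momentum_update`, the learning rate, the momentum value, and the toy one-dimensional error surface E(w) = w² are assumptions for illustration, not part of the text above.

```python
import numpy as np

# Minimal sketch of heuristic 1: a momentum term in the weight-update rule.
# Hyperparameter values and the toy error surface are illustrative assumptions.
def momentum_update(w, grad, velocity, learning_rate=0.1, momentum=0.9):
    """One gradient-descent step that retains a fraction of the previous step."""
    velocity = momentum * velocity - learning_rate * grad
    return w + velocity, velocity

# Toy error surface E(w) = w^2, whose gradient is 2w.
w, velocity = np.array([2.0]), np.zeros(1)
for _ in range(200):
    grad = 2 * w
    w, velocity = momentum_update(w, grad, velocity)
print(w)  # close to the minimum at w = 0
```

Because the velocity accumulates past gradients, the update can keep moving in a consistent direction even when the local gradient briefly points elsewhere, which is what allows it to roll through narrow local minima.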

2.  Representational Power of Feedforward Networks 

What set of functions can be represented by feed-forward networks?

The answer depends on the width and depth of the networks. Three quite general results are known about which function classes can be represented by which types of networks:

  1. Boolean functions – Every Boolean function can be represented exactly by some network with two layers of units, although the number of hidden units required can, in the worst case, grow exponentially with the number of network inputs (a hand-built two-layer example is sketched after this list).
  2. Continuous functions – Every bounded continuous function can be approximated with arbitrarily small error by a network with two layers of units.
  3. Arbitrary functions – Any function can be approximated to arbitrary accuracy by a network with three layers of units.
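
As a small illustration of the first result, the sketch below hand-wires a two-layer network of threshold units that computes XOR exactly. The particular weights and thresholds are illustrative assumptions; any set of weights realizing OR, AND, and "OR and not AND" would do.

```python
import numpy as np

# A two-layer network of threshold (step) units that represents XOR exactly.
# The weight and threshold values are illustrative assumptions.
def step(z):
    return (z > 0).astype(float)

W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])   # hidden unit 1 computes OR(x1, x2); hidden unit 2 computes AND(x1, x2)
W2 = np.array([1.0, -1.0])    # output fires when OR is true and AND is false
b2 = -0.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = step(np.array(x, dtype=float) @ W1 + b1)
    y = step(h @ W2 + b2)
    print(x, int(y))          # prints the XOR truth table: 0, 1, 1, 0
```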

3.  Hypothesis Space Search and Inductive Bias

  • The hypothesis space is the n-dimensional Euclidean space of the n network weights, and this space is continuous.
  • Because the space is continuous and E is differentiable with respect to the weights, there is a well-defined error gradient that provides a very useful structure for organizing the search for the best hypothesis (a minimal gradient computation is sketched after this list).
  • It is difficult to characterize precisely the inductive bias of the BACKPROPAGATION algorithm, because it depends on the interplay between the gradient descent search and the way in which the weight space spans the space of representable functions. However, one can roughly characterize it as smooth interpolation between data points.
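
The sketch below illustrates why the continuous, differentiable hypothesis space matters: even for a single sigmoid unit, the squared error E is a differentiable function of the weight vector, so its gradient can be computed and followed downhill. The tiny dataset, the single-unit model, the learning rate, and the iteration count are illustrative assumptions.

```python
import numpy as np

# E(w) is differentiable with respect to the weights, so gradient descent
# has a well-defined direction at every point in weight space.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])   # illustrative training inputs
t = np.array([1.0, 0.0, 1.0])                        # illustrative target outputs

def error(w):
    o = sigmoid(X @ w)
    return 0.5 * np.sum((t - o) ** 2)

def error_gradient(w):
    o = sigmoid(X @ w)
    return -((t - o) * o * (1 - o)) @ X              # dE/dw for the squared-error criterion

w = np.zeros(2)
for _ in range(2000):
    w -= 0.5 * error_gradient(w)                     # move downhill in the 2-dimensional weight space
print(w, error(w))                                   # the error decreases toward a (local) minimum
```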

4.  Hidden Layer Representations 

BACKPROPAGATION can define new hidden layer features that are not explicit in the input representation, but which capture properties of the input instances that are most relevant to learning the target function. 

Consider, for example, the network shown in the figure below.

  • Consider training the network shown in the figure to learn the simple target function f(x) = x, where x is a vector containing seven 0's and a single 1.
  • The network must learn to reproduce the eight inputs at the corresponding eight output units. Although this is a simple function, the network is constrained to use only three hidden units, so the essential information from all eight input units must be captured by the three learned hidden units.
  • When BACKPROPAGATION is applied to this task, using each of the eight possible vectors as training examples, it successfully learns the target function. Examining the hidden unit values produced by the learned network for each of the eight possible input vectors shows that the learned encoding is similar to the familiar standard binary encoding of eight values using three bits (e.g., 000, 001, 010, ..., 111). The exact hidden unit values from one typical run are shown in the figure.
  • This ability of multilayer networks to automatically discover useful representations at the hidden layers is a key feature of ANN learning (a small training sketch of this 8×3×8 network follows below).
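
The sketch below reproduces this experiment in miniature: an 8×3×8 network of sigmoid units trained by backpropagation on the eight one-hot input vectors. The layer sizes match the description above; the learning rate, initialization range, and iteration count are illustrative assumptions, and individual runs may need more iterations to settle into a clean code.

```python
import numpy as np

# 8x3x8 identity-learning network: eight one-hot inputs reproduced at the
# outputs through three sigmoid hidden units. Hyperparameters are
# illustrative assumptions.
rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

X = np.eye(8)                                   # the eight training vectors (seven 0's and a single 1)
W1 = rng.uniform(-0.1, 0.1, (8, 3)); b1 = np.zeros(3)
W2 = rng.uniform(-0.1, 0.1, (3, 8)); b2 = np.zeros(8)
eta = 0.3

for _ in range(30000):
    h = sigmoid(X @ W1 + b1)                    # hidden layer activations
    o = sigmoid(h @ W2 + b2)                    # output layer activations
    delta_o = (o - X) * o * (1 - o)             # output-unit error terms (squared-error criterion)
    delta_h = (delta_o @ W2.T) * h * (1 - h)    # error terms backpropagated to the hidden units
    W2 -= eta * h.T @ delta_o; b2 -= eta * delta_o.sum(axis=0)
    W1 -= eta * X.T @ delta_h; b1 -= eta * delta_h.sum(axis=0)

# Rounding the learned hidden values typically reveals a distinct 3-bit code
# for each of the eight inputs, much like standard binary encoding.
print(np.round(sigmoid(X @ W1 + b1), 2))
```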

5.  Generalization, Overfitting, and Stopping Criterion 

What is an appropriate condition for terminating the weight update loop? One choice is to continue training until the error E on the training examples falls below some predetermined threshold.

To see the dangers of minimizing the error over the training data, consider how the error E varies with the number of weight-update iterations.

[Figure: error E over the training set and over a separate validation set, plotted against the number of weight-update iterations]

  • Consider first the top plot in this figure. The lower of the two lines shows the monotonically decreasing error E over the training set as the number of gradient descent iterations grows. The upper line shows the error E measured over a separate validation set of examples, distinct from the training examples. This line measures the generalization accuracy of the network, that is, the accuracy with which it fits examples beyond the training data.
  • The error measured over the validation examples first decreases, then increases, even as the error over the training examples continues to decrease. How can this occur? It occurs because the weights are being tuned to fit idiosyncrasies of the training examples that are not representative of the general distribution of examples. The large number of weight parameters in an ANN provides many degrees of freedom for fitting such idiosyncrasies.
  • Why does overfitting tend to occur during later iterations rather than earlier ones? Network weights are initialized to small random values, which can describe only very smooth decision surfaces; as training proceeds, some weights grow in magnitude and the complexity of the learned surface increases. Given enough weight-tuning iterations, BACKPROPAGATION can therefore create overly complex decision surfaces that fit noise in the training data or unrepresentative characteristics of the particular training sample. A common remedy is to monitor the error on the validation set and keep the weights that give the lowest validation error, as sketched below.
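
A minimal sketch of this validation-set criterion is given below: train by gradient descent on the training set, measure the error on a held-out validation set after each weight update, and remember the weights from the iteration with the lowest validation error. The simple linear model, the synthetic data, and the hyperparameters are illustrative assumptions.

```python
import numpy as np

# Keep the weights with the lowest validation error instead of training
# until the training error is minimized. Data, model, and hyperparameters
# are illustrative assumptions.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (40, 5))
y = X[:, 0] + 0.1 * rng.normal(size=40)            # target depends only on the first input, plus noise
X_train, y_train = X[:30], y[:30]                  # training examples
X_val, y_val = X[30:], y[30:]                      # held-out validation examples

def error(w, A, b):
    return 0.5 * np.mean((A @ w - b) ** 2)

w = np.zeros(5)
best_w, best_val = w.copy(), error(w, X_val, y_val)
for i in range(5000):
    grad = X_train.T @ (X_train @ w - y_train) / len(y_train)
    w -= 0.05 * grad                               # one weight-update iteration on the training set
    val = error(w, X_val, y_val)
    if val < best_val:                             # validation error still improving: remember these weights
        best_w, best_val = w.copy(), val

print("final training error: ", error(w, X_train, y_train))
print("best validation error:", best_val)          # best_w is the hypothesis we would actually keep
```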
