This network is called a three-layer network because it has three weight layers. In linear systems, there is no real benefit to cascading multiple layers of linear networks. The equivalent weight matrix for the total system is simply the product of the weight matrices of the different layers.
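The collapse of cascaded linear layers into a single matrix can be checked numerically. The following is a minimal sketch, assuming NumPy and arbitrary illustrative layer sizes (3 inputs, hidden widths 4 and 5, 2 outputs); the names `W1`, `W2`, `W3` and `W_equiv` are hypothetical and not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three purely linear weight layers (sizes are illustrative assumptions).
W1 = rng.standard_normal((4, 3))   # input (3)  -> hidden 1 (4)
W2 = rng.standard_normal((5, 4))   # hidden 1   -> hidden 2 (5)
W3 = rng.standard_normal((2, 5))   # hidden 2   -> output (2)

x = rng.standard_normal(3)

# Pass the input through the three layers one after another.
layered = W3 @ (W2 @ (W1 @ x))

# Equivalent single weight matrix: the product of the layer matrices.
W_equiv = W3 @ W2 @ W1
collapsed = W_equiv @ x

# The two results agree up to floating-point rounding.
same = np.allclose(layered, collapsed)
```

Note that the product is taken in the order the layers are applied, with the last layer's matrix on the left.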
The situation is quite different if nonlinear hidden neuron units are inserted between the input and the output layers. In this case, it seems natural to assume that the more layers are used, the greater the power the network possesses. However, this is not the case in practice. An excessive number of layers often proves counterproductive: it may slow convergence in backpropagation learning. Two possible reasons are that the error signals may be numerically degraded when propagating across too many layers, and that extra layers tend to create additional local minima. Thus, it is essential to identify the proper number of layers. Generally speaking, two-layer networks should be adequate as universal approximators of any nonlinear function. It has been further demonstrated that a three-layer network suffices to separate any (convex or nonconvex) polyhedral decision region from its background. In summary, two or three layers should be adequate for most applications.
The backpropagation algorithm offers an effective approach to computing the gradients. It can be applied to any optimization formulation as well as to the DBNN formulation.
A linear basis function (LBF) multilayer network is characterized by the following dynamics equations
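A common way to write such dynamics is sketched below. The notation is an assumption made for illustration: u_i(l) is the net input of unit i in layer l, a_i(l) its activation, w_ij(l) and θ_i(l) the weights and thresholds of layer l, f the nonlinear activation function, N_l the number of units in layer l, and L the number of layers, with a_j(0) taken to be the j-th input.

```latex
\[
u_i(l) = \sum_{j=1}^{N_{l-1}} w_{ij}(l)\, a_j(l-1) + \theta_i(l), \qquad
a_i(l) = f\bigl(u_i(l)\bigr), \qquad 1 \le i \le N_l,\ 1 \le l \le L
\]
```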
The objective of this algorithm is to train the weights so as to minimize E. The basic gradient-type learning formula is
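In a standard gradient-descent form, this update can be sketched as follows. The symbols are assumptions for illustration: w_ij(l) denotes a weight of layer l and η > 0 a learning rate.

```latex
\[
w_{ij}(l) \leftarrow w_{ij}(l) + \Delta w_{ij}(l), \qquad
\Delta w_{ij}(l) = -\eta\, \frac{\partial E}{\partial w_{ij}(l)}
\]
```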
The aforementioned algorithm can be applied to training approximation-based networks. In this case, the objective is to train the weights and the thresholds so as to minimize the least-squares error between the teacher and the actual response. That is,
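For a single training pattern, a least-squares-error (LSE) energy of this kind is usually written as below. The notation is assumed for illustration: t_i is the teacher value for output unit i, a_i(L) the actual response of that unit in the top layer L, and N_L the number of output units. With several training patterns, the same expression would be summed over the patterns.

```latex
\[
E = \frac{1}{2} \sum_{i=1}^{N_L} \bigl[ t_i - a_i(L) \bigr]^2
\]
```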
For an energy function other than the LSE, the initial condition can be derived.
in the sequence of
From this equation, it follows that
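One way to write the resulting recursion is sketched here. It assumes that the error signal is defined as δ_i(l) = −∂E/∂a_i(l), that E is the LSE energy for a single pattern with teacher values t_i, and that u_i(l), a_i(l), w_ij(l), f, N_l, L and η denote net inputs, activations, weights, the activation function, layer sizes, the number of layers and the learning rate. All of these symbols are notational assumptions.

```latex
\[
\delta_i(L) = t_i - a_i(L), \qquad
\delta_j(l-1) = \sum_{i=1}^{N_l} \delta_i(l)\, f'\bigl(u_i(l)\bigr)\, w_{ij}(l), \qquad l = L, L-1, \ldots, 2
\]
\[
\Delta w_{ij}(l) = \eta\, \delta_i(l)\, f'\bigl(u_i(l)\bigr)\, a_j(l-1)
\]
```

The first line follows from the chain rule applied to a_i(l) = f(u_i(l)), and the second from ∂E/∂w_ij(l) = −δ_i(l) f'(u_i(l)) a_j(l−1).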
The recursive formula is the key to back-propagation learning. It allows the error signal of a lower layer to be computed as a linear combination of the error signals of the upper layer. In this manner, the error signals are back-propagated through all the layers from the top down. This also implies that the influences from an upper layer to a lower layer (and vice versa) can only be effected via the error signals of the intermediate layer.
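The top-down propagation of error signals can be illustrated with a short program. This is a minimal sketch under stated assumptions, not the text's own implementation: it assumes NumPy, a sigmoid activation, the LSE energy for a single pattern, and an error signal defined as the negative derivative of E with respect to a layer's activations. The function names (`forward`, `backprop`, `energy`, `train_step`) and the layer sizes are hypothetical.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def forward(x, weights, thresholds):
    """Return net inputs u[l] and activations a[l]; a[0] is the input."""
    u, a = [None], [x]
    for W, theta in zip(weights, thresholds):
        u.append(W @ a[-1] + theta)      # linear basis function
        a.append(sigmoid(u[-1]))         # nonlinear activation
    return u, a

def energy(x, t, weights, thresholds):
    """Least-squares error between teacher t and the actual response."""
    _, a = forward(x, weights, thresholds)
    return 0.5 * np.sum((t - a[-1]) ** 2)

def backprop(x, t, weights, thresholds):
    """Gradients of E with respect to every weight matrix and threshold."""
    _, a = forward(x, weights, thresholds)
    L = len(weights)
    grad_W, grad_theta = [None] * L, [None] * L
    delta = t - a[L]                     # initial condition at the top layer
    for l in range(L, 0, -1):            # l = L, L-1, ..., 1
        fprime = a[l] * (1.0 - a[l])     # sigmoid derivative f'(u(l))
        local = delta * fprime
        grad_W[l - 1] = -np.outer(local, a[l - 1])
        grad_theta[l - 1] = -local
        # Error signal of layer l-1: a linear combination of the
        # error signals of layer l, weighted by f'(u(l)) and W(l).
        delta = weights[l - 1].T @ local
    return grad_W, grad_theta

def train_step(x, t, weights, thresholds, eta=0.1):
    """One gradient-descent update, applied in place."""
    grad_W, grad_theta = backprop(x, t, weights, thresholds)
    for l in range(len(weights)):
        weights[l] -= eta * grad_W[l]
        thresholds[l] -= eta * grad_theta[l]

# Usage: a two-layer network (3 inputs, 4 hidden units, 2 outputs).
rng = np.random.default_rng(1)
weights = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]
thresholds = [rng.standard_normal(4), rng.standard_normal(2)]
x, t = rng.standard_normal(3), np.array([0.0, 1.0])
train_step(x, t, weights, thresholds)
```

Each pass of the loop in `backprop` uses only the error signal of the layer directly above, which mirrors the remark that influences between non-adjacent layers pass through the intermediate layer.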