Linear Perceptron Networks
The objective of the learning phase is to determine the best discriminant
functions, which in turn dictate the decision boundaries. The linear perceptron
was designed to separate two classes by a linear decision boundary, and
it has since evolved into a number of more sophisticated variants.
We will distinguish between the Linear Perceptron for Binary Classification
and the Linear Perceptron for Multiple Classification.
Linear Perceptron for Binary Classification
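The basic structure of a linear perceptron is shown in
the accompanying figure, with a linear discriminant
function

$$\phi(\mathbf{x}, \mathbf{w}) = \sum_{k=1}^{N} w_k x_k + \theta$$

For convenience, we can regard the threshold value $\theta$
as just an additional weight parameter. Denote $w_{N+1} = \theta$,
then

$$\mathbf{w} = [w_1, w_2, \ldots, w_N, \theta]^T$$

and

$$\mathbf{z} = [x_1, x_2, \ldots, x_N, 1]^T$$

that is, z is the augmented pattern x. Now the linear discriminant
function can be rewritten as

$$\phi(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \mathbf{z}$$

Recall that the decision value d is binary, that is,

$$d = \begin{cases} 1 & \text{if } \phi(\mathbf{x}, \mathbf{w}) > 0 \\ 0 & \text{otherwise} \end{cases}$$

A pattern is assigned to one class when d = 1; otherwise it belongs to
the other class. The teacher determines
whether the pattern is correctly classified. When and only when a misclassification
occurs, the network will be adjusted.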
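The augmented-pattern formulation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `augment`, `discriminant`, and `decide` are hypothetical, the weight values are made up, and the decision is assumed to be 1 when the discriminant is strictly positive and 0 otherwise.

```python
def augment(x):
    """Return the augmented pattern z = [x_1, ..., x_N, 1]."""
    return list(x) + [1.0]


def discriminant(w, z):
    """Linear discriminant: inner product of the augmented weight
    vector w (threshold stored as the last entry) and pattern z."""
    return sum(wk * zk for wk, zk in zip(w, z))


def decide(w, x):
    """Binary decision value d (assumed convention: 1 if the
    discriminant is strictly positive, 0 otherwise)."""
    return 1 if discriminant(w, augment(x)) > 0 else 0


# Hypothetical weights: w_1 = 1, w_2 = -1, threshold theta = 0.5.
w = [1.0, -1.0, 0.5]
d_a = decide(w, [2.0, 1.0])  # discriminant = 2 - 1 + 0.5 = 1.5
d_b = decide(w, [0.0, 2.0])  # discriminant = 0 - 2 + 0.5 = -1.5
```

Storing the threshold as the last weight means the decision needs only one inner product, which is why the augmented form is convenient for the learning rule.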
Algorithm
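Upon the presentation of the mth training pattern $\mathbf{z}^{(m)}$,
the weight vector $\mathbf{w}^{(m)}$ is updated as

$$\mathbf{w}^{(m+1)} = \mathbf{w}^{(m)} + \eta \, (t^{(m)} - d^{(m)}) \, \mathbf{z}^{(m)}$$

where $\eta$ is a positive learning rate, $t^{(m)}$ is the teacher value
for the pattern, and $d^{(m)}$ is the decision value produced by the network.
No change is made when the two values agree.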
More precisely, the above learning rule can be viewed from two perspectives:
Training takes as many sweeps as required; in each sweep all M
training patterns are presented. At the end of each sweep, the initial
weights $\mathbf{w}^{(1)}$ for the next sweep are set to the final weights
$\mathbf{w}^{(M+1)}$ of the sweep just completed. If
there is no misclassification over one entire sweep,
then no learning occurs in that sweep
and the training process should be terminated.
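The sweep-based procedure can be sketched as follows. This is a minimal illustration rather than a definitive implementation: it assumes teacher and decision values in {0, 1}, an error-driven update of the form w + eta * (t - d) * z, zero initial weights, and a hypothetical `max_sweeps` safety cap that is not part of the description above. The AND function serves as a small linearly separable training set.

```python
def decide(w, x):
    """Decision value: 1 if the augmented inner product is positive."""
    z = list(x) + [1.0]
    return 1 if sum(wk * zk for wk, zk in zip(w, z)) > 0 else 0


def train_perceptron(patterns, targets, eta=1.0, max_sweeps=100):
    """Repeat sweeps over all patterns until one sweep has no error.

    Returns (weights, sweeps_used, converged). max_sweeps is an
    assumed cap so the loop ends on non-separable data.
    """
    w = [0.0] * (len(patterns[0]) + 1)  # last entry is the threshold
    for sweep in range(1, max_sweeps + 1):
        errors = 0
        for x, t in zip(patterns, targets):
            d = decide(w, x)
            if d != t:  # adjust only on a misclassification
                z = list(x) + [1.0]
                w = [wk + eta * (t - d) * zk for wk, zk in zip(w, z)]
                errors += 1
        if errors == 0:  # a clean sweep: terminate training
            return w, sweep, True
    return w, max_sweeps, False


# Logical AND as a linearly separable toy problem.
patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]
w, sweeps, converged = train_perceptron(patterns, targets)
predictions = [decide(w, x) for x in patterns]
```

The weights carry over from one sweep to the next simply because `w` is never reset inside the outer loop.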
Constant Learning Rate
The convergence speed of a constant-rate perceptron varies greatly,
depending on the choice of learning rate. If the rate is too small, convergence
will be very slow. On the other hand, if it is too large, it can cause numerical
problems. The convergence speed also depends on how large the region
of feasible solutions is in the w-space.
Linear Perceptron for Multiple Classification
The basic perceptron can be extended to the problem of classifying multiple
(e.g., L) classes. For this purpose, the following important features are
incorporated into the general decision-based neural network (DBNN):
- One subnet is designated for each class, that is, an OCON
  (one-class-in-one-network) structure. See this figure.
- The linear discriminant functions for the subnets are denoted as
  $\phi(\mathbf{x}, \mathbf{w}_i)$, for i = 1, ..., L. The discriminant
  function provides the score for each subnet (or each class).
- A MAXNET is used to select the subnet (or class) with the winning score.
- The output is usually a symbol labeling the winner of the subnets. See
  the next figure.
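The OCON-plus-MAXNET arrangement can be illustrated with a short sketch. It assumes one augmented weight vector per class and treats the MAXNET simply as an argmax over the subnet scores; the names `subnet_scores` and `maxnet`, the weight values, and the labels are hypothetical.

```python
def subnet_scores(weights, x):
    """Score of each subnet: linear discriminant on the augmented pattern."""
    z = list(x) + [1.0]
    return [sum(wk * zk for wk, zk in zip(w, z)) for w in weights]


def maxnet(scores):
    """Index of the winning score (the first index wins a tie)."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical weights for L = 3 subnets; last entry is the threshold.
weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [-1.0, -1.0, 1.0],
]
labels = ["A", "B", "C"]  # output symbols, one per subnet

scores = subnet_scores(weights, [2.0, 0.5])
winner = labels[maxnet(scores)]
```

Keeping the weights as one vector per class mirrors the OCON idea: each subnet can later be updated without touching the others.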
The following mutual training scheme can be used. The output
symbol is compared with the teacher symbol. If the two symbols match,
then the network is left alone until a future training pattern is
presented. If they do not match, then the weights are updated by the
reinforced and antireinforced learning rules.
Algorithm
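Suppose that $S = \{\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(M)}\}$ is a set of given
training patterns, with each element belonging to one of the L classes
$\{\Omega_i,\ i = 1, \ldots, L\}$;
and that the discriminant functions are $\phi(\mathbf{x}, \mathbf{w}_i)$
for i = 1, ..., L. Suppose that the
mth pattern $\mathbf{x}^{(m)}$
presented is known to belong to class $\Omega_i$;
and that the winning class for the pattern is denoted by an integer j,
that is, for all $l \neq j$,

$$\phi(\mathbf{x}^{(m)}, \mathbf{w}_j^{(m)}) > \phi(\mathbf{x}^{(m)}, \mathbf{w}_l^{(m)})$$

- When j = i, the pattern $\mathbf{x}^{(m)}$ is
  already correctly classified, so no update is needed.
- When $j \neq i$, that is, $\mathbf{x}^{(m)}$
  is still misclassified, the following update is performed:
  - Reinforced Learning:
    $$\mathbf{w}_i^{(m+1)} = \mathbf{w}_i^{(m)} + \eta \, \mathbf{z}^{(m)}$$
  - Antireinforced Learning:
    $$\mathbf{w}_j^{(m+1)} = \mathbf{w}_j^{(m)} - \eta \, \mathbf{z}^{(m)}$$

Here $\mathbf{z}^{(m)}$ is the augmented pattern of $\mathbf{x}^{(m)}$ and $\eta$ is a
positive learning rate. The other weights remain unchanged:
$\mathbf{w}_l^{(m+1)} = \mathbf{w}_l^{(m)}$ for all $l \neq i$
and $l \neq j$.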
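The multiclass scheme can be sketched as follows, under several assumptions: linear subnets, so the reinforced and antireinforced steps add and subtract the augmented pattern scaled by the learning rate; zero initial weights; ties resolved in favor of the lowest class index; and a hypothetical `max_sweeps` cap. The toy data set and all names are illustrative only.

```python
def classify(W, x):
    """Winning class index: argmax of the linear subnet scores."""
    z = list(x) + [1.0]
    scores = [sum(wk * zk for wk, zk in zip(w, z)) for w in W]
    return max(range(len(W)), key=lambda k: scores[k])


def train_multiclass(patterns, classes, num_classes, eta=1.0, max_sweeps=100):
    """Sweep over the patterns, updating only on a mismatch.

    Returns (W, sweeps_used, converged); max_sweeps is an assumed cap.
    """
    W = [[0.0] * (len(patterns[0]) + 1) for _ in range(num_classes)]
    for sweep in range(1, max_sweeps + 1):
        updates = 0
        for x, i in zip(patterns, classes):
            j = classify(W, x)
            if j != i:
                z = list(x) + [1.0]
                # Reinforced learning on the correct class i.
                W[i] = [wk + eta * zk for wk, zk in zip(W[i], z)]
                # Antireinforced learning on the wrong winner j.
                W[j] = [wk - eta * zk for wk, zk in zip(W[j], z)]
                updates += 1  # all other weight vectors are untouched
        if updates == 0:
            return W, sweep, True
    return W, max_sweeps, False


# Three well-separated toy clusters, two patterns each.
patterns = [(2, 0), (3, 0), (0, 2), (0, 3), (-2, -2), (-3, -2)]
classes = [0, 0, 1, 1, 2, 2]
W, sweeps, converged = train_multiclass(patterns, classes, num_classes=3)
predictions = [classify(W, x) for x in patterns]
```

Only two weight vectors change per misclassified pattern, which is the practical benefit of the one-subnet-per-class structure.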