I have trained an XOR neural network in MATLAB and got these weights:
iw: [-2.162 2.1706; 2.1565 -2.1688]
lw: [-3.9174 -3.9183]
b{1} [2.001; 2.0033]
b{2} [3.8093]
Out of curiosity, I tried to write MATLAB code that computes the output of this network (two neurons in the hidden layer and one in the output layer, with the TANSIG activation function).
The code I came up with:
l1w = [-2.162 2.1706; 2.1565 -2.1688];
l2w = [-3.9174 -3.9183];
b1w = [2.001 2.0033];
b2w = [3.8093];
input = [1, 0];
out1 = tansig (input(1)*l1w(1,1) + input(2)*l1w(1,2) + b1w(1));
out2 = tansig (input(1)*l1w(2,1) + input(2)*l1w(2,2) + b1w(2));
out3 = tansig (out1*l2w(1) + out2*l2w(2) + b2w(1))
The problem is that when the input is, say, [1,1], it outputs -0.9989, and for [0,1] it outputs 0.4902, whereas simulating the network generated by MATLAB gives 0.00055875 and 0.99943, respectively.
What am I doing wrong?
I wrote a simple example of an XOR network. I used newpr, which defaults to the tansig transfer function for both hidden and output layers.
input = [0 0 1 1; 0 1 0 1]; %# each column is an input vector
ouputActual = [0 1 1 0];
net = newpr(input, ouputActual, 2); %# 1 hidden layer with 2 neurons
net.divideFcn = ''; %# use the entire input for training
net = init(net); %# initialize net
net = train(net, input, ouputActual); %# train
outputPredicted = sim(net, input); %# predict
Then we check the result by computing the output ourselves. The important thing to remember is that, by default, inputs and outputs are scaled to the [-1,1] range:
scaledIn = (2*input - 1); %# from [0,1] to [-1,1]
for i=1:size(input,2)
in = scaledIn(:,i); %# i-th input vector
hidden(1) = tansig( net.IW{1}(1,1)*in(1) + net.IW{1}(1,2)*in(2) + net.b{1}(1) );
hidden(2) = tansig( net.IW{1}(2,1)*in(1) + net.IW{1}(2,2)*in(2) + net.b{1}(2) );
out(i) = tansig( hidden(1)*net.LW{2,1}(1) + hidden(2)*net.LW{2,1}(2) + net.b{2} );
end
scaledOut = (out+1)/2; %# from [-1,1] to [0,1]
Or, more efficiently, expressed as a matrix product in one line:
scaledIn = (2*input - 1); %# from [0,1] to [-1,1]
out = tansig( net.LW{2,1} * tansig( net.IW{1}*scaledIn + repmat(net.b{1},1,size(input,2)) ) + repmat(net.b{2},1,size(input,2)) );
scaledOut = (1 + out)/2; %# from [-1,1] to [0,1]
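As a rough check, the same scaling can be applied to the weights posted in the question. This is only a sketch under two assumptions: that the question's network used the default [-1,1] scaling on both inputs and outputs, and that tanh can stand in for tansig (the two are mathematically equivalent, so the snippet does not need the toolbox). The variable names X, hidden and n are just illustrative.

```matlab
% Weights copied from the question
IW = [-2.162 2.1706; 2.1565 -2.1688];   % hidden-layer weights (2x2)
LW = [-3.9174 -3.9183];                 % output-layer weights (1x2)
b1 = [2.001; 2.0033];                   % hidden-layer biases (column vector)
b2 = 3.8093;                            % output-layer bias

X = [0 0 1 1; 0 1 0 1];                 % each column is an input vector
n = size(X, 2);                         % number of input vectors

scaledIn = 2*X - 1;                     % from [0,1] to [-1,1] (assumed default scaling)
hidden = tanh(IW*scaledIn + repmat(b1, 1, n));   % tanh used in place of tansig
out = tanh(LW*hidden + repmat(b2, 1, n));
scaledOut = (out + 1)/2                 % from [-1,1] back to [0,1]
```

With these weights, the columns for [0;1] and [1;1] should come out very close to the 0.99943 and 0.00055875 reported in the question, which suggests the missing scaling step is the only difference.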
You usually don't use a sigmoid on your output layer--are you sure you should have the tansig on out3? And are you sure you are looking at the weights of the appropriately trained network? It looks like you've got a network trained to do XOR on [1,1] [1,-1] [-1,1] and [-1,-1], with +1 meaning "xor" and -1 meaning "same".
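The guess about the +1/-1 encoding can be tested directly by feeding the question's weights the four corner inputs with no extra scaling. This is a minimal sketch, again using tanh as a stand-in for tansig. The names P and raw are made up for illustration.

```matlab
% Weights copied from the question
IW = [-2.162 2.1706; 2.1565 -2.1688];
LW = [-3.9174 -3.9183];
b1 = [2.001; 2.0033];
b2 = 3.8093;

% Columns are the inputs [-1;-1], [-1;1], [1;-1], [1;1]
P = [-1 -1  1  1;
     -1  1 -1  1];
n = size(P, 2);

% Raw network output, no input or output scaling applied
raw = tanh(LW*tanh(IW*P + repmat(b1, 1, n)) + repmat(b2, 1, n))
```

The raw outputs should be close to -1 when the two inputs match and close to +1 when they differ, which is consistent with the -0.9989 the question reports for [1,1].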