Teaching a Neural Net: Bipolar XOR


I'm trying to teach a neural net with 2 inputs, 4 hidden nodes (all in the same layer), and 1 output node. The binary representation works fine, but I have problems with the bipolar one. I can't figure out why, but the total error sometimes converges to the same number, around 2.xx. My sigmoid is 2/(1 + exp(-x)) - 1. Perhaps I'm sigmoiding in the wrong place. For example, to calculate the output error, should I be comparing the sigmoided output with the expected value, or with the sigmoided expected value?
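One way to sanity-check the derivative is to compare fPrime against a numeric derivative; here is a minimal standalone sketch (the key point is that fPrime takes the activation f(x), not the raw weighted sum x):

public class SigmoidCheck {
    // Bipolar sigmoid: f(x) = 2/(1 + e^-x) - 1
    static double f(double x) {
        return 2.0 / (1.0 + Math.exp(-x)) - 1.0;
    }

    // Its derivative, written in terms of the activation u = f(x):
    // f'(x) = 0.5 * (1 - f(x)^2)
    static double fPrime(double u) {
        return 0.5 * (1.0 - u * u);
    }

    public static void main(String[] args) {
        double h = 1e-6;
        for (double x = -2.0; x <= 2.0; x += 0.5) {
            double numeric = (f(x + h) - f(x - h)) / (2.0 * h); // central difference
            double analytic = fPrime(f(x));                     // pass the activation, not x
            System.out.printf("x=%5.2f numeric=%.6f analytic=%.6f%n", x, numeric, analytic);
        }
    }
}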

I was following the tutorial here: http://galaxy.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html , but it uses different functions than the ones I was instructed to use. Even when I tried to implement its functions, I still ran into the same problem. Either way, I get stuck about half the time at the same number (a different number for different implementations). Please tell me if I have made a mistake in my code somewhere, or if this is normal (I don't see how it could be). Momentum is set to 0. Is this a common zero-momentum problem? The error functions we are supposed to be using are:

if ui is an output unit:

Error(i) = (Ci - ui) * f'(Si)

if ui is a hidden unit:

Error(i) = Error(output) * weight(i to output) * f'(Si)
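In code, with fPrime written in terms of the unit's activation (as in the code below), those two rules come out as something like this sketch:

// Sketch of the two rules above. fPrime here is the derivative written in
// terms of the already-sigmoided activation, so that is what gets passed in.

// Output unit: Error(i) = (Ci - ui) * f'(Si)
double outputError(double target, double output) {
    return (target - output) * fPrime(output);
}

// Hidden unit: Error(i) = Error(output) * weight(i to output) * f'(Si)
double hiddenError(double outputError, double weightToOutput, double activation) {
    return outputError * weightToOutput * fPrime(activation);
}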

// Bipolar sigmoid 2/(1 + e^-x) - 1, or binary sigmoid 1/(1 + e^-x),
// selected by the bipolar flag.
public double sigmoid( double x ) {
    double fBipolar, fBinary, temp;
    temp = (1 + Math.exp(-x));
    fBipolar = (2 / temp) - 1;
    fBinary = 1 / temp;
    if(bipolar){
        return fBipolar;
    }else{
        return fBinary;
    }
}

// Initialize the weights to random values.
private void initializeWeights(double neg, double pos) { 
    for(int i = 0; i < numInputs + 1; i++){
        for(int j = 0; j < numHiddenNeurons; j++){
            inputWeights[i][j] = Math.random() - pos;
            if(inputWeights[i][j] < neg || inputWeights[i][j] > pos){
                print("ERROR ");
                print(inputWeights[i][j]);
            }
        }
    }
    for(int i = 0; i < numHiddenNeurons + 1; i++){
        hiddenWeights[i] = Math.random() - pos;
        if(hiddenWeights[i] < neg || hiddenWeights[i] > pos){
            print("ERROR ");
            print(hiddenWeights[i]);
        }
    }
}

// Computes the output of the NN without training, i.e. a forward pass.
public double outputFor ( double[] argInputVector ) { 
    for(int i = 0; i < numInputs; i++){
        inputs[i] = argInputVector[i];
    }
    double weightedSum = 0;
    for(int i = 0; i < numHiddenNeurons; i++){
        weightedSum = 0;
        for(int j = 0; j < numInputs + 1; j++){
            weightedSum += inputWeights[j][i] * inputs[j];
        }
        hiddenActivation[i] = sigmoid(weightedSum); 
    }

    weightedSum = 0;
    for(int j = 0; j < numHiddenNeurons + 1; j++){
        weightedSum += (hiddenActivation[j] * hiddenWeights[j]);
    }

    return sigmoid(weightedSum);
}

// Computes the derivative of f, written in terms of the activation u = f(x)
public static double fPrime(double u){
    double fBipolar, fBinary;
    fBipolar = 0.5 * (1 - Math.pow(u,2));
    fBinary = u * (1 - u);
    if(bipolar){
        return fBipolar;
    }else{
        return fBinary;
    }
}

// This method is used to update the weights of the neural net.
public double train ( double [] argInputVector, double argTargetOutput ){
    double output = outputFor(argInputVector);
    double lastDelta;

    double outputError = (argTargetOutput - output) * fPrime(output);

    if(outputError != 0){
        for(int i = 0; i < numHiddenNeurons + 1; i++){
            hiddenError[i] = hiddenWeights[i] * outputError * fPrime(hiddenActivation[i]);
            deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i] + (momentum * lastDelta);
            hiddenWeights[i] += deltaHiddenWeights[i];
        }

        for(int in = 0; in < numInputs + 1; in++){
            for(int hid = 0; hid < numHiddenNeurons; hid++){
                lastDelta = deltaInputWeights[in][hid];
                deltaInputWeights[in][hid] = learningRate * hiddenError[hid] * inputs[in] + (momentum * lastDelta); 
                inputWeights[in][hid] += deltaInputWeights[in][hid];
            }
        }
    }

    return 0.5 * (argTargetOutput - output) * (argTargetOutput - output);
}


General coding comments:

initializeWeights(-1.0, 1.0);

may not actually get the initial values you were expecting.

initializeWeights should probably have:

inputWeights[i][j] = Math.random() * (pos - neg) + neg;
// ...
hiddenWeights[i] = Math.random() * (pos - neg) + neg;

instead of:

Math.random() - pos;

so that this works:

initializeWeights(0.0, 1.0);

and gives you initial values between 0.0 and 1.0 rather than between -1.0 and 0.0.
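Putting the fix together, the whole method could look like this (same fields as in the original):

// Initialize all weights to uniform random values in [neg, pos).
private void initializeWeights(double neg, double pos) {
    for (int i = 0; i < numInputs + 1; i++) {
        for (int j = 0; j < numHiddenNeurons; j++) {
            inputWeights[i][j] = Math.random() * (pos - neg) + neg;
        }
    }
    for (int i = 0; i < numHiddenNeurons + 1; i++) {
        hiddenWeights[i] = Math.random() * (pos - neg) + neg;
    }
}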

lastDelta is used before it is assigned a value (the hidden-weight loop never sets it, so the compiler should reject it as possibly uninitialized):

deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i] + (momentum * lastDelta);
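One way to fix it is to read the previous delta out of deltaHiddenWeights before overwriting it, the same way the input-weight loop already does:

for (int i = 0; i < numHiddenNeurons + 1; i++) {
    hiddenError[i] = hiddenWeights[i] * outputError * fPrime(hiddenActivation[i]);
    double lastDelta = deltaHiddenWeights[i];   // previous update, for the momentum term
    deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i]
            + (momentum * lastDelta);
    hiddenWeights[i] += deltaHiddenWeights[i];
}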

I'm not sure whether the + 1 in numInputs + 1 and numHiddenNeurons + 1 is necessary (it looks like the extra slot is meant to hold a bias weight).

Remember to watch out for integer division: 5/2 = 2, not 2.5! Use 5.0/2.0 instead. In general, add the .0 in your code when the result should be a double.
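A quick illustration:

public class DivisionDemo {
    public static void main(String[] args) {
        System.out.println(5 / 2);     // 2: both operands are ints, so the result truncates
        System.out.println(5.0 / 2.0); // 2.5: floating-point division
        System.out.println(5 / 2.0);   // 2.5: the int operand is promoted to double
    }
}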

Most importantly, have you trained the NeuralNet long enough?

Try running it with numInputs = 2, numHiddenNeurons = 4, and learningRate = 0.9, and train for 1,000 or 10,000 iterations.
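For example, a training loop might look like the sketch below (net stands for an instance of the question's class, and the -1/+1 values assume the bipolar representation):

// Bipolar XOR: inputs and targets use -1/+1 instead of 0/1.
double[][] patterns = { {-1, -1}, {-1, 1}, {1, -1}, {1, 1} };
double[] targets = { -1, 1, 1, -1 };

for (int epoch = 0; epoch < 10000; epoch++) {
    double totalError = 0;
    for (int p = 0; p < patterns.length; p++) {
        totalError += net.train(patterns[p], targets[p]); // train() returns 0.5 * err^2
    }
    if (epoch % 1000 == 0) {
        System.out.println("epoch " + epoch + ": total error = " + totalError);
    }
}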

With numHiddenNeurons = 2 it sometimes gets "stuck" when trying to solve the XOR problem.

See also XOR problem - simulation
