I have implemented a multilayer perceptron to predict the sine of input vectors. Each vector consists of four values chosen at random from {-1, 0, 1} plus a bias set to 1. The network should predict the sine of the sum of the vector's contents.
e.g. Input = <0,1,-1,0,1>, Output = sin(0+1+(-1)+0+1)
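For concreteness, the training pairs are built roughly like this (a sketch; the helper names are just for illustration):

#include <cmath>
#include <cstdlib>
#include <vector>

// Four random components from {-1, 0, 1} plus a bias input fixed at 1.
std::vector<double> make_input()
{
    std::vector<double> v;
    for (int i = 0; i < 4; ++i)
        v.push_back(static_cast<double>(std::rand() % 3 - 1));  // -1, 0 or 1
    v.push_back(1.0);                                           // bias
    return v;
}

// Target is the sine of the sum of all five components (bias included).
double target(const std::vector<double>& v)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        sum += v[i];
    return std::sin(sum);   // frequently negative
}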
The problem I am having is that the network never predicts a negative value, even though many of the vectors' sine values are negative. It predicts all positive or zero outputs perfectly. I am presuming that there is a problem with updating the weights, which are updated after every epoch. Has anyone encountered this problem with NNs before? Any help at all would be great!
Note: the network has 5 inputs, 6 hidden units in 1 hidden layer, and 1 output. I am using a sigmoid function on the activations of the hidden and output layers, and have tried tonnes of learning rates (currently 0.1).
It's been a long time since I looked into multilayer perceptrons, so take this with a grain of salt.
I'd rescale your problem domain to [0,1] instead of [-1,1]. If you take a look at the graph of the logistic function, it generates values in [0,1], so I do not expect it to produce negative results. I might be wrong, though.
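If you go that route, the rescaling is just a linear map applied to the targets before training and inverted on the network's output afterwards; something like this (a sketch):

// Map a sine target from [-1, 1] into [0, 1] so a plain logistic output can reach it.
double scale_target(double sine_value)
{
    return (sine_value + 1.0) / 2.0;
}

// Map the network's [0, 1] output back to [-1, 1] when reading off a prediction.
double unscale_output(double net_output)
{
    return 2.0 * net_output - 1.0;
}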
EDIT:
You can actually extend the logistic function to your problem domain. Use the generalized logistic curve, setting its A and K parameters to the boundaries of your domain.
Another option is the hyperbolic tangent, which outputs values in [-1,+1] and has no constants to set up.
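A sketch of both options; for the generalized logistic, A is the lower asymptote and K the upper one, so A = -1, K = +1 matches the range of sine:

#include <cmath>

// Generalized logistic curve bounded by A (lower asymptote) and K (upper asymptote).
double generalized_logistic(double x, double A, double K)
{
    return A + (K - A) / (1.0 + std::exp(-x));
}

// Hyperbolic tangent: already spans (-1, +1) with nothing to configure.
double tanh_activation(double x)
{
    return std::tanh(x);
}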
There are many different kinds of activation functions, many of which are designed to output a value from 0 to 1. If you're using a function that only outputs between 0 and 1, try adjusting it so that it outputs between -1 and 1. If you were using FANN, I would tell you to use the FANN_SIGMOID_SYMMETRIC activation function.
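If you do end up with FANN, switching to the symmetric sigmoid is only a couple of calls; a rough sketch from memory (check the FANN docs for the exact signatures):

#include <fann.h>

int main(void)
{
    // 5 inputs, 6 hidden units, 1 output, matching the network in the question.
    struct fann *ann = fann_create_standard(3, 5, 6, 1);

    // The symmetric sigmoid outputs in (-1, 1), so negative sine values are reachable.
    fann_set_activation_function_hidden(ann, FANN_SIGMOID_SYMMETRIC);
    fann_set_activation_function_output(ann, FANN_SIGMOID_SYMMETRIC);

    /* ... train with fann_train_on_data() / evaluate with fann_run() as usual ... */

    fann_destroy(ann);
    return 0;
}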
Although the question has already been answered, allow me to share my experience. I have been trying to approximate the sine function using a 1-4-1 neural network (1 input, 4 hidden units, 1 output).
I used the sigmoid activation and its derivative, defined as:
#include <cmath>   // for exp()

double sigmoid(double x)
{
    return 1.0f / (1.0f + exp(-x));
}

// Note: x here is the unit's activation (the sigmoid output), not the raw weighted sum.
double Sigmoid_derivative(double x)
{
    return x * (1.0f - x);
}
And this is what I got after 10,000 epochs of training the network on 20 training examples.
As you can see, the network didn't capture the negative part of the curve. So I changed the activation function to tanh.
// Careful: this redefines tanh() from <cmath>; rename it (e.g. my_tanh) if the compiler or linker complains.
double tanh(double x)
{
    return (exp(x) - exp(-x)) / (exp(x) + exp(-x));
}

// As with the sigmoid, x here is the unit's activation (the tanh output), not the raw weighted sum.
double tanh_derivative(double x)
{
    return 1.0f - x * x;
}
And surprisingly, after half as many epochs (i.e., 5,000), I got a far better curve.
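For completeness, this is roughly how those derivatives plug into the output-layer update during backpropagation; note that both derivative functions above expect the unit's activation (its output), not the raw weighted sum. The variable names here are just illustrative:

// Output-layer weight update, reusing tanh_derivative() from above.
void update_output_weights(double out, double target,
                           const double hidden[], double weight_out[],
                           int num_hidden, double eta)
{
    // delta = (target - output) * f'(output), since the derivative is written
    // in terms of the activation rather than the net input.
    double delta = (target - out) * tanh_derivative(out);
    for (int j = 0; j < num_hidden; ++j)
        weight_out[j] += eta * delta * hidden[j];
}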