I have seen people use -1 instead of 0 a few times when preparing input data for neural networks. How is this better, and does it affect any of the mathematics needed to implement it?
Edit: I'm using a feed-forward network with back-propagation.
Edit 2: I gave it a go, but the network stopped learning, so I assume the maths has to change somewhere?
Edit 3: Finally found the answer. The mathematics for binary is different to bipolar. See my answer below.
I recently found that the sigmoid and its derivative need to change if you use bipolar (-1/1) values instead of binary (0/1) values.

Bipolar sigmoid: f(x) = -1 + 2 / (1 + e^-x)

Bipolar sigmoid derivative: f'(x) = 0.5 * (1 + f(x)) * (1 - f(x))
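For reference, here's a minimal sketch of both activation pairs in Python/NumPy (the function names are my own choice, not from any particular library). The derivatives are written in terms of the already-computed activation value, which is the form you use during back-propagation:

    import numpy as np

    # Bipolar sigmoid: output range (-1, 1).
    def bipolar_sigmoid(x):
        return -1.0 + 2.0 / (1.0 + np.exp(-x))

    # Derivative in terms of the activation f = f(x), as used in back-prop.
    def bipolar_sigmoid_deriv(f):
        return 0.5 * (1.0 + f) * (1.0 - f)

    # Standard (binary) sigmoid, output range (0, 1), shown for comparison.
    def binary_sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def binary_sigmoid_deriv(f):
        return f * (1.0 - f)

In a plain feed-forward/back-prop implementation, switching from binary to bipolar should only mean swapping in this pair wherever the sigmoid and its derivative appear; the rest of the weight-update rule keeps the same form.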
It's been a long time, but as I recall, it has no effect on the mathematics needed to implement the network (assuming you're not working with a network type that for some reason limits any part of the process to non-negative values). One of the advantages is that it makes a larger distinction between inputs, and helps amplify the learning signal. Similarly for outputs.
Someone who's done this more recently probably has more to say (like about whether the 0-crossing makes a difference; I think it does). And in reality some of this depends on exactly what type of neural network you're using. I'm assuming you're talking about backprop or a variant thereof.
The network learns more quickly with -1/1 inputs than with 0/1. Also, if you use -1/1 inputs, 0 can then mean "unknown entry/noise/does not matter". I would use -1/1 as the input to my neural network.
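A minimal sketch of that input remapping (the helper name to_bipolar is my own, purely illustrative):

    import numpy as np

    # Map binary features to bipolar ones: 0 -> -1, 1 -> +1.
    # A missing/unknown feature can then be encoded as 0.
    def to_bipolar(x_binary):
        return 2.0 * np.asarray(x_binary, dtype=float) - 1.0

    print(to_bipolar([1, 0, 1, 1]))   # [ 1. -1.  1.  1.]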