I came up with this code:
import math

def DSigmoid(value):
    return math.exp(float(value)) / ((1 + math.exp(float(value))) ** 2)
a.) Will this return the correct derivative?
b.) Is this an efficient method?
Friendly regards,
Daquicker

Looks correct to me. In general, two good ways of checking such a derivative computation are:
1. Wolfram Alpha. Inputting the sigmoid function 1/(1+e^(-t)), we are given an explicit formula for the derivative, which matches yours. To be a little more direct, you can input D[1/(1+e^(-t)), t] to get the derivative without all the additional information.
2. Compare it to a numerical approximation. In your case, I will assume you already have a function Sigmoid(value). Taking
   Dapprox = (Sigmoid(value+epsilon) - Sigmoid(value)) / epsilon
   for some small epsilon and comparing it to the output of your function DSigmoid(value) should catch all but the tiniest errors.
In general, estimating the derivative numerically is the best way to double check that you've actually coded the derivative correctly, even if you're already sure about the formula, and it takes almost no effort.
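The numerical check can be sketched roughly as follows. This is only an illustration: the body of Sigmoid, the step size epsilon=1e-6 and the test points are my assumptions, not something taken from the question.

```python
import math

def Sigmoid(value):
    # Assumed implementation of the sigmoid; the question does not show one.
    return 1.0 / (1.0 + math.exp(-value))

def DSigmoid(value):
    # The formula from the question: e^x / (1 + e^x)^2.
    return math.exp(float(value)) / ((1 + math.exp(float(value))) ** 2)

def Dapprox(value, epsilon=1e-6):
    # Forward-difference estimate of the derivative (epsilon is an assumed choice).
    return (Sigmoid(value + epsilon) - Sigmoid(value)) / epsilon

# Compare the analytic and numerical derivatives at a few sample points.
for value in (-2.0, 0.0, 0.5, 3.0):
    exact = DSigmoid(value)
    approx = Dapprox(value)
    print(value, exact, approx, abs(exact - approx))
```

If the last column is not small compared to the derivative itself, either the formula or the code is probably wrong.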
In case numerical stability is an issue, there is another possibility: provided that you have a good implementation of the sigmoid available (such as in scipy) you can implement it as:
from scipy.special import expit as sigmoid

def sigmoid_grad(x):
    fx = sigmoid(x)
    return fx * (1 - fx)
Note that this is mathematically equivalent to the other expression.
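For reference, the equivalence can be seen in a few lines. Here \sigma denotes the sigmoid, and the last form is the expression used in the question:

```latex
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
\sigma'(x) = \frac{e^{-x}}{(1 + e^{-x})^{2}}
           = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}}
           = \sigma(x)\,\bigl(1 - \sigma(x)\bigr)

% Multiplying numerator and denominator by e^{2x} gives the question's form:
\frac{e^{-x}}{(1 + e^{-x})^{2}} = \frac{e^{x}}{(1 + e^{x})^{2}}
```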
In my case this solution worked, while the direct implementation caused floating point overflows when computing exp(-x).
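To illustrate the overflow without depending on scipy, here is a minimal sketch using only the math module. stable_sigmoid is a hypothetical stand-in for expit that I made up for this example (it is not scipy's actual implementation), and the input 1000.0 is just an arbitrarily large value.

```python
import math

def DSigmoid(value):
    # Direct formula from the question; math.exp overflows for large inputs.
    return math.exp(float(value)) / ((1 + math.exp(float(value))) ** 2)

def stable_sigmoid(x):
    # Hypothetical helper: branch on the sign so that exp() only ever
    # receives a non-positive argument and therefore cannot overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_grad(x):
    # Same idea as above: the derivative expressed through the sigmoid itself.
    fx = stable_sigmoid(x)
    return fx * (1.0 - fx)

try:
    DSigmoid(1000.0)
except OverflowError as err:
    print("direct formula failed:", err)

# The sigmoid-based version saturates instead of raising an error.
print(sigmoid_grad(1000.0))
```

For moderate inputs both functions should agree up to rounding; the difference only shows up in the tails.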