
MATLAB neural nets: trainbfg problems when using a custom performance function

https://www.devze.com 2023-03-21 09:53 Source: Web
I have written my own custom performance function: a cross-entropy function with some modifications, called an augmented cross-entropy function.

My performance function itself is a sum of two functions: a cross-entropy function F and a penalty function P, with the formula given below:

[Missing image: the formula for the augmented performance function, performance = F + P.]

where B and the vectors e1 and e2 are just constants, and w is a weight matrix (i indexes hidden-layer neurons, j indexes input-layer neurons).

I've implemented the dy and dx derivatives, though I am not sure about the dx derivative (where x is the result of the getx function; it holds all weight and bias information). I assumed that the dx derivative of my performance function with respect to a weight wij is equal to the derivative of the penalty function:

[Missing image: the formula for the dx derivative of the penalty function P.]
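Since the formula images are lost, here is a hedged sketch of one penalty P whose derivative matches the expression given in the edit below (2*E1*b*W/(1+b*W^2)^2 + 2*E2*W): a weight-elimination-style term plus a quadratic term. The constants E1, E2, and b are placeholders; the real values come from the question's missing formula. A finite-difference check like this is also a good way to validate a custom performance derivative before handing it to trainbfg:

```python
import numpy as np

# Hypothetical constants; the real B, e1, e2 are in the question's missing formula.
E1, E2, b = 0.1, 0.01, 1.0

def penalty(W):
    """Assumed penalty: weight-elimination term plus quadratic term.
    Chosen so its derivative matches the 'res' expression from the edit."""
    return np.sum(E1 * b * W**2 / (1.0 + b * W**2) + E2 * W**2)

def penalty_grad(W):
    """Analytic gradient: 2*E1*b*W/(1+b*W^2)^2 + 2*E2*W, per the edit."""
    return 2.0 * E1 * b * W / (1.0 + b * W**2)**2 + 2.0 * E2 * W

# Central finite-difference check of one component against the analytic gradient.
W = np.array([0.5, -1.2, 2.0])
eps = 1e-6
i = 1
Wp, Wm = W.copy(), W.copy()
Wp[i] += eps
Wm[i] -= eps
fd = (penalty(Wp) - penalty(Wm)) / (2.0 * eps)
assert abs(fd - penalty_grad(W)[i]) < 1e-6  # analytic and numeric derivatives agree
```

If the analytic and finite-difference values disagree for your actual P, that mismatch alone is enough to break the line search.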

Then I started training my neural network with the trainbfg function and found that it does not learn anything; the message was "Line search did not find new minimum". From the trainbfg description:

Each variable is adjusted according to the following: X = X + a*dX; where dX is the search direction. The parameter a is selected to minimize the performance along the search direction.

It turned out that the parameter a is always calculated as 0 by the default search function, srchbac (backtracking line search). I assume this has something to do with my performance function being incorrectly implemented, because when I set mse as the performance function, a is calculated properly.
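This behavior is typical of backtracking line searches when the supplied gradient is inconsistent with the performance value: if the search direction derived from the gradient actually points uphill, no step length satisfies the sufficient-decrease condition and the step collapses toward 0. The sketch below is a minimal Armijo backtracking loop (not MATLAB's actual srchbac implementation, but the same idea), showing a correct gradient yielding a positive step and a wrong-sign gradient yielding a = 0:

```python
import numpy as np

def backtracking(f, x, dx, slope, a0=1.0, rho=0.5, c=1e-4, max_iter=50):
    """Minimal Armijo backtracking sketch. slope = g'*dx must be negative
    for dx to be a descent direction; otherwise no step is ever accepted."""
    f0, a = f(x), a0
    for _ in range(max_iter):
        if f(x + a * dx) <= f0 + c * a * slope:   # sufficient decrease
            return a
        a *= rho                                   # shrink the step
    return 0.0  # no acceptable step -> "Line search did not find new minimum"

f = lambda x: np.sum(x**2)      # toy quadratic performance
x = np.array([1.0, 2.0])
g = 2.0 * x                     # correct gradient of f

dx = -g                         # proper descent direction
a_good = backtracking(f, x, dx, slope=g @ dx)        # positive step accepted

dx_bad = +g                     # wrong-sign "gradient" -> ascent direction
a_bad = backtracking(f, x, dx_bad, slope=g @ dx_bad)  # collapses to 0.0
```

So the first thing to check is whether your dx output really is the gradient of your performance output with respect to x, including sign and scaling.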

What could be the reason srchbac fails to locate a new minimum? I just want to know where I should look, as after two days of searching I have found nothing.

Edit:

The x vector consists of the input-hidden connections' weight values first, followed by the remaining biases and weights. I calculate the dx derivative for the weights with the following formula:

res = 2 .* E1 .* b .* W ./ (1 + b .* W.^2).^2 + 2 .* E2 .* W;

and I set the rest of the values to 0 (so that res has the same length as the x vector).
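For concreteness, the assembly described above can be sketched as follows. The sizes here are hypothetical; in MATLAB the actual layout and length come from getx(net). Note that zeroing everything except the penalty term means the cross-entropy part F contributes nothing to dx, which is worth double-checking against how dy and dx are combined:

```python
import numpy as np

# Hypothetical sizes; the real layout comes from MATLAB's getx(net).
n_in_hidden = 6   # input-to-hidden weights, first in the x vector per the edit
n_rest = 4        # remaining biases and layer weights
E1, E2, b = 0.1, 0.01, 1.0   # placeholder constants

W = np.linspace(-1.0, 1.0, n_in_hidden)   # flattened input-hidden weights

# The derivative formula from the edit, applied elementwise to W.
res = 2.0 * E1 * b * W / (1.0 + b * W**2)**2 + 2.0 * E2 * W

# Pad with zeros for the remaining entries so len(dx) == len(x).
dx = np.concatenate([res, np.zeros(n_rest)])

assert dx.shape[0] == n_in_hidden + n_rest
assert np.all(dx[n_in_hidden:] == 0.0)    # only weight entries are non-zero
```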
