
Magnitude of Weights vs Features for Output Layer of RNN

I have a question regarding the magnitude of the weights and features in the output layer of an RNN. The RNN outputs a hidden-state tensor h of shape (64, M, N), which is then reshaped into a (64, M*N) matrix. In my output layer, this hidden matrix is transformed into an output vector by multiplying it by a (1, 64) weight vector.

Thus we compute a (1, 64) × (64, M*N) product, giving a (1, M*N) vector, which is then reshaped into an (M, N) image and passed through a sigmoid output function.
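To make the shapes concrete, here is a minimal NumPy sketch of the pipeline described above (the values of M and N, and the random inputs, are placeholders I chose for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, hidden = 8, 10, 64  # assumed example dimensions

# RNN hidden state: (64, M, N), reshaped to (64, M*N)
h = rng.standard_normal((hidden, M, N)).reshape(hidden, M * N)

# Output-layer weight vector: (1, 64)
w = rng.standard_normal((1, hidden))

# (1, 64) @ (64, M*N) -> (1, M*N), reshaped into an (M, N) image
out = (w @ h).reshape(M, N)

# Elementwise sigmoid on the output image
img = 1.0 / (1.0 + np.exp(-out))
print(img.shape)  # (8, 10)
```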

My question is about what magnitude the weights and the hidden matrix should have. For example, consider the (1, 64) weight vector being multiplied by the first column of the hidden matrix.

We would get w_1*h_1 + w_2*h_2 + ... + w_64*h_64. I am having trouble initializing the weights for the fully connected output layer and the RNN weights, because I am not sure what the average magnitude of the weights should be.

For example, if w is too large and h is too small, the RNN's hidden features contribute little to the output. If w is too small and h is too large, a good result isn't produced either. If both w and h are large, the pre-activation sums blow up and the model runs into problems as well (e.g. the sigmoid saturates).

Is there some baseline for what the average magnitudes of w and h should be?
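One standard baseline (a common technique, not something stated in the post) is Glorot/Xavier-style scaling: if the hidden features h have roughly unit variance, drawing each w_i with variance 1/fan_in keeps the variance of the sum w_1*h_1 + ... + w_64*h_64 near 1, so the sigmoid input is neither vanishing nor saturating. A sketch under that assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in = 64  # number of hidden units feeding each output

# Assume roughly unit-variance hidden features (e.g. after tanh/normalization)
h = rng.standard_normal((fan_in, 10_000))

# Glorot-style scaling: Var(w_i) = 1/fan_in
# => Var(sum_i w_i * h_i) ≈ sum_i w_i^2 ≈ fan_in * (1/fan_in) = 1
w = rng.standard_normal((1, fan_in)) / np.sqrt(fan_in)

pre_act = w @ h  # pre-sigmoid values, variance close to 1
print(pre_act.var())
```

With pre-activations on the order of ±1, the sigmoid stays in its responsive range; the same 1/sqrt(fan_in) scaling is also the usual starting point for the RNN's own recurrent weights.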
