Reinforcement Learning with TensorFlow

Why do we use Xavier initialization?

Two problems motivate the use of Xavier initialization:

  • If the weights in a network start very small, the signal shrinks as it passes through each layer, so by the later layers the activations are too close to zero to be useful

  • If the weights start very large, the signal grows as it passes through each layer, so by the later layers it is too large to be useful (saturating activation functions such as sigmoid or tanh are pushed into their flat regions)

Thus, Xavier initialization scales the initial weights so that the signals stay within a reasonable range through many layers, minimizing the chances of the signals becoming either too small or too large.
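The effect described above can be observed in a small experiment. The following is a minimal NumPy sketch, not code from this book: it pushes random inputs through a stack of tanh layers and reports the spread of the final activations for three weight scales. The helper name final_layer_stats, the layer sizes, and the choice of tanh are assumptions made for illustration, and the Xavier scale is assumed to be the common Glorot normal form, a standard deviation of sqrt(2 / (n_in + n_out)).

```python
import numpy as np

def final_layer_stats(weight_std_fn, n_layers=10, width=256, batch=512, seed=0):
    """Propagate random inputs through n_layers tanh layers.

    weight_std_fn(n_in, n_out) returns the standard deviation used to draw
    each weight matrix. Returns (std of final activations, fraction of final
    activations that are saturated, i.e. |a| > 0.99).
    This is a hypothetical helper written for illustration only.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, width))  # unit-variance inputs
    for _ in range(n_layers):
        w = rng.standard_normal((width, width)) * weight_std_fn(width, width)
        x = np.tanh(x @ w)
    return x.std(), np.mean(np.abs(x) > 0.99)

# Very small weights: the signal shrinks layer by layer.
small_std, small_sat = final_layer_stats(lambda n_in, n_out: 0.01)

# Very large weights: the signal blows up and tanh saturates.
large_std, large_sat = final_layer_stats(lambda n_in, n_out: 1.0)

# Xavier (Glorot normal) scale, assumed here: std = sqrt(2 / (n_in + n_out)).
xavier_std, xavier_sat = final_layer_stats(
    lambda n_in, n_out: np.sqrt(2.0 / (n_in + n_out))
)

print(f"small : std={small_std:.2e}, saturated={small_sat:.2%}")
print(f"large : std={large_std:.2e}, saturated={large_sat:.2%}")
print(f"xavier: std={xavier_std:.2e}, saturated={xavier_sat:.2%}")
```

With these settings, the small-weight run should end with activations very close to zero, the large-weight run with most activations pinned near -1 or +1, and the Xavier run with activations of moderate size and almost no saturation. The exact numbers depend on the seed, depth, and width chosen.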

The derivation of the preceding formula is beyond the scope of this book. For a better understanding, work through the derivation at http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization.