
leCunNormal initializer #8594

@vmukhachev

Description

tf.moments(tf.initializers.leCunNormal().apply([1, 1, 1, 1000000], 'float32')).variance.print()
prints something like:
0.7735087871551514

while it should be close to 1.0
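
For reference, a self-contained version of the repro (a sketch assuming the @tensorflow/tfjs union package; only the initializer call itself comes from the snippet above):

```js
import * as tf from '@tensorflow/tfjs';

// Shape [1, 1, 1, 1000000] gives fanIn = 1, so leCunNormal should produce
// (approximately) unit variance. Because the samples are drawn from a normal
// truncated at +/-2 stddev without a compensating rescale, the empirical
// variance comes out around 0.77 instead.
const sample = tf.initializers.leCunNormal().apply([1, 1, 1, 1000000], 'float32');
tf.moments(sample).variance.print(); // ~0.77 instead of ~1.0
```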

The stddev passed to the truncated normal distribution should be scaled by 1.0 / 0.87962566103423978, as is done in https://github.com/keras-team/tf-keras/blob/v2.19.0/tf_keras/initializers/initializers.py#L662 .
This is because a truncated normal distribution has a different variance-to-stddev relationship than a regular normal distribution.
The downscaling constant (0.87962566103423978 in this case) can be computed as:
sqrt(1 - 2 * x / sqrt(2 * pi) * exp(-0.5 * x**2) / erf(x / sqrt(2))), where x is the truncation bound in units of stddev (x = 2 in the code)
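
As a sanity check (this is not the actual tfjs code), here is a sketch that computes the constant for x = 2 using the Abramowitz & Stegun erf approximation and shows that dividing the target stddev by it restores unit variance; the helper names are made up for illustration:

```js
import * as tf from '@tensorflow/tfjs';

// Abramowitz & Stegun 7.1.26 polynomial approximation of erf, accurate to
// roughly 1.5e-7 -- enough to confirm the constant 0.87962566...
function erf(z) {
  const t = 1 / (1 + 0.3275911 * Math.abs(z));
  const poly = t * (0.254829592 + t * (-0.284496736 +
               t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
  const result = 1 - poly * Math.exp(-z * z);
  return z >= 0 ? result : -result;
}

// Ratio of the truncated normal's stddev to the stddev of the underlying
// normal, for truncation at +/- x standard deviations.
function truncatedStdScale(x) {
  return Math.sqrt(
      1 - (2 * x / Math.sqrt(2 * Math.PI)) * Math.exp(-0.5 * x * x) /
          erf(x / Math.SQRT2));
}

const c = truncatedStdScale(2);  // ~0.8796257

// Drawing with stddev / c compensates for the truncation, so the measured
// variance comes out close to the target of 1.0 (fanIn = 1 for this shape).
const targetStd = 1;
const corrected = tf.truncatedNormal([1, 1, 1, 1000000], 0, targetStd / c, 'float32');
tf.moments(corrected).variance.print();  // ~1.0
```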

Unfortunately, the documentation also ignores this fact, so it is not clear what should be changed.

Labels: type:bug (Something isn't working)
