Parameter Initialization Methods
In NeuPy, the initialization method for each layer can be modified using classes from the init module.
from neupy.layers import *
from neupy import init

network = join(
    Input(10),
    Sigmoid(30, weight=init.Normal()),
    Sigmoid(15, weight=init.Normal()),
)
An initializer class generates parameters for a specified shape. For instance, the first sigmoid layer expects 10 input features and produces 30 output features, which means that its weight should have shape (10, 30). We don’t need to specify the shape of the parameter when we create the initializer. The layer provides this information to the initializer during the weight initialization procedure.
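The idea that the shape arrives only at sampling time can be illustrated with a dependency-free sketch. The names NormalSketch and init_dense_weight are hypothetical and exist only for this illustration; this is not NeuPy's implementation, and the default standard deviation used here is an arbitrary assumption.

```python
import random


class NormalSketch:
    """Hypothetical stand-in for an initializer (not NeuPy's own class)."""

    def __init__(self, mean=0.0, std=0.01, seed=None):
        # Only distribution parameters are stored; no shape is known yet.
        self.mean = mean
        self.std = std
        self.rng = random.Random(seed)

    def sample(self, shape):
        # The shape is supplied by the caller at sampling time.
        n_rows, n_cols = shape
        return [
            [self.rng.gauss(self.mean, self.std) for _ in range(n_cols)]
            for _ in range(n_rows)
        ]


def init_dense_weight(n_inputs, n_outputs, initializer):
    """Hypothetical helper: derive the weight shape from the layer sizes."""
    return initializer.sample((n_inputs, n_outputs))


initializer = NormalSketch(seed=0)
first = init_dense_weight(10, 30, initializer)   # 10 inputs, 30 outputs
second = init_dense_weight(30, 15, initializer)  # 30 inputs, 15 outputs
print(len(first), len(first[0]))
print(len(second), len(second[0]))
```

The same initializer object serves both layers, because nothing about it is tied to a particular shape.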
It’s possible to use any value for the weight as long as it has a valid shape. We can perform the same initialization with manually generated weights.
import numpy as np
from neupy.layers import *

network = join(
    Input(10),
    # Weight shapes have to match (n_inputs, n_outputs) of each layer
    Sigmoid(30, weight=np.random.randn(10, 30)),
    Sigmoid(15, weight=np.random.randn(30, 15)),
)
The code above performs the same type of initialization as the previous example. The only drawback is that we need to hard-code the expected shapes of the weights.
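One way to reduce the hard-coding is to derive the weight shapes from a single list of layer sizes. The sketch below is plain Python and is not part of NeuPy; the helper name dense_weight_shapes is made up for this example, and it assumes fully connected layers where each weight has shape (n_inputs, n_outputs).

```python
def dense_weight_shapes(layer_sizes):
    """Return (n_inputs, n_outputs) for each consecutive pair of layer sizes."""
    return list(zip(layer_sizes[:-1], layer_sizes[1:]))


# Input with 10 features, followed by layers with 30 and 15 outputs
shapes = dense_weight_shapes([10, 30, 15])
print(shapes)  # [(10, 30), (30, 15)]
```

Each resulting tuple could then be unpacked into a call such as np.random.randn(*shape), so the sizes are written down only once.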
More initialization methods are available in the init module.
Create custom initialization methods
It’s easy to create a custom initialization method. All we need to do is inherit from the init.Initializer class and define a sample method that accepts one argument (excluding self). This argument contains the shape of the output tensor that we expect to get.
In the example below, we create a custom initializer that samples weights from the gamma distribution.
import tensorflow as tf
from neupy.layers import *
from neupy import init

class Gamma(init.Initializer):
    def __init__(self, alpha=0.01):
        # Shape (concentration) parameter of the gamma distribution
        self.alpha = alpha

    def sample(self, shape):
        # The shape is provided by the layer during weight initialization
        return tf.random.gamma(shape, self.alpha)

network = join(
    Input(10),
    Sigmoid(30, weight=Gamma(alpha=0.02)),
    Sigmoid(15, weight=Gamma(alpha=0.05)),
)
Notice that the sample method returns a TensorFlow tensor. It’s possible to return a NumPy array instead, but in this case initialization might take more time, since the weights will need to be generated for each variable sequentially.
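Before plugging a custom initializer into a network, it can help to check what kind of values its distribution produces. The sketch below uses only Python's standard library rather than TensorFlow, and the helper gamma_samples is hypothetical. It assumes a rate of 1, which should correspond to calling tf.random.gamma without a beta argument. Under that assumption, gamma samples are never negative and their mean equals alpha.

```python
import random


def gamma_samples(alpha, n, seed=0):
    """Draw n gamma-distributed values with shape alpha and scale 1."""
    rng = random.Random(seed)
    # In random.gammavariate the second argument is the scale;
    # a scale of 1.0 is equivalent to a rate of 1.
    return [rng.gammavariate(alpha, 1.0) for _ in range(n)]


samples = gamma_samples(alpha=0.02, n=100_000)
mean = sum(samples) / len(samples)
smallest = min(samples)

print("mean:", mean)          # expected to be close to alpha
print("smallest:", smallest)  # never below zero
```

With alpha values such as 0.02 or 0.05, every weight starts non-negative and most of them sit very close to zero, which is worth keeping in mind when choosing this distribution.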