Algorithms with Constructible Network Architecture
Set up connections
There are four ways to define relations between layers inside the algorithm. We can use a predefined network as the first argument.
from neupy import algorithms, layers

network = layers.join(
    layers.Input(10),
    layers.Sigmoid(40),
    layers.Sigmoid(2),
)
bpnet = algorithms.GradientDescent(
    network,
    step=0.2,
    shuffle_data=True
)
Or we can pass a list of layers that defines sequential relations between them.
from neupy import algorithms, layers

bpnet = algorithms.GradientDescent(
    [
        layers.Input(10),
        layers.Sigmoid(40),
        layers.Sigmoid(2),
    ],
    step=0.2,
    shuffle_data=True
)
This is just a syntax simplification that allows us to avoid using the layers.join function.
Small networks can be defined with the help of the inline operator.
from neupy import algorithms
from neupy.layers import *

bpnet = algorithms.GradientDescent(
    Input(10) > Sigmoid(40) > Sigmoid(2),
    step=0.2,
    shuffle_data=True
)
The last way to define a network is to pass a list or a tuple of integers.
from neupy import algorithms
bpnet = algorithms.GradientDescent((2, 4, 1))
Notice that during initialization we didn’t actually specify any layer types. By default, NeuPy constructs from the tuple a simple MLP network that consists of dense layers with the sigmoid activation function. This type of initialization is typically suitable for tests or model benchmarks.
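As a rough sketch, the tuple definition above behaves like the following explicit network (assuming, per the defaults just described, dense layers with sigmoid activations):

from neupy import algorithms, layers

# Rough equivalent of algorithms.GradientDescent((2, 4, 1)):
# dense layers with sigmoid activations, as described above
bpnet = algorithms.GradientDescent([
    layers.Input(2),
    layers.Sigmoid(4),
    layers.Sigmoid(1),
])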
Train networks with multiple inputs
NeuPy allows training networks with multiple inputs.
from neupy import algorithms, layers

n_unique_categories = 10  # example value; depends on the dataset

gdnet = algorithms.GradientDescent(
    [
        [[
            # 3 categorical inputs
            layers.Input(3),
            layers.Embedding(n_unique_categories, 4),
            layers.Reshape(),
        ], [
            # 17 numerical inputs
            layers.Input(17),
        ]],
        layers.Concatenate(),
        layers.Relu(16),
        layers.Sigmoid(1)
    ],
    step=0.5,
    verbose=True,
    error='binary_crossentropy',
)
x_train_cat, x_train_num, y_train = load_train_data()
x_test_cat, x_test_num, y_test = load_test_data()

# Categorical variables should go first, because the
# categorical input layer was defined first in the network
gdnet.train([x_train_cat, x_train_num], y_train,
            [x_test_cat, x_test_num], y_test,
            epochs=180)

y_predicted = gdnet.predict([x_test_cat, x_test_num])
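The load_train_data and load_test_data functions above are placeholders. As a sketch of what this network expects, here is hypothetical synthetic data with matching shapes (the names and values are assumptions for illustration, not part of NeuPy):

import numpy as np

n_samples = 1000  # hypothetical dataset size

# 3 columns of integer category indices that feed the embedding layer
x_train_cat = np.random.randint(0, n_unique_categories, (n_samples, 3))
# 17 columns of numerical features
x_train_num = np.random.randn(n_samples, 17)
# binary targets for the sigmoid output layer
y_train = np.random.randint(0, 2, (n_samples, 1))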
Algorithms
NeuPy supports many different training algorithms based on backpropagation. Check the Cheat sheet if you want to learn more about them.
Before using these algorithms you should understand that not all of them are suitable for every problem. Methods like Levenberg-Marquardt or Conjugate Gradient work better for small networks and would be extremely slow for networks with millions of parameters. In addition, not every algorithm can train with mini-batches; algorithms like Conjugate Gradient are unable to train properly in mini-batch mode.
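As a sketch of this difference, the two setups below contrast a mini-batch algorithm with a full-batch one (the batch_size value is an arbitrary choice for illustration):

from neupy import algorithms, layers

network = layers.Input(10) > layers.Sigmoid(40) > layers.Sigmoid(2)

# Mini-batch training: each parameter update uses 128 samples
mnet = algorithms.MinibatchGradientDescent(network, batch_size=128)

# Full-batch method: every update processes the whole dataset,
# which is practical only for small networks
cgnet = algorithms.ConjugateGradient(network)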
Error functions
NeuPy has many different error functions.
from neupy import algorithms, layers

nnet = algorithms.MinibatchGradientDescent(
    [
        layers.Input(784),
        layers.Relu(500),
        layers.Relu(300),
        layers.Softmax(10),
    ],
    error='categorical_crossentropy',
)
It’s also possible to create custom error functions. An error function should accept two mandatory arguments: the expected and the predicted values.
import theano.tensor as T
from neupy import algorithms, layers

def mean_absolute_error(expected, predicted):
    return T.abs_(expected - predicted).mean()

nnet = algorithms.MinibatchGradientDescent(
    [
        layers.Input(784),
        layers.Relu(500),
        layers.Relu(300),
        layers.Softmax(10),
    ],
    error=mean_absolute_error,
)
The error function should return a scalar, because during training the output of the error function is the value that we differentiate with respect to the network’s parameters.
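To illustrate this requirement, here is a minimal sketch of another custom error: the element-wise errors form a tensor, and the final reduction is what produces the required scalar.

import theano.tensor as T

def mean_squared_error(expected, predicted):
    # element-wise errors: a tensor with one value per output unit
    errors = T.sqr(expected - predicted)
    # the reduction produces the scalar used for differentiation
    return errors.mean()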
Add-ons
Algorithms with constructible architectures allow you to use additional update rules for parameter regularization and step updates. For instance, suppose we want to add Weight Decay regularization and decrease the step monotonically after each epoch.
from neupy import algorithms, layers

nnet = algorithms.MinibatchGradientDescent(
    [
        layers.Input(784),
        layers.Relu(500),
        layers.Relu(300),
        layers.Softmax(10),
    ],
    step=0.1,
    batch_size=16,
    addons=[algorithms.WeightDecay,
            algorithms.StepDecay]
)
Both the WeightDecay and StepDecay algorithms have additional parameters. If we need to modify them, we can pass them directly to the training algorithm.
from neupy import algorithms, layers

nnet = algorithms.MinibatchGradientDescent(
    [
        layers.Input(784),
        layers.Relu(500),
        layers.Relu(300),
        layers.Softmax(10),
    ],
    # Parameters from MinibatchGradientDescent
    step=0.1,
    batch_size=16,
    # Parameters from StepDecay
    reduction_freq=50,
    # Parameters from WeightDecay
    decay_rate=0.05,
    addons=[algorithms.WeightDecay,
            algorithms.StepDecay]
)
NeuPy doesn’t allow more than one regularization add-on, or more than one step update add-on, per training algorithm.
>>> from neupy import algorithms, layers
>>>
>>> nnet = algorithms.MinibatchGradientDescent(
...     [
...         layers.Input(784),
...         layers.Relu(500),
...         layers.Relu(300),
...         layers.Softmax(10),
...     ],
...     addons=[
...         algorithms.WeightDecay,
...         algorithms.WeightElimination,
...     ]
... )
Traceback (most recent call last):
ValueError: There can be only one add-on class with type 'Regularization'