Algorithms with constructible architecture
Specify network structure
There are two ways to define relations between layers. First, we can define the network's architecture separately from the training algorithm.
from neupy import algorithms
from neupy.layers import *

network = Input(10) >> Sigmoid(40) >> Softmax(4)
optimizer = algorithms.GradientDescent(
    network, step=0.2, shuffle_data=True, verbose=True,
)
Alternatively, we can pass a list of layers that defines sequential relations between them.
from neupy import algorithms
from neupy.layers import *

optimizer = algorithms.GradientDescent(
    [
        Input(10),
        Sigmoid(40),
        Softmax(4),
    ],
    step=0.2,
    shuffle_data=True,
    verbose=True,
)
This is just a syntactic simplification that avoids the join function and inline connections.
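For reference, the list above describes the same network that could also be built with the join function or with the inline >> operator (a short sketch; join is assumed to be importable from neupy.layers):

from neupy.layers import *

# Equivalent definitions of the same sequential network:
# explicit join function ...
network = join(Input(10), Sigmoid(40), Softmax(4))
# ... or inline connections with the >> operator
network = Input(10) >> Sigmoid(40) >> Softmax(4)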
Train networks with multiple inputs
NeuPy allows training networks with multiple inputs.
from neupy import algorithms
from neupy.layers import *

n_unique_categories = 10  # placeholder; set to the number of unique categories in the data

optimizer = algorithms.GradientDescent(
    [
        parallel([
            # 3 categorical inputs
            Input(3),
            Embedding(n_unique_categories, 4),
            Reshape(),
        ], [
            # 17 numerical inputs
            Input(17),
        ]),
        Concatenate(),
        Relu(16),
        Sigmoid(1),
    ],
    step=0.5,
    verbose=True,
    loss='binary_crossentropy',
)
x_train_cat, x_train_num, y_train = load_train_data()
x_test_cat, x_test_num, y_test = load_test_data()

# Categorical input goes first, because the categorical
# input layer was defined first in the network
optimizer.train(
    [x_train_cat, x_train_num], y_train,
    [x_test_cat, x_test_num], y_test,
    epochs=180,
)
y_predicted = optimizer.predict(x_test_cat, x_test_num)
The network in the example above has two inputs. The order of the inputs is important, since the first input layer specified in the network corresponds to the first input of the network. This is true for the train, score and predict methods.
optimizer.train(
    [x_train_cat, x_train_num], y_train,
    [x_test_cat, x_test_num], y_test,
    epochs=180,
)
loss = optimizer.score([x_test_cat, x_test_num], y_test)
y_predicted = optimizer.predict(x_test_cat, x_test_num)
Notice that the predict method accepts multiple inputs as separate arguments, unlike the score and train methods, which expect the inputs grouped in a list. This is because the other methods need to differentiate between inputs and targets.
Algorithms
NeuPy supports lots of different training algorithms based on backpropagation. You can check the Cheat sheet if you want to learn more about them.
Before using these algorithms you must understand that not all of them are suitable for every problem. Some methods, like Levenberg-Marquardt or Conjugate Gradient, work better for small networks and would be extremely slow for networks with millions of parameters. In addition, it's important to note that not all algorithms can be trained with mini-batches; algorithms like Conjugate Gradient don't work with mini-batches.
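For example, a mini-batch-friendly optimizer such as Momentum can be set up in the same way as GradientDescent above (a minimal sketch; the step, batch_size and verbose parameters are assumed to follow the same conventions as in the earlier examples):

from neupy import algorithms, layers

# Momentum works with mini-batches, unlike Conjugate Gradient
optimizer = algorithms.Momentum(
    [
        layers.Input(10),
        layers.Sigmoid(40),
        layers.Softmax(4),
    ],
    step=0.1,
    batch_size=128,  # assumed mini-batch size parameter
    verbose=True,
)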
Loss functions
NeuPy has many different loss functions. Each of them can be specified as a string.
from neupy import algorithms, layers

nnet = algorithms.GradientDescent(
    [
        layers.Input(784),
        layers.Relu(500),
        layers.Relu(300),
        layers.Softmax(10),
    ],
    loss='categorical_crossentropy',
)
Also, it's possible to create custom loss functions. A loss function should accept two mandatory arguments, namely the expected and predicted values.
import tensorflow as tf
from neupy import algorithms, layers

def mean_absolute_error(expected, predicted):
    abs_errors = tf.abs(expected - predicted)
    return tf.reduce_mean(abs_errors)

nnet = algorithms.GradientDescent(
    [
        layers.Input(784),
        layers.Relu(500),
        layers.Relu(300),
        layers.Softmax(10),
    ],
    loss=mean_absolute_error,
)
The loss function should return a scalar, because during training its output is the value with respect to which the gradients are computed.
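In other words, any per-sample or per-element errors have to be reduced to a single value inside the function. A minimal sketch of a custom mean squared error that follows this rule (plain TensorFlow, no NeuPy-specific assumptions):

import tensorflow as tf

def mean_squared_error(expected, predicted):
    # tf.reduce_mean collapses the per-element squared errors into a
    # single scalar, which is the value the training procedure differentiates
    return tf.reduce_mean(tf.square(expected - predicted))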