neupy.layers.recurrent module

class neupy.layers.recurrent.LSTM[source]

Long Short-Term Memory (LSTM) layer.

Parameters:

size : int

Number of hidden units in the layer.

weights : dict or Initializer

Weight parameters for different gates. Defaults to XavierUniform().

  • If the application requires the same initialization method for all weights, it is possible to specify a single initialization method that will be automatically applied to every weight parameter in the LSTM layer.

    layers.LSTM(2, weights=init.Normal(0.1))
    
  • If the application requires different initialization values for different weights, it is possible to specify each weight by name.

    dict(
        weight_in_to_ingate=init.XavierUniform(),
        weight_hid_to_ingate=init.XavierUniform(),
        weight_cell_to_ingate=init.XavierUniform(),
    
        weight_in_to_forgetgate=init.XavierUniform(),
        weight_hid_to_forgetgate=init.XavierUniform(),
        weight_cell_to_forgetgate=init.XavierUniform(),
    
        weight_in_to_outgate=init.XavierUniform(),
        weight_hid_to_outgate=init.XavierUniform(),
        weight_cell_to_outgate=init.XavierUniform(),
    
        weight_in_to_cell=init.XavierUniform(),
        weight_hid_to_cell=init.XavierUniform(),
    )
    

    If only one (or a few) parameters need to be modified, it is enough to specify just those and leave the rest out:

    dict(weight_in_to_ingate=init.Normal(0.1))
    

    Other parameters, such as weight_cell_to_outgate, keep their default values.
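    A minimal sketch combining the snippets above (the layer size of 20 is only illustrative): override a single named weight while every other weight keeps the default XavierUniform() initializer.

    from neupy import layers, init

    # Only weight_in_to_ingate is overridden; all other LSTM weights
    # fall back to the default XavierUniform() initializer.
    lstm = layers.LSTM(20, weights=dict(weight_in_to_ingate=init.Normal(0.1)))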

biases : dict or Initializer

Bias parameters for different gates. Defaults to Constant(0).

  • If the application requires the same initialization method for all biases, it is possible to specify a single initialization method that will be automatically applied to every bias parameter in the LSTM layer.

    layers.LSTM(2, biases=init.Constant(1))
    
  • If the application requires different initialization values for different biases, it is possible to specify each bias by name.

    dict(
        bias_ingate=init.Constant(0),
        bias_forgetgate=init.Constant(0),
        bias_cell=init.Constant(0),
        bias_outgate=init.Constant(0),
    )
    

    If only one (or a few) parameters need to be modified, it is enough to specify just those and leave the rest out:

    dict(bias_ingate=init.Constant(1))
    

    Other parameters, such as bias_cell, keep their default values.
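    A minimal sketch based on the snippet above (setting the forget-gate bias to 1 is only an illustrative choice): override a single named bias while the remaining biases keep the Constant(0) default.

    from neupy import layers, init

    # Only bias_forgetgate is overridden; bias_ingate, bias_cell and
    # bias_outgate fall back to the default Constant(0) initializer.
    lstm = layers.LSTM(20, biases=dict(bias_forgetgate=init.Constant(1)))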

activation_functions : dict or callable

Activation functions for different gates. Defaults to:

# import theano.tensor as T
dict(
    ingate=T.nnet.sigmoid,
    forgetgate=T.nnet.sigmoid,
    outgate=T.nnet.sigmoid,
    cell=T.tanh,
)

If only one parameter needs to be modified, it is enough to specify just that one and leave the rest out:

dict(ingate=T.tanh)

Other parameters, such as forgetgate or outgate, keep their default values.
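A minimal sketch that passes the override above to the layer (the layer size of 20 is only illustrative; requires Theano):

import theano.tensor as T
from neupy import layers

# Only the input gate activation is replaced; forgetgate, outgate and
# cell keep their default activation functions.
lstm = layers.LSTM(20, activation_functions=dict(ingate=T.tanh))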

learn_init : bool

If True, make cell_init and hid_init trainable variables. Defaults to False.

cell_init : array-like, Theano variable, scalar or Initializer

Initializer for initial cell state (\(c_0\)). Defaults to Constant(0).

hid_init : array-like, Theano variable, scalar or Initializer

Initializer for initial hidden state (\(h_0\)). Defaults to Constant(0).

backwards : bool

If True, process the sequence backwards and then reverse the output again such that the output from the layer is always from \(x_1\) to \(x_n\). Defaults to False.

only_return_final : bool

If True, only return the final sequential output (e.g. for tasks where a single target value for the entire sequence is desired). In this case, Theano makes an optimization which saves memory. Defaults to True.

precompute_input : bool

If True, precompute input_to_hid before iterating through the sequence. This can result in a speed-up at the expense of increased memory usage. Defaults to True.

peepholes : bool

If True, the LSTM uses peephole connections. When False, cell parameters are ignored. Defaults to False.

unroll_scan : bool

If True, the recursion is unrolled instead of using scan. For some graphs this gives a significant speed-up, but it may also consume more memory. When unroll_scan=True, backpropagation always includes the full sequence, so n_gradient_steps must be set to -1 and the input sequence length must be known at compile time (i.e., cannot be given as None). Defaults to False.
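A minimal sketch (the layer size is illustrative): with unroll_scan=True, the surrounding network must use a fixed-length Input layer, e.g. Input(40), and n_gradient_steps must keep its default of -1.

from neupy import layers

# Unrolled recursion; only valid with a known, fixed sequence length.
lstm = layers.LSTM(20, unroll_scan=True)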

gradient_clipping : float or int

If nonzero, the gradient messages are clipped to the given value during the backward pass. Defaults to 0.

n_gradient_steps : int

Number of timesteps to include in the backpropagated gradient. If -1, backpropagate through the entire sequence. Defaults to -1.
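A minimal sketch combining gradient_clipping and n_gradient_steps (the values 5 and 10 are placeholders, not recommendations): clip gradient messages during the backward pass and backpropagate through only the last 10 time steps.

from neupy import layers

# Truncated backpropagation through time with clipped gradient messages.
lstm = layers.LSTM(20, gradient_clipping=5, n_gradient_steps=10)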

name : str or None

Layer’s identifier. If name is None, the name will be generated automatically. Defaults to None.

Notes

Code was adapted from the Lasagne library.

Examples

Sequence classification

from neupy import layers, algorithms

n_time_steps = 40
n_categories = 20
embedded_size = 10

network = algorithms.RMSProp(
    [
        layers.Input(n_time_steps),
        layers.Embedding(n_categories, embedded_size),
        layers.LSTM(20),
        layers.Sigmoid(1),
    ]
)
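The network above can then be trained on integer-encoded sequences. A hedged sketch with randomly generated placeholder data (the sample count, targets, and number of epochs are purely illustrative):

import numpy as np

# Placeholder data: 100 integer-encoded sequences with binary targets.
x_train = np.random.randint(n_categories, size=(100, n_time_steps))
y_train = np.random.randint(2, size=(100, 1))

network.train(x_train, y_train, epochs=10)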
activation_functions = None[source]
backwards = None[source]
biases = None[source]
cell_init = None[source]
gradient_clipping = None[source]
hid_init = None[source]
initialize()[source]

Initialize connection

learn_init = None[source]
n_gradient_steps = None[source]
options = {
    'size': Option(class_name='BaseRNNLayer', value=IntProperty(name="size")),
    'weights': Option(class_name='LSTM', value=MultiParameterProperty(name="weights")),
    'biases': Option(class_name='LSTM', value=MultiParameterProperty(name="biases")),
    'activation_functions': Option(class_name='LSTM', value=MultiCallableProperty(name="activation_functions")),
    'learn_init': Option(class_name='LSTM', value=Property(name="learn_init")),
    'cell_init': Option(class_name='LSTM', value=ParameterProperty(name="cell_init")),
    'hid_init': Option(class_name='LSTM', value=ParameterProperty(name="hid_init")),
    'backwards': Option(class_name='LSTM', value=Property(name="backwards")),
    'only_return_final': Option(class_name='BaseRNNLayer', value=Property(name="only_return_final")),
    'precompute_input': Option(class_name='LSTM', value=Property(name="precompute_input")),
    'peepholes': Option(class_name='LSTM', value=Property(name="peepholes")),
    'unroll_scan': Option(class_name='LSTM', value=Property(name="unroll_scan")),
    'gradient_clipping': Option(class_name='LSTM', value=NumberProperty(name="gradient_clipping")),
    'n_gradient_steps': Option(class_name='LSTM', value=IntProperty(name="n_gradient_steps")),
    'name': Option(class_name='BaseLayer', value=Property(name="name")),
}[source]
output(input_value)[source]

Return the output based on the input value.

Parameters: input_value
peepholes = None[source]
precompute_input = None[source]
unroll_scan = None[source]
weights = None[source]
class neupy.layers.recurrent.GRU[source]

Gated Recurrent Unit (GRU) layer.

Parameters:

size : int

Number of hidden units in the layer.

weights : dict or Initializer

Weight parameters for different gates. Defaults to XavierUniform().

  • If the application requires the same initialization method for all weights, it is possible to specify a single initialization method that will be automatically applied to every weight parameter in the GRU layer.

    layers.GRU(2, weights=init.Normal(0.1))
    
  • If the application requires different initialization values for different weights, it is possible to specify each weight by name.

    dict(
        weight_in_to_updategate=init.XavierUniform(),
        weight_hid_to_updategate=init.XavierUniform(),
    
        weight_in_to_resetgate=init.XavierUniform(),
        weight_hid_to_resetgate=init.XavierUniform(),
    
        weight_in_to_hidden_update=init.XavierUniform(),
        weight_hid_to_hidden_update=init.XavierUniform(),
    )
    

    If only one (or a few) parameters need to be modified, it is enough to specify just those and leave the rest out:

    dict(weight_in_to_updategate=init.Normal(0.1))
    

    Other parameters, such as weight_in_to_resetgate, keep their default values.
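    A minimal sketch combining the snippets above (the layer size of 20 is only illustrative): override a single named weight while the remaining weights keep the default XavierUniform() initializer.

    from neupy import layers, init

    # Only weight_in_to_updategate is overridden; all other GRU weights
    # fall back to the default XavierUniform() initializer.
    gru = layers.GRU(20, weights=dict(weight_in_to_updategate=init.Normal(0.1)))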

biases : dict or Initializer

Bias parameters for different gates. Defaults to Constant(0).

  • If the application requires the same initialization method for all biases, it is possible to specify a single initialization method that will be automatically applied to every bias parameter in the GRU layer.

    layers.GRU(2, biases=init.Constant(1))
    
  • If the application requires different initialization values for different biases, it is possible to specify each bias by name.

    dict(
        bias_updategate=init.Constant(0),
        bias_resetgate=init.Constant(0),
        bias_hidden_update=init.Constant(0),
    )
    

    If only one (or a few) parameters need to be modified, it is enough to specify just those and leave the rest out:

    dict(bias_resetgate=init.Constant(1))
    

    Other parameters, such as bias_updategate, keep their default values.
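    A minimal sketch based on the snippet above (the value 1 is only an illustrative choice): override a single named bias while the remaining biases keep the Constant(0) default.

    from neupy import layers, init

    # Only bias_resetgate is overridden; bias_updategate and
    # bias_hidden_update fall back to the default Constant(0).
    gru = layers.GRU(20, biases=dict(bias_resetgate=init.Constant(1)))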

activation_functions : dict or callable

Activation functions for different gates. Defaults to:

# import theano.tensor as T
dict(
    resetgate=T.nnet.sigmoid,
    updategate=T.nnet.sigmoid,
    hidden_update=T.tanh,
)

If only one parameter needs to be modified, it is enough to specify just that one and leave the rest out:

dict(resetgate=T.tanh)

Other parameters, such as updategate or hidden_update, keep their default values.
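A minimal sketch that passes the override above to the layer (the layer size of 20 is only illustrative; requires Theano):

import theano.tensor as T
from neupy import layers

# Only the reset gate activation is replaced; updategate and
# hidden_update keep their default activation functions.
gru = layers.GRU(20, activation_functions=dict(resetgate=T.tanh))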

learn_init : bool

If True, make hid_init a trainable variable. Defaults to False.

hid_init : array-like, Theano variable, scalar or Initializer

Initializer for initial hidden state (\(h_0\)). Defaults to Constant(0).

only_return_final : bool

If True, only return the final sequential output (e.g. for tasks where a single target value for the entire sequence is desired). In this case, Theano makes an optimization which saves memory. Defaults to True.

backwards : bool

If True, process the sequence backwards and then reverse the output again such that the output from the layer is always from \(x_1\) to \(x_n\). Defaults to False.

precompute_input : bool

If True, precompute input_to_hid before iterating through the sequence. This can result in a speed-up at the expense of increased memory usage. Defaults to True.

unroll_scan : bool

If True, the recursion is unrolled instead of using scan. For some graphs this gives a significant speed-up, but it may also consume more memory. When unroll_scan=True, backpropagation always includes the full sequence, so n_gradient_steps must be set to -1 and the input sequence length must be known at compile time (i.e., cannot be given as None). Defaults to False.

name : str or None

Layer’s identifier. If name is None, the name will be generated automatically. Defaults to None.

Notes

Code was adapted from the Lasagne library.

Examples

Sequence classification

from neupy import layers, algorithms

n_time_steps = 40
n_categories = 20
embedded_size = 10

network = algorithms.RMSProp(
    [
        layers.Input(n_time_steps),
        layers.Embedding(n_categories, embedded_size),
        layers.GRU(20),
        layers.Sigmoid(1),
    ]
)
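Once trained (see the training sketch in the LSTM example above), the network can produce probabilities for new sequences. A hedged sketch with placeholder data:

import numpy as np

# Placeholder batch of integer-encoded test sequences.
x_test = np.random.randint(n_categories, size=(5, n_time_steps))
probabilities = network.predict(x_test)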
activation_functions = None[source]
backwards = None[source]
biases = None[source]
gradient_clipping = None[source]
hid_init = None[source]
initialize()[source]

Initialize connection

learn_init = None[source]
n_gradient_steps = None[source]
options = {
    'size': Option(class_name='BaseRNNLayer', value=IntProperty(name="size")),
    'weights': Option(class_name='GRU', value=MultiParameterProperty(name="weights")),
    'biases': Option(class_name='GRU', value=MultiParameterProperty(name="biases")),
    'activation_functions': Option(class_name='GRU', value=MultiCallableProperty(name="activation_functions")),
    'learn_init': Option(class_name='GRU', value=Property(name="learn_init")),
    'hid_init': Option(class_name='GRU', value=ParameterProperty(name="hid_init")),
    'backwards': Option(class_name='GRU', value=Property(name="backwards")),
    'only_return_final': Option(class_name='BaseRNNLayer', value=Property(name="only_return_final")),
    'precompute_input': Option(class_name='GRU', value=Property(name="precompute_input")),
    'unroll_scan': Option(class_name='GRU', value=Property(name="unroll_scan")),
    'gradient_clipping': Option(class_name='GRU', value=NumberProperty(name="gradient_clipping")),
    'n_gradient_steps': Option(class_name='GRU', value=IntProperty(name="n_gradient_steps")),
    'name': Option(class_name='BaseLayer', value=Property(name="name")),
}[source]
output(input_value)[source]

Return the output based on the input value.

Parameters: input_value
precompute_input = None[source]
unroll_scan = None[source]
weights = None[source]