neupy.algorithms.competitive.growing_neural_gas module

class neupy.algorithms.competitive.growing_neural_gas.GrowingNeuralGas[source]

Growing Neural Gas (GNG) algorithm.

This implementation includes two modifications that are not mentioned in the paper, but they help to speed up training.

  • The n_start_nodes parameter makes it possible to increase the number of nodes during the initialization step. It’s useful when the algorithm spends a lot of time building up a large number of neurons.
  • The min_distance_for_update parameter allows to speed up training when some data samples have neurons very close to them. It controls the threshold on the minimum distance below which the weight update is skipped.
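The skip rule described by min_distance_for_update can be sketched in a few lines of plain Python (the function name here is made up for illustration and is not part of the neupy API):

```python
import math

def should_update(sample, winner_weight, min_distance_for_update=0.0):
    # Euclidean distance between the data sample and its closest neuron.
    distance = math.dist(sample, winner_weight)
    # When the closest neuron is already within the threshold, the
    # weight update is skipped for this sample; a threshold of zero
    # (the default) means every sample triggers an update.
    return distance >= min_distance_for_update

# Default threshold of 0: updates always happen.
print(should_update([0.0, 0.0], [0.001, 0.0]))        # True
# Positive threshold: near-perfect matches are skipped.
print(should_update([0.0, 0.0], [0.001, 0.0], 0.01))  # False
```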
Parameters:

n_inputs : int

Number of features in each sample.

n_start_nodes : int

Number of nodes that algorithm generates from the data during the initialization step. Defaults to 2.

step : float

Step (learning rate) for the neuron winner. Defaults to 0.2.

neighbour_step : float

Step (learning rate) for the neurons that are connected via edges with the winning neuron. This value typically has to be smaller than the step value. Defaults to 0.05.

max_edge_age : int

If an edge hasn’t been updated for max_edge_age iterations, it will be removed. The larger the value, the more updates are allowed before an edge is removed. Defaults to 100.

n_iter_before_neuron_added : int

The algorithm adds a new neuron after every n_iter_before_neuron_added weight updates. The smaller the value, the more frequently the algorithm adds new neurons to the network. Defaults to 1000.

error_decay_rate : float

This error decay rate is applied to every neuron in the graph after each training iteration. It ensures that old errors are reduced over time. Defaults to 0.995.

after_split_error_decay_rate : float

This decay rate reduces the error for the neurons with the largest errors after the algorithm adds a new neuron. This value is typically lower than error_decay_rate. Defaults to 0.5.

max_nodes : int

Maximum number of nodes that can be generated during training. Reaching this limit stops the network from adding new nodes, but it doesn’t stop the training itself. Defaults to 1000.

min_distance_for_update : float

Controls for which neurons updates are applied. If the Euclidean distance between a data sample and its closest neuron is less than the min_distance_for_update value, the update is skipped for this data sample. Setting the value to zero disables the effect of this parameter. Defaults to 0.

show_epoch : int or str

This property controls how often the network will display information about training. There are two main syntaxes for this property.

  • You can define it as a positive integer. It specifies how often you would like to see summary output in the terminal. For instance, the number 100 means that the network shows a summary at the 100th, 200th, 300th ... epochs.
  • A string defines the number of times you want to see output in the terminal. For instance, the value '2 times' means that the network will show output twice with approximately equal periods of epochs, plus one additional output after the final epoch.

Defaults to 1.

shuffle_data : bool

If it’s True, the class shuffles all your training data before training the network. Defaults to True.

epoch_end_signal : function

Calls this function when train epoch finishes.

train_end_signal : function

Calls this function when train process finishes.

verbose : bool

Property controls verbose output in the terminal. True enables informative output and False disables it. Defaults to False.
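The two error decay parameters above can be illustrated with plain arithmetic (a simplified sketch of the decay bookkeeping, not neupy’s internals):

```python
def decay_errors(errors, decay_rate=0.995):
    # error_decay_rate is applied to every neuron's accumulated
    # error after each training iteration.
    return [error * decay_rate for error in errors]

errors = [2.0, 1.0]
errors = decay_errors(errors)  # regular per-iteration decay

# After a new neuron is added, the neurons with the largest errors
# get the stronger after_split_error_decay_rate (0.5 by default).
errors_after_split = decay_errors(errors, decay_rate=0.5)
```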

Notes

  • Unlike other algorithms, this network doesn’t make predictions. Instead, it learns the topological structure of the data in the form of a graph. After training, the structure of the network can be extracted from the graph attribute.
  • In order to speed up training, it might be useful to increase the n_start_nodes parameter.
  • During training it can happen that nodes learn the topological structure of one part of the data better than another, mostly because of differences in data sample density across regions. Increasing min_distance_for_update can speed up training by ignoring updates for neurons that are very close to a data sample (closer than the specified min_distance_for_update value). Training can be stopped when none of the neurons has been updated during a training epoch.
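The roles of step and neighbour_step in the updates described above can be sketched as follows (a plain-Python illustration of the GNG update rule, not neupy’s code):

```python
def move_towards(weight, sample, rate):
    # Move a neuron's weight vector towards the sample by the
    # given learning rate.
    return [w + rate * (s - w) for s, w in zip(sample, weight)]

sample = [1.0, 1.0]
winner = [0.0, 0.0]
neighbour = [2.0, 0.0]

# The winner moves with the larger rate (step, 0.2 by default) ...
winner = move_towards(winner, sample, rate=0.2)
# ... while its graph neighbours move with the smaller rate
# (neighbour_step, 0.05 by default).
neighbour = move_towards(neighbour, sample, rate=0.05)

print(winner)     # [0.2, 0.2]
print(neighbour)  # [1.95, 0.05]
```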

References

[1] A Growing Neural Gas Network Learns Topologies, Bernd Fritzke

Examples

>>> from neupy import algorithms
>>> from sklearn.datasets import make_blobs
>>>
>>> data, _ = make_blobs(
...     n_samples=1000,
...     n_features=2,
...     centers=2,
...     cluster_std=0.4,
... )
>>>
>>> neural_gas = algorithms.GrowingNeuralGas(
...     n_inputs=2,
...     shuffle_data=True,
...     verbose=True,
...     max_edge_age=10,
...     n_iter_before_neuron_added=50,
...     max_nodes=100,
... )
>>> neural_gas.train(data, epochs=10)
>>> neural_gas.graph.n_nodes
100
>>> len(neural_gas.graph.edges)
175
>>> edges = list(neural_gas.graph.edges.keys())
>>> neuron_1, neuron_2 = edges[0]
>>>
>>> neuron_1.weight
array([[-6.77166299,  2.4121606 ]])
>>> neuron_2.weight
array([[-6.829309  ,  2.27839633]])

Attributes

graph (NeuralGasGraph instance) This attribute stores all neurons and connections between them in the form of an undirected graph.
errors (ErrorHistoryList) Contains the list of training errors. This object has the same properties as a list, plus three additional useful methods: last, previous and normalized.
train_errors (ErrorHistoryList) Alias to the errors attribute.
validation_errors (ErrorHistoryList) The same as errors attribute, but it contains only validation errors.
last_epoch (int) Value equals to the last trained epoch. After initialization it is equal to 0.

Methods

train(input_train, summary='table', epochs=100) Network learns the topological structure of the data. The learned topology is stored in the graph attribute.
fit(*args, **kwargs) Alias to the train method.
initialize_nodes(data) Network initializes nodes by randomly sampling n_start_nodes points from the data. It is applied automatically before training if the graph is empty. Note: re-initializing nodes will reset the network.
after_split_error_decay_rate = None[source]
error_decay_rate = None[source]
format_input_data(input_data)[source]
initialize_nodes(data)[source]
max_edge_age = None[source]
max_nodes = None[source]
min_distance_for_update = None[source]
n_inputs = None[source]
n_iter_before_neuron_added = None[source]
n_start_nodes = None[source]
neighbour_step = None[source]
options = {'verbose': Option(class_name='Verbose', value=VerboseProperty(name="verbose")), 'step': Option(class_name='GrowingNeuralGas', value=NumberProperty(name="step")), 'show_epoch': Option(class_name='BaseNetwork', value=ShowEpochProperty(name="show_epoch")), 'shuffle_data': Option(class_name='BaseNetwork', value=Property(name="shuffle_data")), 'epoch_end_signal': Option(class_name='BaseNetwork', value=Property(name="epoch_end_signal")), 'train_end_signal': Option(class_name='BaseNetwork', value=Property(name="train_end_signal")), 'n_inputs': Option(class_name='GrowingNeuralGas', value=IntProperty(name="n_inputs")), 'n_start_nodes': Option(class_name='GrowingNeuralGas', value=IntProperty(name="n_start_nodes")), 'neighbour_step': Option(class_name='GrowingNeuralGas', value=NumberProperty(name="neighbour_step")), 'max_edge_age': Option(class_name='GrowingNeuralGas', value=IntProperty(name="max_edge_age")), 'max_nodes': Option(class_name='GrowingNeuralGas', value=IntProperty(name="max_nodes")), 'n_iter_before_neuron_added': Option(class_name='GrowingNeuralGas', value=IntProperty(name="n_iter_before_neuron_added")), 'after_split_error_decay_rate': Option(class_name='GrowingNeuralGas', value=ProperFractionProperty(name="after_split_error_decay_rate")), 'error_decay_rate': Option(class_name='GrowingNeuralGas', value=ProperFractionProperty(name="error_decay_rate")), 'min_distance_for_update': Option(class_name='GrowingNeuralGas', value=NumberProperty(name="min_distance_for_update"))}[source]
predict(*args, **kwargs)[source]

Return prediction results for the input data.

Parameters:input_data : array-like
Returns:array-like
step = None[source]
train(input_train, summary='table', epochs=100)[source]

Trains the neural network.

Parameters:

input_train : array-like

target_train : array-like or None

input_test : array-like or None

target_test : array-like or None

epochs : int

Defaults to 100.

epsilon : float or None

Defaults to None.

train_epoch(input_train, target_train=None)[source]
class neupy.algorithms.competitive.growing_neural_gas.NeuralGasGraph[source]

Undirected graph structure that stores neural gas network’s neurons and connections between them.

Attributes

edges_per_node (dict) Dictionary where each key is a node and the value is a list of nodes connected to it.
edges (dict) Dictionary that stores the age of each connection. Each key has the following format: (node_1, node_2).
nodes (list) List of all nodes in the graph.
n_nodes (int) Number of nodes in the network.
add_edge(node_1, node_2)[source]
add_node(node)[source]
n_nodes[source]
nodes[source]
remove_edge(node_1, node_2)[source]
remove_node(node)[source]
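A toy structure with the same attribute layout as documented above (edges_per_node, edges, nodes, n_nodes) can make the bookkeeping concrete; this is a hypothetical sketch, not the neupy implementation:

```python
class ToyGasGraph:
    def __init__(self):
        self.edges_per_node = {}  # node -> list of connected nodes
        self.edges = {}           # (node_1, node_2) -> age of the edge

    @property
    def nodes(self):
        return list(self.edges_per_node)

    @property
    def n_nodes(self):
        return len(self.edges_per_node)

    def add_node(self, node):
        self.edges_per_node[node] = []

    def add_edge(self, node_1, node_2):
        self.edges_per_node[node_1].append(node_2)
        self.edges_per_node[node_2].append(node_1)
        self.edges[(node_1, node_2)] = 0  # fresh edges start at age 0

    def remove_edge(self, node_1, node_2):
        self.edges_per_node[node_1].remove(node_2)
        self.edges_per_node[node_2].remove(node_1)
        # The key may be stored in either orientation.
        key = (node_1, node_2) if (node_1, node_2) in self.edges else (node_2, node_1)
        del self.edges[key]

graph = ToyGasGraph()
graph.add_node("a")
graph.add_node("b")
graph.add_edge("a", "b")
print(graph.n_nodes, len(graph.edges))  # 2 1
graph.remove_edge("b", "a")
print(graph.n_nodes, len(graph.edges))  # 2 0
```

Removing an edge leaves both endpoint nodes in the graph, which matches the separation between remove_edge and remove_node in the API above.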
class neupy.algorithms.competitive.growing_neural_gas.NeuronNode[source]

Structure that represents a neuron in the Neural Gas algorithm.

Attributes

weight (2d-array) Neuron’s position in the space.
error (float) Error accumulated during the training.