neupy.algorithms.competitive.growing_neural_gas module
- class neupy.algorithms.competitive.growing_neural_gas.GrowingNeuralGas[source]
Growing Neural Gas (GNG) algorithm.
The current implementation includes two modifications that are not mentioned in the paper but help to speed up training.
- The n_start_nodes parameter makes it possible to create more nodes during the initialization step. This is useful when the algorithm would otherwise spend a lot of time building up a large number of neurons.
- The min_distance_for_update parameter speeds up training when some data samples already have neurons very close to them. It sets the distance threshold below which the weight update is skipped.
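The second modification can be pictured as a simple distance check made before each update. The sketch below is only an illustration under stated assumptions: the helper name should_skip_update and the plain NumPy weight matrix are hypothetical and are not part of the NeuPy API.

```python
import numpy as np


def should_skip_update(sample, weights, min_distance_for_update=0.0):
    """Return True when the closest neuron is already near the sample.

    sample: 1D array with n_inputs features.
    weights: 2D array with one neuron position per row (assumed layout).
    """
    # Euclidean distance from the sample to every neuron.
    distances = np.linalg.norm(weights - sample, axis=1)
    # A threshold of zero never skips, because a distance cannot be negative.
    return bool(distances.min() < min_distance_for_update)


weights = np.array([[0.0, 0.0], [1.0, 1.0]])
near = should_skip_update(np.array([0.05, 0.0]), weights, 0.1)
far = should_skip_update(np.array([0.5, 0.5]), weights, 0.1)
print(near, far)
```

With a threshold of 0.1, the first sample lies within range of an existing neuron and would be skipped, while the second would still trigger an update.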
Parameters: - n_inputs : int
Number of features in each sample.
- n_start_nodes : int
Number of nodes that the algorithm generates from the data during the initialization step. Defaults to 2.
- step : float
Step (learning rate) for the winning neuron. Defaults to 0.2.
- neighbour_step : float
Step (learning rate) for the neurons connected by edges to the winning neuron. This value should typically be smaller than the step value. Defaults to 0.05.
- max_edge_age : int
An edge that has not been updated for max_edge_age iterations is removed. The larger the value, the more updates are allowed before the edge is removed. Defaults to 100.
- n_iter_before_neuron_added : int
The algorithm adds a new neuron after every n_iter_before_neuron_added weight updates. The smaller the value, the more frequently new neurons are added to the network. Defaults to 1000.
- error_decay_rate : float
Decay rate applied to the error of every neuron in the graph after each training iteration. It ensures that old errors are reduced over time. Defaults to 0.995.
- after_split_error_decay_rate : float
Decay rate that reduces the error of the neurons with the largest errors after the algorithm adds a new neuron. This value is typically lower than error_decay_rate. Defaults to 0.5.
- max_nodes : int
Maximum number of nodes that can be generated during training. Reaching this limit does not stop training. Defaults to 1000.
- min_distance_for_update : float
Controls which data samples trigger an update. If the Euclidean distance between a data sample and its closest neuron is less than min_distance_for_update, the update is skipped for this sample. Setting the value to zero disables this behaviour. Defaults to 0.
- show_epoch : int
Controls how often the network displays information about training. It has to be a positive integer. For instance, a value of 100 means that the network shows a summary at the 1st, 100th, 200th, 300th … and last epochs. Defaults to 1.
- shuffle_data : bool
If True, the training data is shuffled before training. Defaults to True.
- signals : dict, list or function
Function that will be triggered after certain events during the training.
- verbose : bool
Controls verbose output in the terminal. True enables informative output and False disables it. Defaults to False.
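To make the roles of step, neighbour_step and error_decay_rate more concrete, here is a minimal sketch of a single adaptation step. It assumes the update rule from the referenced paper [1] (the winner accumulates the squared distance to the sample) and uses a hypothetical adapt_step helper with plain arrays and index-based neighbour lists. NeuPy's internal implementation may differ in details.

```python
import numpy as np


def adapt_step(sample, weights, errors, neighbours, step=0.2,
               neighbour_step=0.05, error_decay_rate=0.995):
    """Apply one winner/neighbour update in place and return the winner index.

    weights: 2D float array, one neuron per row (assumed layout).
    errors: 1D float array with the accumulated error per neuron.
    neighbours: dict mapping a neuron index to the indices connected to it.
    """
    distances = np.linalg.norm(weights - sample, axis=1)
    winner = int(np.argmin(distances))

    # The winner accumulates the squared distance to the sample.
    errors[winner] += distances[winner] ** 2

    # Move the winner by `step` and its graph neighbours by `neighbour_step`.
    weights[winner] += step * (sample - weights[winner])
    for index in neighbours.get(winner, []):
        weights[index] += neighbour_step * (sample - weights[index])

    # Decay every accumulated error so that old errors fade over time.
    errors *= error_decay_rate
    return winner


weights = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
errors = np.zeros(3)
neighbours = {0: [1], 1: [0]}
winner = adapt_step(np.array([0.0, 1.0]), weights, errors, neighbours)
print(winner, weights[0], weights[1], errors[0])
```

Because neighbour_step is smaller than step, connected neurons follow the winner only slightly, which keeps the learned graph smooth without collapsing neighbouring neurons onto the same point.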
Notes
- Unlike other algorithms, this network doesn’t make predictions. Instead, it learns the topological structure of the data in the form of a graph. After training, the structure of the network can be extracted from the graph attribute.
- To speed up training, it might be useful to increase the n_start_nodes parameter.
- During training, nodes may learn the topological structure of one part of the data better than another, mostly because the sample density differs across regions. Increasing min_distance_for_update can speed up training by skipping updates for samples whose closest neuron is nearer than the specified min_distance_for_update value. Training can be stopped if no neuron has been updated during a training epoch.
References
[1] A Growing Neural Gas Network Learns Topologies, Bernd Fritzke
Examples
>>> from neupy import algorithms
>>> from sklearn.datasets import make_blobs
>>>
>>> data, _ = make_blobs(
...     n_samples=1000,
...     n_features=2,
...     centers=2,
...     cluster_std=0.4,
... )
>>>
>>> neural_gas = algorithms.GrowingNeuralGas(
...     n_inputs=2,
...     shuffle_data=True,
...     verbose=True,
...     max_edge_age=10,
...     n_iter_before_neuron_added=50,
...     max_nodes=100,
... )
>>> # The graph is empty until the network has been trained.
>>> neural_gas.train(data, epochs=100)
>>> neural_gas.graph.n_nodes
100
>>> len(neural_gas.graph.edges)
175
>>> edges = list(neural_gas.graph.edges.keys())
>>> neuron_1, neuron_2 = edges[0]
>>>
>>> neuron_1.weight
array([[-6.77166299, 2.4121606 ]])
>>> neuron_2.weight
array([[-6.829309 , 2.27839633]])
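For plotting or further analysis, the edge dictionary used in the example can be flattened into an array of line segments. The sketch below is one possible way to do this. The names edges_to_segments and Node are hypothetical and introduced only for illustration. The only assumptions about the graph are the ones visible in the example: keys of the form (node_1, node_2) and a weight array of shape (1, n_inputs) per node.

```python
import numpy as np


class Node:
    """Stand-in for a neuron, used so the sketch runs without training."""

    def __init__(self, weight):
        # Same layout as in the example: shape (1, n_inputs).
        self.weight = np.array([weight], dtype=float)


def edges_to_segments(edges):
    """Convert a dict keyed by (node_1, node_2) into an array of segments.

    The result has shape (n_edges, 2, n_inputs): one start point and one
    end point per edge.
    """
    segments = [
        (np.ravel(node_1.weight), np.ravel(node_2.weight))
        for node_1, node_2 in edges
    ]
    return np.array(segments)


a, b, c = Node([0.0, 0.0]), Node([1.0, 2.0]), Node([3.0, 1.0])
edges = {(a, b): 0, (b, c): 4}  # the value plays the role of the edge age

segments = edges_to_segments(edges)
print(segments.shape)
```

With a trained network, passing neural_gas.graph.edges instead of the stand-in dictionary should give one segment per learned connection.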
Attributes: - graph : NeuralGasGraph instance
This attribute stores all neurons and the connections between them in the form of an undirected graph.
- errors : list
Information about errors. It has two main attributes, namely train and valid. These attributes provide access to the training and validation errors respectively.
- last_epoch : int
Number of the last trained epoch. After initialization it is equal to 0.
- n_updates_made : int
Number of training updates applied to the network.
Methods
- train(X_train, epochs=100)
The network learns the topological structure of the data. The learned structure is stored in the graph attribute.
- fit(*args, **kwargs)
Alias for the train method.
- initialize_nodes(data)
The network initializes nodes by randomly sampling n_start_nodes samples from the data. This is applied automatically before training if the graph is empty. Note: re-initializing nodes can reset the network.
- after_split_error_decay_rate = None[source]
- error_decay_rate = None[source]
- format_input_data(X)[source]
- initialize_nodes(data)[source]
- max_edge_age = None[source]
- max_nodes = None[source]
- min_distance_for_update = None[source]
- n_inputs = None[source]
- n_iter_before_neuron_added = None[source]
- n_start_nodes = None[source]
- neighbour_step = None[source]
- one_training_update(X_train, y_train=None)[source]
Function is triggered before running all training procedures related to the current epoch.
Parameters: - epoch : int
Current epoch number.
- options = {'after_split_error_decay_rate': Option(class_name='GrowingNeuralGas', value=ProperFractionProperty(name="after_split_error_decay_rate")), 'error_decay_rate': Option(class_name='GrowingNeuralGas', value=ProperFractionProperty(name="error_decay_rate")), 'max_edge_age': Option(class_name='GrowingNeuralGas', value=IntProperty(name="max_edge_age")), 'max_nodes': Option(class_name='GrowingNeuralGas', value=IntProperty(name="max_nodes")), 'min_distance_for_update': Option(class_name='GrowingNeuralGas', value=NumberProperty(name="min_distance_for_update")), 'n_inputs': Option(class_name='GrowingNeuralGas', value=IntProperty(name="n_inputs")), 'n_iter_before_neuron_added': Option(class_name='GrowingNeuralGas', value=IntProperty(name="n_iter_before_neuron_added")), 'n_start_nodes': Option(class_name='GrowingNeuralGas', value=IntProperty(name="n_start_nodes")), 'neighbour_step': Option(class_name='GrowingNeuralGas', value=NumberProperty(name="neighbour_step")), 'show_epoch': Option(class_name='BaseNetwork', value=IntProperty(name="show_epoch")), 'shuffle_data': Option(class_name='BaseNetwork', value=Property(name="shuffle_data")), 'signals': Option(class_name='BaseNetwork', value=Property(name="signals")), 'step': Option(class_name='GrowingNeuralGas', value=NumberProperty(name="step")), 'verbose': Option(class_name='Verbose', value=VerboseProperty(name="verbose"))}[source]
- predict(*args, **kwargs)[source]
- step = None[source]
- train(X_train, epochs=100)[source]
Trains the neural network.
Parameters: - X_train : array-like
- y_train : array-like or None
- X_test : array-like or None
- y_test : array-like or None
- epochs : int
Defaults to 100.
- epsilon : float or None
Defaults to None.
- class neupy.algorithms.competitive.growing_neural_gas.NeuralGasGraph[source]
Undirected graph structure that stores neural gas network’s neurons and connections between them.
Attributes: - edges_per_node : dict
Dictionary where each key is a unique node and the value is a list of nodes connected to that node.
- edges : dict
Dictionary that stores the age of each connection. Each key has the following format: (node_1, node_2).
- nodes : list
List of all nodes in the graph (read-only attribute).
- n_nodes : int
Number of nodes in the network (read-only attribute).
- n_edges : int
Number of edges in the network (read-only attribute).
- add_edge(node_1, node_2)[source]
- add_node(node)[source]
- find_edge_id(node_1, node_2)[source]
- n_edges[source]
- n_nodes[source]
- nodes[source]
- remove_edge(node_1, node_2)[source]
- remove_node(node)[source]
- reset_edge(node_1, node_2)[source]
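The attributes and methods listed for NeuralGasGraph suggest a structure along the lines of the sketch below. This is a hypothetical, simplified stand-in (TinyGasGraph), not NeuPy's implementation. In particular, it assumes that new edges start with age 0, that reset_edge sets the age back to 0, and that find_edge_id returns the stored key regardless of node order.

```python
class TinyGasGraph:
    """Hypothetical, simplified undirected graph with an age per edge."""

    def __init__(self):
        self.edges_per_node = {}  # node -> list of connected nodes
        self.edges = {}           # (node_1, node_2) -> age

    def add_node(self, node):
        self.edges_per_node.setdefault(node, [])

    def find_edge_id(self, node_1, node_2):
        # An undirected edge may have been stored in either order.
        if (node_1, node_2) in self.edges:
            return (node_1, node_2)
        if (node_2, node_1) in self.edges:
            return (node_2, node_1)
        raise ValueError("Edge does not exist")

    def add_edge(self, node_1, node_2):
        self.edges_per_node[node_1].append(node_2)
        self.edges_per_node[node_2].append(node_1)
        self.edges[(node_1, node_2)] = 0

    def reset_edge(self, node_1, node_2):
        self.edges[self.find_edge_id(node_1, node_2)] = 0

    def remove_edge(self, node_1, node_2):
        del self.edges[self.find_edge_id(node_1, node_2)]
        self.edges_per_node[node_1].remove(node_2)
        self.edges_per_node[node_2].remove(node_1)

    @property
    def n_nodes(self):
        return len(self.edges_per_node)

    @property
    def n_edges(self):
        return len(self.edges)


graph = TinyGasGraph()
for name in ("a", "b", "c"):
    graph.add_node(name)
graph.add_edge("a", "b")
graph.add_edge("b", "c")
graph.edges[("a", "b")] = 7   # pretend the edge has aged
graph.reset_edge("b", "a")    # the order of the nodes does not matter
graph.remove_edge("c", "b")
print(graph.n_nodes, graph.n_edges, graph.edges)
```

Keeping both edges_per_node and edges makes neighbour lookups cheap during weight updates while still allowing a single pass over all edges when old connections are pruned.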
- class neupy.algorithms.competitive.growing_neural_gas.NeuronNode[source]
Structure that represents a neuron in the Neural Gas algorithm.
Attributes: - weight : 2d-array
Neuron’s position in the space.
- error : float
Error accumulated during the training.
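The accumulated error determines where a new neuron is inserted. As a rough sketch, assuming the insertion rule from the referenced paper [1]: the new neuron is placed halfway between the neuron with the largest error and its neighbour with the largest error, and both errors are then reduced by after_split_error_decay_rate. The helper split_largest_error and the index-based neighbour lists are hypothetical, and NeuPy's own implementation may differ in details.

```python
import numpy as np


def split_largest_error(weights, errors, neighbours,
                        after_split_error_decay_rate=0.5):
    """Return (new_weight, new_error, q, f) for a neuron inserted between
    the neuron with the largest error (q) and its worst neighbour (f).

    errors is modified in place; weights and neighbours are only read.
    """
    q = int(np.argmax(errors))
    f = max(neighbours[q], key=lambda index: errors[index])

    # The new neuron is placed halfway between q and f.
    new_weight = 0.5 * (weights[q] + weights[f])

    # Both parents lose part of their accumulated error.
    errors[q] *= after_split_error_decay_rate
    errors[f] *= after_split_error_decay_rate

    # The new neuron starts from the reduced error of q.
    new_error = float(errors[q])
    return new_weight, new_error, q, f


weights = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]])
errors = np.array([4.0, 1.0, 3.0])
neighbours = {0: [1, 2], 1: [0], 2: [0]}

new_weight, new_error, q, f = split_largest_error(weights, errors, neighbours)
print(q, f, new_weight, new_error, errors)
```

In a full implementation the edge between q and f would also be replaced by two edges that connect the new neuron to each of them.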