neupy.datasets.make_reber_classification

neupy.datasets.make_reber_classification(n_samples, invalid_size=0.5, lenrange=(3, 14), return_indices=False)

Generate a random dataset for Reber grammar classification. Invalid words contain the same letters as Reber grammar words, but they are built without following the grammar rules.

Parameters:
n_samples : int

Number of samples in the dataset.

invalid_size : float

Proportion of invalid words in the dataset. Value must be between 0 and 1. Defaults to 0.5.

lenrange : tuple

Length of each word will be bounded by the two numbers specified in this range. Defaults to (3, 14).

return_indices : bool

If True, each word is converted to an array in which every letter is replaced by its index. Defaults to False.

Returns:
tuple

Returns two lists. The first contains the words and the second contains the labels for them.
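
As a rough illustration of what the `return_indices` conversion means, the sketch below maps each letter of a word to an integer and back. The letter ordering `"TVPXS"` and the helper names `word_to_indices` and `indices_to_word` are assumptions made for this example only. They are not taken from the library, and the mapping neupy uses internally may differ.

```python
# Hypothetical letter ordering, chosen for illustration only.
# neupy's internal ordering may differ.
ASSUMED_LETTERS = "TVPXS"
LETTER_TO_INDEX = {letter: i for i, letter in enumerate(ASSUMED_LETTERS)}


def word_to_indices(word):
    """Convert a word into a list of integer letter indices."""
    return [LETTER_TO_INDEX[letter] for letter in word]


def indices_to_word(indices):
    """Inverse of word_to_indices under the same assumed ordering."""
    return "".join(ASSUMED_LETTERS[i] for i in indices)


encoded = word_to_indices("VXVS")
print(encoded)                    # [1, 3, 1, 4] under the assumed ordering
print(indices_to_word(encoded))   # VXVS
```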

Examples

>>> from neupy.datasets import make_reber_classification
>>>
>>> data, labels = make_reber_classification(10, invalid_size=0.5)
>>> data
array(['SXSXVSXXVX', 'VVPS', 'VVPSXTTS', 'VVS', 'VXVS', 'VVS',
       'PPTTTXPSPTV', 'VTTSXVPTXVXT', 'VSSXSTX', 'TTXVS'],
      dtype='<U12')
>>> labels
array([0, 1, 0, 1, 1, 1, 0, 0, 0, 1])
>>>
>>> data, labels = make_reber_classification(
...     4, invalid_size=0.5, return_indices=True)
>>> data
array([array([1, 3, 1, 4]),
       array([0, 3, 0, 3, 0, 4, 3, 0, 4, 4]),
       array([1, 3, 1, 2, 3, 1, 2, 4]),
       array([0, 3, 0, 0, 3, 0, 4, 2, 4, 1, 0, 4, 0])], dtype=object)
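
Because words vary in length, the index arrays returned with `return_indices=True` have different sizes. Many models expect fixed-size inputs, so one possible follow-up step is padding. The sketch below is not part of neupy: the helper `pad_sequences`, the padding value `-1`, and the choice to pad on the right are all illustrative assumptions. The sample data is copied from the output above.

```python
def pad_sequences(sequences, max_len, pad_value=-1):
    """Right-pad each sequence with pad_value up to max_len items."""
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > max_len:
            raise ValueError("sequence longer than max_len")
        padded.append(seq + [pad_value] * (max_len - len(seq)))
    return padded


# Index sequences copied from the example output above.
data = [
    [1, 3, 1, 4],
    [0, 3, 0, 3, 0, 4, 3, 0, 4, 4],
    [1, 3, 1, 2, 3, 1, 2, 4],
    [0, 3, 0, 0, 3, 0, 4, 2, 4, 1, 0, 4, 0],
]

# 14 is the upper bound of the default lenrange=(3, 14).
fixed = pad_sequences(data, max_len=14)
for row in fixed:
    print(row)
```

A padding value of `-1` is used here because `0` already appears as a letter index in the data. Any value outside the set of letter indices would work equally well for this sketch.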