Build NeurEco Discrete Dynamic model with the command line interface#

To build a NeurEco Discrete Dynamic model, run the following command in the terminal:

neurecoRNN build path/to/build/configuration/file/build.conf

The skeleton of the configuration file required to build a NeurEco Discrete Dynamic model, here build.conf, looks as follows. Its fields should be filled according to the problem at hand; a filled-in example is given after the skeleton.

{
    "neurecoRNN_build":
    {
        "exc_filenames": [],
        "output_filenames": [],
        "validation_exc_filenames": [],
        "validation_output_filenames": [],
        "test_exc_filenames": [],
        "test_output_filenames": [],
        "write_model_to": "",
        "write_model_output_to_directory": "",
        "checkpoint_address": "",
        "resume": false,
        "settings": {
            "valid_percentage": 30,
            "min_hidden_state": 1,
            "max_hidden_state": 0,
            "steady_state_exc": [],
            "steady_state_out": [],
            "input_normalization": {
                "shift_type": "mean",
                "scale_type": "l2",
                "normalize_per_feature": true},
            "output_normalization": {
                "shift_type": "mean",
                "scale_type": "l2",
                "normalize_per_feature": true}}
    }
}
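
For illustration only, a filled-in build.conf could look as follows; every file path and value below is a placeholder and should be adapted to the problem at hand:

{
    "neurecoRNN_build":
    {
        "exc_filenames": ["./data/train_exc.csv"],
        "output_filenames": ["./data/train_out.csv"],
        "validation_exc_filenames": [],
        "validation_output_filenames": [],
        "test_exc_filenames": ["./data/test_exc.csv"],
        "test_output_filenames": ["./data/test_out.csv"],
        "write_model_to": "./models/dynamic_model",
        "write_model_output_to_directory": "./results",
        "checkpoint_address": "./models/dynamic_model.checkpoint",
        "resume": false,
        "settings": {
            "valid_percentage": 30,
            "min_hidden_state": 1,
            "max_hidden_state": 0,
            "steady_state_exc": [],
            "steady_state_out": [],
            "input_normalization": {
                "shift_type": "mean",
                "scale_type": "l2",
                "normalize_per_feature": true},
            "output_normalization": {
                "shift_type": "mean",
                "scale_type": "l2",
                "normalize_per_feature": true}}
    }
}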
The available building parameters in the configuration file are described below.
NeurEco Discrete Dynamic building parameters in .conf#

Each entry below gives the parameter name, its type and default value, and its description.

valid_percentage (float, min=1.0, max=50.0, default=33.33): percentage of the data that will be used as validation data. NeurEco automatically chooses the best data for validation, to ensure that the created model fits unseen data as well as possible. Modifying this parameter is mainly of interest when the data set is small and a good trade-off between the learning and validation sets has to be found. This parameter is ignored if validation_exc_filenames and validation_output_filenames are passed.

input_normalization: shift_type (string, Discrete Dynamic default "mean"): the method used to shift the input data. For more details, see Data normalization for Discrete Dynamic.

input_normalization: scale_type (string, Discrete Dynamic default "l2"): the method used to scale the input data. For more details, see Data normalization for Discrete Dynamic.

input_normalization: normalize_per_feature (boolean, Discrete Dynamic default True): if True, shifting and scaling are performed on each input feature separately; if False, all the features are normalized together. For example, if the data is the output of an SVD operation, the scale between the coefficients needs to be maintained, so this field should be False. On the other hand, inputs representing fields with different scales (for example a temperature varying from 260 to 300 degrees and a pressure varying from 1e5 to 1.1e5 Pascal) should not be scaled together, so this field should be True. For more details, see Data normalization for Discrete Dynamic.

output_normalization: shift_type (string, Discrete Dynamic default "mean"): the method used to shift the output data. For more details, see Data normalization for Discrete Dynamic.

output_normalization: scale_type (string, Discrete Dynamic default "l2"): the method used to scale the output data. For more details, see Data normalization for Discrete Dynamic.

output_normalization: normalize_per_feature (boolean, Discrete Dynamic default True): if True, shifting and scaling are performed on each output feature separately; if False, all the features are normalized together. For example, if the data is the output of an SVD operation, the scale between the coefficients needs to be maintained, so this field should be False. On the other hand, outputs representing fields with different scales (for example a temperature varying from 260 to 300 degrees and a pressure varying from 1e5 to 1.1e5 Pascal) should not be scaled together, so this field should be True. For more details, see Data normalization for Discrete Dynamic.

exc_filenames (list of strings, default = []): training data: the paths of all the input (excitation) data files. The format of the files can be csv, npy or mat (MATLAB files).

output_filenames (list of strings, default = []): training data: the paths of all the output data files. The format of the files can be csv, npy or mat (MATLAB files).

validation_exc_filenames (list of strings, default = [] (GUI, .conf)): validation data: the paths of all the validation input data files. The format of the files can be csv, npy or mat (MATLAB files).

validation_output_filenames (list of strings, default = [] (GUI, .conf)): validation data: the paths of all the validation output data files. The format of the files can be csv, npy or mat (MATLAB files).

test_exc_filenames (list of strings, default = []): the paths of all the testing input data files. The format of the files can be csv, npy or mat (MATLAB files).

test_output_filenames (list of strings, default = []): the paths of all the testing output data files. The format of the files can be csv, npy or mat (MATLAB files).

write_model_to (string, default = ""): the path where the model will be saved.

checkpoint_address (string, default = ""): the path where the checkpoint model will be saved. The checkpoint model is used for resuming the build of a model, or for choosing an intermediate network with fewer topological optimization steps. An example is sketched below.
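
For example, the checkpoint-related fields of build.conf could be filled as follows (the paths are hypothetical); the resume flag shown in the skeleton would then, presumably together with this checkpoint, allow an interrupted build to be restarted:

        "write_model_to": "./models/dynamic_model",
        "checkpoint_address": "./models/dynamic_model.checkpoint",
        "resume": false,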

min_hidden_state (int, default=1): starting number of hidden states, used to accelerate the identification of the best topology (see the example settings sketch after this parameter list). The hidden states are the state variables of the system: the variables that describe the mathematical state of the dynamic system, i.e. that carry enough information about the system to determine its future behavior. NeurEco identifies these variables from the excitations and outputs given for training; the user can, however, speed up the building process by giving NeurEco an idea of the minimum and maximum number of these variables. Note that setting max_hidden_state to 0 means that NeurEco has no constraint on the number of hidden states that can be used to train the model; the value given for min_hidden_state, however, should be at least 1.

max_hidden_state (int, default=0, i.e. not set): maximal number of hidden states in the model; this parameter can accelerate the identification of the best topology.

steady_state_exc (list of floats when using the terminal, numpy 1D array of floats when using the wrapper): forces the built model to be stable when fed with this input value. A system is at a steady state if all its parameters are time invariant: no matter the conditions, if the input variables keep the same values, the output of the system does not change. For NeurEco, this results in a guarantee that, however the system evolves, if the same set of steady-state excitations is given, the steady-state outputs will always be returned. Providing these states is optional; not giving them does not lead to a model of poor quality.

steady_state_out (list of floats when using the terminal, numpy 1D array of floats when using the wrapper): stable output value associated with the input value steady_state_exc.
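
As an illustration, for a system with two excitation channels and one output that is known to have between 2 and 6 state variables and a known steady state, the corresponding fields of the settings block could be filled as follows (all values are hypothetical):

        "settings": {
            "min_hidden_state": 2,
            "max_hidden_state": 6,
            "steady_state_exc": [0.0, 1.0],
            "steady_state_out": [0.0]
        }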

Data normalization for Discrete Dynamic#

Set input_normalization: normalize_per_feature (or output_normalization: normalize_per_feature) to True when fitting features of different natures (temperature and pressure, for example) that should be given equivalent importance.

Set input_normalization: normalize_per_feature (or output_normalization: normalize_per_feature) to False when fitting features of the same nature (a set of temperatures, for example) or a field.

If none of the provided normalization options suits the problem, normalize the data your own way before feeding it to NeurEco, and deactivate NeurEco's normalization by setting the scale and shift types to none, as in the sketch below.
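
For instance, when the data has already been normalized externally, the normalization blocks inside "settings" could be filled as follows (a sketch only, keeping per-feature normalization at its default):

            "input_normalization": {
                "shift_type": "none",
                "scale_type": "none",
                "normalize_per_feature": true},
            "output_normalization": {
                "shift_type": "none",
                "scale_type": "none",
                "normalize_per_feature": true}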

A normalization operation for NeurEco is a combination of a \(shift\) and a \(scale\), so that:

\[x_{normalized} = \frac{x-shift}{scale}\]

Allowed shift methods for NeurEco Discrete Dynamic and their corresponding shift values are listed below:

NeurEco Discrete Dynamic shifting methods#

none: \(shift = 0\)

mean: \(shift = mean(x)\)

Allowed scale methods for NeurEco Discrete Dynamic and their corresponding scale values are listed below:

NeurEco Discrete Dynamic scaling methods#

none: \(scale = 1\)

l2: \(scale = \frac{\left\Vert x\right\Vert}{\sqrt{size \_ of \_ x}}\)
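
As a quick worked example with the default methods (shift_type "mean", scale_type "l2") applied to a single feature with values \(x = (1, 2, 3)\):

\[shift = mean(x) = 2, \qquad scale = \frac{\left\Vert x\right\Vert}{\sqrt{3}} = \frac{\sqrt{14}}{\sqrt{3}} \approx 2.16\]

\[x_{normalized} = \frac{x - shift}{scale} \approx (-0.46,\ 0,\ 0.46)\]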