Utilities

Hmile provides additional utilities to support development. They are described below.

class Hmile.utils.DataTensorer(dataprovider: DataProvider, nb_env, nb_data_per_session, device, mean_window_size=300, window_size=-1)

Transforms data into PyTorch tensors.

get_indicators()

Returns the unnormalized and normalized data.

Returns

(unnormalized data : torch.Tensor, normalized data : torch.Tensor)

Return type

tuple

get_max_indices() → list

Returns the maximum for each feature.

Returns

result

Return type

list

get_min_indices() → list

Returns the minimum for each feature.

Returns

result

Return type

list

normalize(data)

Applies rolling mean and variance normalization. The result is clipped to the range [-10, 10].

Parameters

data (torch.Tensor) – tensor to normalize
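The idea behind rolling normalization with clipping can be illustrated with a minimal sketch. It is not Hmile's implementation: it assumes a trailing window, a population variance, and a small epsilon for numerical stability, and it works on plain Python lists to stay dependency-free, whereas normalize itself takes a tensor. The function name rolling_normalize and its parameters are hypothetical.

```python
import math


def rolling_normalize(values, window=300, clip=10.0, eps=1e-8):
    """Normalize each value by the mean/std of its trailing window, then clip."""
    out = []
    for t, x in enumerate(values):
        # Trailing window ending at (and including) the current step.
        chunk = values[max(0, t - window + 1):t + 1]
        mean = sum(chunk) / len(chunk)
        var = sum((v - mean) ** 2 for v in chunk) / len(chunk)
        z = (x - mean) / math.sqrt(var + eps)
        # Clip extreme values so that outliers stay bounded.
        out.append(max(-clip, min(clip, z)))
    return out


# A flat series followed by a large spike: the spike is clipped to the bound.
series = [0.0] * 200 + [1000.0]
normalized = rolling_normalize(series, window=300)
```

Clipping keeps a single outlier from dominating downstream inputs, at the cost of discarding how far beyond the bound the value was.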

reset()

Reset the current step for all indices

reset_by_id(indices)

Reset the current step for the given indices

Parameters

indices (list) – list of indices to reset

class Hmile.utils.AE(norm: dict, column_names: list, input_shape: int, nb_neurones: list, normalize_output: bool = False)

Creates an autoencoder. The autoencoder exposes several useful attributes:

  • the encoder (can be used alone by self.encoder.forward(…))

  • the decoder (can be used alone by self.decoder.forward(…))

  • the norm of the data used (for normalization) with self.norm (a dict with the mean/std for each pair)

  • the name of the columns of the dataset with self.column_names

forward(features)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
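As an illustration of an autoencoder that exposes separate encoder and decoder submodules, here is a minimal PyTorch sketch. TinyAE is a hypothetical stand-in, not Hmile's AE: the layer layout, the ReLU activations, and the assumption that the last entry of nb_neurones is the bottleneck width are all guesses made for the example.

```python
import torch
from torch import nn


def _mlp(sizes):
    """Stack Linear layers with ReLU between them (no activation at the end)."""
    layers = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        layers.append(nn.Linear(n_in, n_out))
        if i < len(sizes) - 2:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)


class TinyAE(nn.Module):
    """Hypothetical autoencoder: the decoder mirrors the encoder."""

    def __init__(self, input_shape, nb_neurones):
        super().__init__()
        sizes = [input_shape, *nb_neurones]
        self.encoder = _mlp(sizes)
        self.decoder = _mlp(sizes[::-1])

    def forward(self, features):
        return self.decoder(self.encoder(features))


model = TinyAE(input_shape=8, nb_neurones=[6, 3])
batch = torch.randn(4, 8)

codes = model.encoder(batch)      # encoder used alone
reconstruction = model(batch)     # call the module, not model.forward(...)
```

Calling model(batch) rather than model.forward(batch) goes through nn.Module.__call__, which is what runs any registered hooks.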

Hmile.utils.trainAE(pairs: dict, nb_out_components: int = 60, is_normalized: bool = False, display: bool = True, test_percent: float = 0.1, architecture: list = [200, 150, 100], normalize_output: bool = False, nb_epoch: int = 200, lr: float = 0.0001, batch_size: int = 128, test_batch_size: int = 32) → AE

Trains an autoencoder.

Parameters
  • pairs (dict) – dict of pair dataframes on which the autoencoder will be trained.

  • nb_out_components (int, optional) – final number of features. Defaults to 60.

  • is_normalized (bool, optional) – whether the dataset is already normalized or still needs to be. Defaults to False.

  • display (bool, optional) – display results of the training. Defaults to True.

  • test_percent (float, optional) – part of the dataset kept for test. Defaults to 0.1.

  • architecture (list, optional) – architecture of the encoder (the decoder is symmetric). Defaults to [200, 150, 100].

  • nb_epoch (int, optional) – number of training epochs. Defaults to 200.

  • lr (float, optional) – learning rate. Defaults to 1e-4.

  • batch_size (int, optional) – batch size of train dataset. Defaults to 128.

  • test_batch_size (int, optional) – batch size of test dataset. Defaults to 32.

Returns

The full autoencoder, including the mean and std of the dataset used and the column names.

Return type

AE
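The architecture parameter lists the encoder's hidden layer sizes, and the decoder is described as symmetric. A minimal sketch of the layer widths this implies, assuming the input width comes first and nb_out_components is the bottleneck; the helper name and the input width of 500 are hypothetical and not part of Hmile:

```python
def autoencoder_layer_sizes(input_shape, architecture, nb_out_components):
    """Return (encoder widths, decoder widths) for a symmetric autoencoder."""
    encoder = [input_shape, *architecture, nb_out_components]
    decoder = encoder[::-1]  # the decoder mirrors the encoder
    return encoder, decoder


# Default architecture and bottleneck size, with an assumed input width of 500.
encoder_sizes, decoder_sizes = autoencoder_layer_sizes(500, [200, 150, 100], 60)
print(encoder_sizes)  # [500, 200, 150, 100, 60]
print(decoder_sizes)  # [60, 100, 150, 200, 500]
```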