Utilities
The following utilities are available to support development.
- class Hmile.utils.DataTensorer(dataprovider: DataProvider, nb_env, nb_data_per_session, device, mean_window_size=300, window_size=-1)
Transforms data into PyTorch tensors.
- get_indicators()
Returns the unnormalized and normalized data.
- Returns
(unnormalized data : torch.Tensor, normalized data : torch.Tensor)
- Return type
tuple
- get_max_indices() → list
Returns the max for each feature
- Returns
result
- Return type
list
- get_min_indices() → list
Returns the min for each feature
- Returns
result
- Return type
list
- normalize(data)
Applies rolling mean and variance normalization. The result is clipped to the range [-10, 10].
- Parameters
data (torch.Tensor) – tensor to normalize
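The normalization described above can be illustrated with a minimal sketch. This is not Hmile's implementation: the function name `rolling_normalize`, the `(time, features)` layout, the trailing-window definition, and the `eps` term are all assumptions made for illustration.

```python
import torch


def rolling_normalize(data: torch.Tensor, window: int = 300, eps: float = 1e-8) -> torch.Tensor:
    """Hypothetical helper: normalize each time step with the mean and
    standard deviation of a trailing window, then clip to [-10, 10].

    Assumes `data` has shape (time, features).
    """
    data = data.float()
    out = torch.empty_like(data)
    for t in range(data.shape[0]):
        # Trailing window ending at step t (shorter at the start of the series).
        start = max(0, t - window + 1)
        chunk = data[start:t + 1]
        mean = chunk.mean(dim=0)
        std = chunk.var(dim=0, unbiased=False).sqrt()
        # eps avoids a division by zero when the window is constant.
        out[t] = (data[t] - mean) / (std + eps)
    # Clip extreme values, as the documented method does.
    return out.clamp(-10.0, 10.0)


# A flat series with one spike: the spike's z-score exceeds 10 and is clipped.
series = torch.zeros(200, 1)
series[-1] = 1.0
normalized = rolling_normalize(series, window=300)
```

Clipping keeps isolated outliers from dominating downstream inputs; the window length here mirrors the `mean_window_size=300` default of the constructor, though whether the library uses it in exactly this way is an assumption.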
- reset()
Reset the current step for all indices
- reset_by_id(indices)
Reset the current step for the given indices
- Parameters
indices (list) – list of indices to reset
- class Hmile.utils.AE(norm: dict, column_names: list, input_shape: int, nb_neurones: list, normalize_output: bool = False)
Creates an autoencoder. The autoencoder exposes several useful attributes:
the encoder, usable on its own via self.encoder.forward(…)
the decoder, usable on its own via self.decoder.forward(…)
the normalization statistics of the data, in self.norm (a dict with the mean/std for each pair)
the column names of the dataset, in self.column_names
- forward(features)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
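The note above is general PyTorch behavior and can be checked with a small sketch. The `TinyEncoder` class below is a hypothetical stand-in, not part of Hmile; it only shows that a registered forward hook fires when the module instance is called, but not when `forward` is called directly.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in module used only to demonstrate hooks."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, features):
        return self.linear(features)


model = TinyEncoder()
calls = []

# Record the output shape every time the forward hook runs.
handle = model.register_forward_hook(
    lambda module, inputs, output: calls.append(tuple(output.shape))
)

x = torch.zeros(3, 4)
model(x)          # goes through __call__, so the hook runs
model.forward(x)  # bypasses __call__, so the hook is skipped
handle.remove()
```

After both calls, `calls` holds a single entry, which is why calling the instance is the recommended form.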
- Hmile.utils.trainAE(pairs: dict, nb_out_components: int = 60, is_normalized: bool = False, display: bool = True, test_percent: float = 0.1, architecture: list = [200, 150, 100], normalize_output: bool = False, nb_epoch: int = 200, lr: float = 0.0001, batch_size: int = 128, test_batch_size: int = 32) → AE
Trains an autoencoder.
- Parameters
pairs (dict) – dict of pairs dataframe on which the autoencoder will be trained.
nb_out_components (int, optional) – final number of features. Defaults to 60.
is_normalized (bool, optional) – whether the dataset is already normalized or still needs to be. Defaults to False.
display (bool, optional) – display results of the training. Defaults to True.
test_percent (float, optional) – part of the dataset kept for test. Defaults to 0.1.
architecture (list, optional) – architecture of the encoder (the decoder is symmetric). Defaults to [200,150,100].
nb_epoch (int, optional) – nb epoch for learning. Defaults to 200.
lr (float, optional) – learning rate. Defaults to 1e-4.
batch_size (int, optional) – batch size of train dataset. Defaults to 128.
test_batch_size (int, optional) – batch size of test dataset. Defaults to 32.
- Returns
The full autoencoder, with the mean and std of the dataset used and the names of the columns.
- Return type
AE
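To make the `architecture` and `nb_out_components` parameters concrete, here is a minimal sketch of a symmetric encoder/decoder pair. It is not the code behind `trainAE`: the helpers `build_mlp` and `build_symmetric_autoencoder`, the ReLU activations, and the input width of 500 are assumptions chosen for illustration.

```python
import torch
from torch import nn


def build_mlp(sizes):
    """Hypothetical helper: stack Linear layers, with ReLU between them."""
    layers = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        layers.append(nn.Linear(n_in, n_out))
        if i < len(sizes) - 2:  # no activation after the last layer
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)


def build_symmetric_autoencoder(input_shape, architecture, nb_out_components):
    """Hypothetical helper: the decoder mirrors the encoder's layer sizes."""
    encoder_sizes = [input_shape, *architecture, nb_out_components]
    encoder = build_mlp(encoder_sizes)
    decoder = build_mlp(encoder_sizes[::-1])
    return encoder, decoder


# Default-like settings: 500 input features compressed to 60 components.
encoder, decoder = build_symmetric_autoencoder(500, [200, 150, 100], 60)
batch = torch.randn(8, 500)
latent = encoder(batch)
reconstruction = decoder(latent)
```

With these settings the encoder maps 500 → 200 → 150 → 100 → 60 and the decoder reverses the path, so `latent` has 60 features per row and `reconstruction` has the original 500.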