DataLoader

class mouffet.data.data_loader.DataLoader(structure)[source]

Basic class for loading raw data into the dataset. By default, the data_handler.DataHandler instance calls only load_dataset() during dataset generation (in data_handler.DataHandler.generate_datasets()). A basic implementation of load_dataset() is provided, but it calls two other methods, load_data_options() and load_file_data(), which do nothing by default and should be overridden.
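The division of work described above can be sketched with a stand-in base class. This is only an illustration of the pattern, not mouffet's actual implementation: the class names SketchLoader and AudioSketchLoader, the argument lists of load_data_options(), load_file_data() and load_dataset(), and the returned dictionaries are all assumptions made for the example.

```python
class SketchLoader:
    """Hypothetical stand-in for a DataLoader-like base class (not mouffet code)."""

    def __init__(self, structure):
        self.structure = structure

    def load_data_options(self, database):
        # Returns no options by default; meant to be overridden.
        return {}

    def load_file_data(self, file_path, opts):
        # Loads nothing by default; meant to be overridden.
        return None

    def load_dataset(self, database, file_list):
        # Basic implementation: read the options once, then load each file.
        opts = self.load_data_options(database)
        return [self.load_file_data(path, opts) for path in file_list]


class AudioSketchLoader(SketchLoader):
    """Hypothetical subclass that overrides only the two hook methods."""

    def load_data_options(self, database):
        # Assumed option name, with an arbitrary fallback value.
        return {"sample_rate": database.get("sample_rate", 22050)}

    def load_file_data(self, file_path, opts):
        # A real loader would read the file here; this one records metadata only.
        return {"file": file_path, "sample_rate": opts["sample_rate"]}


loader = AudioSketchLoader(structure=None)
results = loader.load_dataset({"sample_rate": 16000}, ["a.wav", "b.wav"])
```

The subclass never touches load_dataset(); it only supplies the two hooks, which is the usage the class description suggests.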

finalize_dataset(missing)[source]

Callback invoked after data generation has finished but before the dataset is saved, in case further processing is needed once all files are loaded (e.g. dataframe concatenation).
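A minimal sketch of where such a callback could sit, using plain lists in place of dataframes so that it needs no third-party packages. The class ChunkedSketchLoader, its generate() method, the "rows" key and the layout of tmp_db_data are hypothetical and chosen for illustration; they are not taken from mouffet.

```python
from itertools import chain


class ChunkedSketchLoader:
    """Hypothetical loader that collects one chunk of rows per file."""

    def __init__(self):
        # Assumed layout: a list holding one list of rows per loaded file.
        self.tmp_db_data = {"rows": []}
        self.saved = None

    def load_file_data(self, file_path):
        # Stand-in for real loading: two fake rows per file.
        self.tmp_db_data["rows"].append([(file_path, i) for i in range(2)])

    def finalize_dataset(self, missing=None):
        # Runs once after every file is loaded: flatten the per-file chunks.
        chunks = self.tmp_db_data["rows"]
        self.tmp_db_data["rows"] = list(chain.from_iterable(chunks))

    def generate(self, file_list):
        for path in file_list:
            self.load_file_data(path)
        self.finalize_dataset()          # after loading, before saving
        self.saved = self.tmp_db_data    # stand-in for writing to disk
        return self.saved


dataset = ChunkedSketchLoader().generate(["a", "b"])
```

With dataframes, the body of finalize_dataset() would be the single concatenation step instead of the list flattening shown here.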

generate_dataset(database, paths, file_list, db_type, missing=None, overwrite=False)[source]

[summary]

Parameters:
  • database ([type]) – [description]

  • paths ([type]) – [description]

  • file_list ([type]) – [description]

  • db_type ([type]) – [description]

  • overwrite ([type]) – [description]

load_file_data(*args, **kwargs)[source]

Load data for the file at file_path. This usually includes loading the raw data and the tags associated with the file. This method should then fill the tmp_db_data attribute to save the intermediate results.

Parameters:
  • file_path ([type]) – [description]

  • tags_dir ([type]) – [description]

  • opts ([type]) – [description]
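One possible shape for an override, shown as a hedged sketch rather than the library's own code. The class FileTagSketchLoader, the "raw" and "tags" keys of tmp_db_data, the "encoding" option and the convention that tags live in a ".tags" text file named after the data file are all assumptions made for this example.

```python
from pathlib import Path


class FileTagSketchLoader:
    """Hypothetical loader; not part of mouffet."""

    def __init__(self):
        # Assumed layout for the intermediate results.
        self.tmp_db_data = {"raw": [], "tags": []}

    def load_file_data(self, file_path, tags_dir, opts):
        file_path = Path(file_path)
        encoding = opts.get("encoding", "utf-8")  # assumed option name

        # Load the raw data for this file.
        raw = file_path.read_text(encoding=encoding)

        # Assumed convention: whitespace-separated tags in <stem>.tags.
        tags_file = Path(tags_dir) / (file_path.stem + ".tags")
        if tags_file.exists():
            tags = tags_file.read_text(encoding=encoding).split()
        else:
            tags = []

        # Store the intermediate results for later steps.
        self.tmp_db_data["raw"].append(raw)
        self.tmp_db_data["tags"].append(tags)
```

A file without a matching tag file is kept with an empty tag list here; whether such files should instead be skipped or reported is a design choice left to the subclass.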