DataHandler

class mouffet.data.data_handler.DataHandler(opts)

Bases: object

A class that handles all data-related operations. It provides convenience functions but is meant to be subclassed.

Option name          Description                       Default  Type
generate_file_lists  Should file lists be regenerated  False    bool
data_by_type         Is the database split by type     False    bool

duplicate_database(database)

Searches the database list for the database whose name matches that of the database argument, then duplicates it and updates the copy with any options contained in database.

Parameters:

database (instance of DataHandler.OPTIONS_CLASS) – The database to duplicate

Returns:

The duplicated database

Return type:

mouffet.options.DatabaseOptions

load_databases()

Loads all databases defined in the ‘databases’ option of the configuration file.

Returns:

A dict whose keys are the database names and whose values are instances of DataHandler.OPTIONS_CLASS, which must be a subclass of mouffet.options.DatabaseOptions.

Return type:

dict
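The shape of the returned dict can be illustrated with a minimal, self-contained sketch. It does not use mouffet itself: DatabaseOptionsStub and index_databases_by_name are hypothetical names, and the layout of the ‘databases’ option as a list of dicts that each carry a ‘name’ key is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseOptionsStub:
    """Hypothetical stand-in for an instance of DataHandler.OPTIONS_CLASS."""

    name: str
    opts: dict = field(default_factory=dict)


def index_databases_by_name(database_entries):
    """Build a {name: options object} dict from a list of option dicts."""
    databases = {}
    for entry in database_entries:
        # Assumption: every entry carries its database name under "name".
        databases[entry["name"]] = DatabaseOptionsStub(
            name=entry["name"], opts=dict(entry)
        )
    return databases


# Assumed layout of the 'databases' option once the configuration is parsed.
config = {"databases": [{"name": "db1", "data_by_type": True}, {"name": "db2"}]}
databases = index_databases_by_name(config["databases"])
```

Keying the result by name is what lets the other methods look up a database directly from its name.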

load_datasets(db_type, databases=None, by_dataset=False, **kwargs)

Loads a dataset of type db_type. If the prepare argument is True, the dataset is also prepared. A preparation function can be provided via prepare_func; by default, the method tries to call a function named prepare_`db_type`_dataset (e.g. prepare_training_dataset) and then the generic prepare_dataset method.

Parameters:
  • db_type (_type_) – _description_

  • databases (_type_, optional) – _description_. Defaults to None.

  • by_dataset (bool, optional) – _description_. Defaults to False.

  • load_opts (_type_, optional) – _description_. Defaults to None.

  • prepare (bool, optional) – _description_. Defaults to False.

  • prepare_func (_type_, optional) – _description_. Defaults to None.

  • prepare_opts (_type_, optional) – _description_. Defaults to None.

Returns:

_description_

Return type:

_type_
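The lookup of the preparation function described for load_datasets can be sketched as follows. This is not mouffet's implementation: PrepareDispatchSketch and resolve_prepare_func are hypothetical names, and the sketch reads "and then" as a fallback order (explicit prepare_func first, then the type-specific method, then the generic prepare_dataset), which is one possible interpretation.

```python
class PrepareDispatchSketch:
    """Hypothetical class showing one way to pick a preparation function."""

    def prepare_dataset(self, dataset, opts):
        # Generic preparation, used when nothing more specific exists.
        return ("generic", dataset)

    def prepare_training_dataset(self, dataset, opts):
        # Type-specific preparation for db_type == "training".
        return ("training", dataset)

    def resolve_prepare_func(self, db_type, prepare_func=None):
        # 1. A function supplied by the caller takes precedence.
        if prepare_func is not None:
            return prepare_func
        # 2. Otherwise look for a method named prepare_<db_type>_dataset.
        specific = getattr(self, f"prepare_{db_type}_dataset", None)
        if callable(specific):
            return specific
        # 3. Fall back to the generic prepare_dataset method.
        return self.prepare_dataset


handler = PrepareDispatchSketch()
training_result = handler.resolve_prepare_func("training")("raw", {})
validation_result = handler.resolve_prepare_func("validation")("raw", {})
custom_result = handler.resolve_prepare_func(
    "training", prepare_func=lambda dataset, opts: ("custom", dataset)
)("raw", {})
```

Under this reading, a subclass only needs to define prepare_training_dataset (or a similarly named method) to customize preparation for one dataset type.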

prepare_dataset(dataset, opts)

_summary_

Parameters:
  • dataset (_type_) – _description_

  • opts (_type_) – _description_

Returns:

_description_

Return type:

_type_

update_database(new_opts=None, name='', copy=True)

Updates a database with the options contained in new_opts. If ‘name’ is not provided, this function tries to get the name of the database to update from the ‘name’ key in new_opts.

Parameters:
  • new_opts (dict, optional) – A dictionary containing the new values to apply. Defaults to None.

  • name (str, optional) – The name of the database to update. Defaults to “”.

  • copy (bool, optional) – If True, returns a copy of the original database. Defaults to True.

Raises:

AttributeError – Raised when no database ‘name’ has been found.

Returns:

An options object holding the values of the original database with the updates applied. Returns None if the database name was not found.

Return type:

DataHandler.OPTIONS_CLASS
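The name resolution and copy behavior of update_database can be approximated with plain dicts. The sketch below is an assumption-laden illustration, not the library's code: update_database_sketch is a hypothetical function, databases are modeled as dicts rather than option objects, and it assumes that AttributeError signals that no name could be determined while None signals that the name is not among the known databases.

```python
import copy as copy_module


def update_database_sketch(databases, new_opts=None, name="", copy=True):
    """Apply new_opts to the database called name (hypothetical helper)."""
    new_opts = new_opts or {}
    # Fall back to the 'name' key of new_opts when no name is given.
    name = name or new_opts.get("name", "")
    if not name:
        # Assumed reading of the documented AttributeError.
        raise AttributeError("No database name provided or found in new_opts")
    if name not in databases:
        # Assumed reading of "Returns None if the database name was not found".
        return None
    # With copy=True the stored database is left untouched.
    target = copy_module.deepcopy(databases[name]) if copy else databases[name]
    target.update(new_opts)
    return target


databases = {"db1": {"name": "db1", "data_by_type": False}}
updated = update_database_sketch(databases, {"name": "db1", "data_by_type": True})
missing = update_database_sketch(databases, {"name": "unknown"})
```

Here updated carries the new data_by_type value while databases["db1"] keeps the original one, because copy defaults to True.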