DataHandler

class mouffet.data.data_handler.DataHandler(opts)

Handles all data-related operations. This class provides convenience functions but is intended to be subclassed.

Option name          Description                       Default  Type
-------------------  --------------------------------  -------  ----
generate_file_lists  Should file lists be regenerated  False    bool
data_by_type         Is the database split by type     False    bool

duplicate_database(database)

Duplicates the provided database

Parameters

database (instance of DataHandler.OPTIONS_CLASS) – The database to duplicate

Returns

The duplicated database

Return type

mouffet.options.DatabaseOptions

load_databases()

Loads all databases defined in the ‘databases’ option of the configuration file.

Returns

A dict whose keys are the database names and whose values are instances of DataHandler.OPTIONS_CLASS, which must be a subclass of mouffet.options.DatabaseOptions

Return type

dict

load_datasets(db_type, databases=None, by_dataset=False, load_opts=None, prepare=False, prepare_func=None, prepare_opts=None)

Loads a dataset of type db_type. If the prepare argument is True, the dataset is also prepared. A preparation function can be supplied via prepare_func; by default, the handler tries to call a method named prepare_`db_type`_dataset (e.g. prepare_training_dataset) and then the generic prepare_dataset method.

Parameters
  • db_type (_type_) – The type of dataset to load (e.g. training).

  • databases (_type_, optional) – _description_. Defaults to None.

  • by_dataset (bool, optional) – _description_. Defaults to False.

  • load_opts (_type_, optional) – _description_. Defaults to None.

  • prepare (bool, optional) – If True, also prepares the dataset. Defaults to False.

  • prepare_func (callable, optional) – A user-provided preparation function. Defaults to None.

  • prepare_opts (_type_, optional) – _description_. Defaults to None.

Returns

_description_

Return type

_type_
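The name-based lookup described for load_datasets can be illustrated with a minimal sketch. SketchHandler and resolve_prepare_func are hypothetical stand-ins, not part of mouffet, and the sketch assumes that the generic prepare_dataset method acts as a fallback when no type-specific method exists; the actual library may behave differently.

```python
class SketchHandler:
    """Hypothetical stand-in, not the real mouffet DataHandler."""

    def prepare_training_dataset(self, dataset, opts):
        # Type-specific hook, found by name when db_type == "training".
        return {**dataset, "prepared_by": "prepare_training_dataset"}

    def prepare_dataset(self, dataset, opts):
        # Generic hook, assumed here to be the fallback.
        return {**dataset, "prepared_by": "prepare_dataset"}

    def resolve_prepare_func(self, db_type, prepare_func=None):
        # An explicitly provided function takes precedence.
        if prepare_func is not None:
            return prepare_func
        # Otherwise look for a method named prepare_<db_type>_dataset.
        specific = getattr(self, f"prepare_{db_type}_dataset", None)
        if callable(specific):
            return specific
        return self.prepare_dataset


handler = SketchHandler()
dataset = {"files": ["a.wav", "b.wav"]}

# "training" has a dedicated method; "validation" does not.
training = handler.resolve_prepare_func("training")(dataset, None)
validation = handler.resolve_prepare_func("validation")(dataset, None)
```

In this sketch, a subclass only needs to define a method with the matching name for a given db_type to have it picked up, which is one plausible reason the class is meant to be subclassed.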

update_database(new_opts=None, name='', copy=True)

Updates a database with the options contained in new_opts. If ‘name’ is not provided, this function tries to get the name of the database to update from the ‘name’ key in new_opts.

Parameters
  • new_opts (dict, optional) – A dictionary containing the new values to update. Defaults to None.

  • name (str, optional) – The name of the database to update. Defaults to “”.

  • copy (bool, optional) – If True, returns a copy of the original database. Defaults to True.

Raises

AttributeError – Raised when no database ‘name’ has been found.

Returns

An options object containing the values of the original database with the updates applied. Returns None if the database name was not found.

Return type

DataHandler.OPTIONS_CLASS
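The name resolution and copy behaviour of update_database can be sketched as follows. This is a simplified illustration using plain dicts instead of option objects; update_database_sketch is a hypothetical function, not mouffet's implementation. It assumes that AttributeError is raised when no name can be determined at all, and that None is returned when the name does not match any known database.

```python
import copy as copy_module


def update_database_sketch(databases, new_opts=None, name="", copy=True):
    """Hypothetical stand-in using plain dicts instead of option objects."""
    new_opts = new_opts or {}
    # Fall back to the 'name' key of new_opts when no name is given.
    name = name or new_opts.get("name", "")
    if not name:
        # Assumption: no name anywhere is treated as an error.
        raise AttributeError("No database name provided or found in new_opts")
    if name not in databases:
        # Assumption: an unknown name yields None.
        return None
    # With copy=True, the stored database is left untouched.
    target = copy_module.deepcopy(databases[name]) if copy else databases[name]
    target.update(new_opts)
    return target


databases = {"birds": {"name": "birds", "data_by_type": False}}
updated = update_database_sketch(databases, {"name": "birds", "data_by_type": True})
```

Here updated carries the new data_by_type value while databases["birds"] keeps the original one, because copy defaults to True.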