fairly package

Subpackages

Submodules

fairly.diff module

Diff class module.

Diff class is used to keep track of dataset modifications.

Usage example:

>>> diff = Diff()
>>> diff.modify("name", "Johnny", "John")
>>> diff.modified
{'name': ('Johnny', 'John')}
class fairly.diff.Diff

Bases: object

_added

Items added

Type:

Dict

_modified

Items modified

Type:

Dict

_removed

Items removed

Type:

Dict

add(key, val) → None

Appends an item to the diff set as added.

Parameters:
  • key – Item key

  • val – Item value

property added: Dict

Returns a dictionary of added items.

property modified: Dict

Returns a dictionary of modified items.

modify(key, val, oldval) → None

Appends an item to the diff set as modified.

Parameters:
  • key – Item key

  • val – Item value

  • oldval – Old value of the item

remove(key, val) → None

Appends an item to the diff set as removed.

Parameters:
  • key – Item key

  • val – Item value

property removed: Dict

Returns a dictionary of removed items.
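The bookkeeping described above can be illustrated with a small stand-in class. This is a minimal sketch, not the fairly implementation: DiffSketch and compare() are hypothetical names, and storing modified items as (val, oldval) tuples is an assumption taken from the usage example at the top of this module.

```python
from typing import Any, Dict, Tuple


class DiffSketch:
    """Hypothetical stand-in for Diff; not the fairly implementation."""

    def __init__(self) -> None:
        self._added: Dict[str, Any] = {}
        self._modified: Dict[str, Tuple[Any, Any]] = {}
        self._removed: Dict[str, Any] = {}

    def add(self, key: str, val: Any) -> None:
        self._added[key] = val

    def modify(self, key: str, val: Any, oldval: Any) -> None:
        # Assumption: stored as (val, oldval), as in the usage example.
        self._modified[key] = (val, oldval)

    def remove(self, key: str, val: Any) -> None:
        self._removed[key] = val

    @property
    def added(self) -> Dict[str, Any]:
        return self._added

    @property
    def modified(self) -> Dict[str, Tuple[Any, Any]]:
        return self._modified

    @property
    def removed(self) -> Dict[str, Any]:
        return self._removed


def compare(old: Dict[str, Any], new: Dict[str, Any]) -> DiffSketch:
    """Hypothetical helper: records the changes needed to turn old into new."""
    diff = DiffSketch()
    for key, val in new.items():
        if key not in old:
            diff.add(key, val)
        elif old[key] != val:
            diff.modify(key, val, old[key])
    for key, val in old.items():
        if key not in new:
            diff.remove(key, val)
    return diff


diff = compare(
    {"name": "John", "year": 2020},
    {"name": "Johnny", "doi": "10.1234/abcd"},
)
print(diff.added, diff.modified, diff.removed)
```
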

fairly.metadata module

Metadata class module.

Metadata class is used to store metadata attributes in a standardized manner.

Usage example:

>>> metadata = Metadata({"title": "Title", "DOI": "doi:xxx"})
>>> metadata["authors"] = ["Doe, John"]
class fairly.metadata.Metadata(normalize: Callable = None, serialize: Callable = None, **kwargs)

Bases: MutableMapping

Metadata class.

_attrs

Metadata attributes.

Type:

Dict

_basis

Basis of metadata attributes.

Type:

Dict

_normalize

Attribute normalization method.

Type:

Callable

_serialize

Attribute serialization method.

Type:

Callable

Class Attributes:

REGEXP_DOI: Regular expression to validate DOI.

REGEXP_DOI = re.compile('10\\.\\d{4,9}/[-._;()/:a-z\\d]+', re.IGNORECASE)
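To show what the pattern accepts, the sketch below copies REGEXP_DOI as listed above and applies it with the standard re module. The helpers is_doi() and find_doi() are hypothetical. Whether the library matches the whole string or only part of it is not stated here, so both variants are shown.

```python
import re
from typing import Optional

# Pattern copied from REGEXP_DOI above.
REGEXP_DOI = re.compile(r"10\.\d{4,9}/[-._;()/:a-z\d]+", re.IGNORECASE)


def is_doi(value: str) -> bool:
    """Hypothetical helper: True if the whole string has the DOI format."""
    return REGEXP_DOI.fullmatch(value) is not None


def find_doi(text: str) -> Optional[str]:
    """Hypothetical helper: first DOI-like substring of text, if any."""
    match = REGEXP_DOI.search(text)
    return match.group(0) if match else None


print(is_doi("10.5281/zenodo.6026285"))
print(find_doi("https://doi.org/10.5281/zenodo.6026285"))
```
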
autocomplete(overwrite: bool = False, attrs: List = None, **kwargs) → Dict

Completes missing metadata attributes by using the available information.

Supported attributes:
  • Any attribute with a data type of Person.

  • Any attribute with a data type of PersonList.

Parameters:
  • overwrite (bool) – Set True to overwrite existing attributes (default False).

  • attrs (List) – List of attributes to be completed (optional).

  • **kwargs – Arguments for the specific autocomplete methods.

Returns:

A dictionary of attributes set by method.

property is_modified: bool

Checks if metadata is modified.

Returns:

True if metadata is modified, False otherwise.

classmethod normalize_value(key: str, val) → Any

Normalizes metadata attribute value.

Supported attributes:
  • doi

  • keywords

  • authors

Parameters:
  • key (str) – Attribute key.

  • val – Attribute value.

Returns:

Normalized attribute value.

Raises:

ValueError – If invalid attribute value.

print() → None

Pretty prints metadata.

Serializes metadata and prints as YAML without comments.

rebase() → None

Updates the basis of the metadata attributes.

serialize() → Dict

Serializes metadata as a dictionary.

Returns:

Metadata dictionary.

classmethod serialize_value(key: str, val) → Any

Serializes metadata attribute value.

Supported attributes:
  • Any attribute with a data type of Person.

  • Any attribute with a data type of PersonList.

Parameters:
  • key (str) – Attribute key.

  • val – Attribute value.

Returns:

Serialized attribute value.

fairly.person module

Person class module.

Person class is used to store person (e.g. author) information in a standardized manner.

Usage example:

>>> person = Person("Doe, John")
>>> person = Person(fullname="Doe, Jon", orcid_id="xxx")
>>> person.affiliation = "fairly Community"
class fairly.person.Person(person: str = None, **kwargs)

Bases: MutableMapping

Class to handle person information, e.g. for authors, contributors, etc.

Class Attributes:

REGEXP_ORCID_ID: Regular expression to validate ORCID identifier.

REGEXP_EMAIL: Regular expression to validate e-mail address.

REGEXP_EMAIL = re.compile('[\\w\\.+-]+@([\\w-]+\\.)+[\\w-]{2,}')
REGEXP_ORCID_ID = re.compile('(\\d{4}-){3}\\d{3}(\\d|X)')
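The two patterns above can be tried out with the standard re module. This is a sketch only: is_orcid_id() and is_email() are hypothetical helpers, and using fullmatch is an assumption about how strictly the library applies the patterns. Both patterns check the format only; the ORCID pattern does not verify the checksum character.

```python
import re

# Patterns copied from the class attributes above.
REGEXP_EMAIL = re.compile(r"[\w\.+-]+@([\w-]+\.)+[\w-]{2,}")
REGEXP_ORCID_ID = re.compile(r"(\d{4}-){3}\d{3}(\d|X)")


def is_orcid_id(value: str) -> bool:
    """Hypothetical helper: True if value has the ORCID identifier format."""
    return REGEXP_ORCID_ID.fullmatch(value) is not None


def is_email(value: str) -> bool:
    """Hypothetical helper: True if value has the e-mail address format."""
    return REGEXP_EMAIL.fullmatch(value) is not None


print(is_orcid_id("0000-0002-1825-0097"))
print(is_email("john.doe@example.org"))
```
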
autocomplete(overwrite: bool = False, orcid_token: str = None) → Dict

Completes missing information by using the ORCID identifier.

Parameters:

overwrite – If True, existing attributes are overwritten.

Returns:

A dictionary of attributes set by method.

static from_orcid_id(orcid_id: str, token: str = None) → Person

Retrieves person information from ORCID identifier.

If not specified, token is read from fairly configuration. If it is also not available, it is retrieved by using get_orcid_token() method.

Parameters:
  • orcid_id – ORCID identifier.

  • token – ORCID access token.

Returns:

Person object if valid ORCID identifier, None otherwise.

Raises:
  • ValueError("No access token") – If access token is not available.

  • ValueError("Invalid ORCID identifier") – If ORCID identifier is not valid.

static get_orcid_token(client_id: str = None, client_secret: str = None) → str

Retrieves ORCID access token by using ORCID client id and secret.

ORCID access token is required to retrieve person information by using an ORCID ID.

If not specified, client_id and client_secret are read from fairly configuration.

Parameters:
  • client_id – ORCID client id.

  • client_secret – ORCID client secret.

Returns:

ORCID access token.

Raises:
  • ValueError("No client id") – If client id is not available.

  • ValueError("No client secret") – If client secret is not available.

  • ValueError("Invalid response") – If access token is not retrieved.

static get_persons(people) → List[Person]

Returns standard person list from the people argument.

A string or an iterable is accepted as input. If the input is a string, it is split using semicolon and line feed as separators. Each item of the iterable is handled as follows:

  • If it is a Person object, a copy is created.

  • If it is a string, it is parsed to a dictionary using parse().

  • If it is a dictionary, a Person object is created.

Parameters:

people – People argument.

Returns:

List of person objects.

Raises:

ValueError – If people argument is invalid.
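The string-splitting step can be sketched as follows. split_people() is a hypothetical helper, not part of fairly. Trimming whitespace and dropping empty entries are assumptions that the description above does not confirm.

```python
import re
from typing import List


def split_people(people: str) -> List[str]:
    """Hypothetical helper: split on semicolons and line feeds."""
    parts = re.split(r"[;\n]", people)
    # Assumption: surrounding whitespace is trimmed and empty parts dropped.
    return [part.strip() for part in parts if part.strip()]


print(split_people("Doe, John; Roe, Jane\nPoe, Ann"))
```

Each resulting string would then go through the per-item handling listed above.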

classmethod parse(person: str) → Dict

Parses person identifier and extracts available person attributes.

The following attributes might be extracted:
  • name

  • surname

  • fullname

  • orcid_id

Parameters:

person – Person identifier (e.g. fullname)

Returns:

Dictionary of person attributes.

serialize() → Dict

Serializes person as a dictionary.

Returns:

Person dictionary.

class fairly.person.PersonList(iterable=None)

Bases: list

append(item)

Append object to the end of the list.

extend(other)

Extend list by appending elements from the iterable.

insert(index, item)

Insert object before index.

Module contents

fairly

fairly.client(id: str, **kwargs) → Client

Creates client object from a client or repository identifier.

The identifier is first checked against the recognized repository identifiers. If no match is found, it is regarded as a client identifier. Additional client arguments (e.g. API URL address) might be necessary for the latter.

Parameters:
  • id (str) – Client or repository identifier.

  • **kwargs – Other client arguments.

Returns:

Client object.

Raises:

ValueError("Invalid client id") – If invalid client id.
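The lookup order described above (repository identifiers first, then client identifiers) can be sketched with plain dictionaries. Everything here is illustrative: REPOSITORIES, CLIENT_IDS, and resolve_client() are hypothetical, the table contents are sample data, and letting explicit keyword arguments override stored repository settings is an assumption.

```python
from typing import Any, Dict, Tuple

# Illustrative sample data; the real tables come from the fairly configuration.
REPOSITORIES: Dict[str, Dict[str, Any]] = {
    "4tu": {"client_id": "figshare", "url": "https://data.4tu.nl/"},
}
CLIENT_IDS = {"figshare", "zenodo"}


def resolve_client(id: str, **kwargs: Any) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: returns (client id, client arguments)."""
    if id in REPOSITORIES:
        repository = dict(REPOSITORIES[id])
        client_id = repository.pop("client_id")
        # Assumption: explicit arguments override stored repository settings.
        return client_id, {**repository, **kwargs}
    if id in CLIENT_IDS:
        return id, dict(kwargs)
    raise ValueError("Invalid client id")


print(resolve_client("4tu"))
print(resolve_client("figshare", url="https://data.4tu.nl/"))
```

In this sketch, a repository identifier carries its own URL, while a bare client identifier needs the URL passed explicitly.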

Examples

>>> # Create a 4TU.ResearchData client (id = "4tu")
>>> client = fairly.client("4tu")
>>> # Create a Figshare client with a custom URL address
>>> client = fairly.client("figshare", url="https://data.4tu.nl/")
fairly.dataset(id: str) → Dataset

Creates dataset object from a dataset identifier.

The following types of dataset identifiers are supported:
  • DOI : Digital object identifier of a remote dataset.

  • URL : URL address of a remote dataset.

  • Path : Path of a local dataset.

Repository of the dataset is automatically detected by checking the URL addresses and the DOI prefixes of the recognized repositories.
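One way to tell the three identifier types apart is sketched below. classify_identifier() is a hypothetical helper and the order of the checks is an assumption; the DOI pattern is copied from Metadata.REGEXP_DOI. Repository detection itself (matching URL addresses and DOI prefixes) is not covered.

```python
import re
from urllib.parse import urlparse

# Pattern copied from Metadata.REGEXP_DOI.
REGEXP_DOI = re.compile(r"10\.\d{4,9}/[-._;()/:a-z\d]+", re.IGNORECASE)


def classify_identifier(id: str) -> str:
    """Hypothetical helper: returns "url", "doi", or "path"."""
    if urlparse(id).scheme in ("http", "https"):
        return "url"
    if REGEXP_DOI.fullmatch(id):
        return "doi"
    # Assumption: anything else is treated as a local path.
    return "path"


for identifier in (
    "10.5281/zenodo.6026285",
    "https://zenodo.org/records/6026285",
    "./my-dataset",
):
    print(identifier, "->", classify_identifier(identifier))
```
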

Parameters:

id (str) – Dataset identifier.

Returns:

Dataset object.

Raises:

ValueError("Unknown dataset identifier") – If unknown dataset identifier.

Examples

>>> dataset = fairly.dataset("10.5281/zenodo.6026285")
>>> dataset = fairly.dataset("https://zenodo.org/records/6026285")
fairly.debug(state: bool = True) → None
fairly.get_clients() → Dict

Returns available clients.

Returns:

Dictionary of the available clients. Keys are client identifiers (str), values are client classes (Client).

Raises:

AttributeError("Invalid client module", id) – If a client module is invalid.

Examples

>>> fairly.get_clients()
{'figshare': <class 'fairly.client.figshare.FigshareClient'>, ...}
fairly.get_config(key: str) → Dict

Returns configuration parameters for the specified key.

Configuration parameters are read from the following sources:

  1. Configuration file of the package located at {package_root}/data/config.json

  2. Configuration file of the user located at ~/.fairly/config.json.

  3. Environmental variables of the user starting with FAIRLY_{KEY}_.
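A layered configuration like this is commonly built by merging the sources in order. The sketch below assumes that later sources override earlier ones, which the list above suggests but does not state. merge_config() and the sample values are hypothetical.

```python
from typing import Dict, Mapping


def merge_config(*sources: Mapping[str, str]) -> Dict[str, str]:
    """Hypothetical helper: merge sources; later ones take precedence."""
    config: Dict[str, str] = {}
    for source in sources:
        config.update(source)
    return config


# Sample values standing in for the three sources listed above.
package_config = {"orcid_client_id": "package-id", "orcid_client_secret": "package-secret"}
user_config = {"orcid_client_id": "user-id"}
environment_config = {"orcid_client_secret": "env-secret"}

config = merge_config(package_config, user_config, environment_config)
print(config)
```
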

Parameters:

key (str) – Configuration key.

Returns:

Dictionary of configuration parameters for the specified key.

Examples

>>> fairly.get_config("fairly")
{'orcid_client_id': 'id', 'orcid_client_secret': 'secret', ...}
fairly.get_environment_config(key: str) → Dict

Returns configuration parameters for the specified key from environmental variables.

Parameters:

key (str) – Configuration key.

Returns:

Dictionary of configuration parameters for the specified key.
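Collecting prefixed environment variables could look like the sketch below. It rests on two assumptions that are not confirmed here: the key is upper-cased to build the FAIRLY_{KEY}_ prefix, and the rest of the variable name is lower-cased to form the parameter name. environment_config() and the variable names in fake_environ are hypothetical. A plain mapping is passed in instead of os.environ so that the example has no side effects.

```python
from typing import Dict, Mapping


def environment_config(key: str, environ: Mapping[str, str]) -> Dict[str, str]:
    """Hypothetical helper: parameters from FAIRLY_{KEY}_* variables."""
    # Assumption: the key is upper-cased in the variable prefix.
    prefix = f"FAIRLY_{key.upper()}_"
    return {
        # Assumption: the remainder is lower-cased to form the parameter name.
        name[len(prefix):].lower(): val
        for name, val in environ.items()
        if name.startswith(prefix)
    }


fake_environ = {
    "FAIRLY_FAIRLY_ORCID_CLIENT_ID": "id",
    "FAIRLY_ZENODO_TOKEN": "xxx",
    "HOME": "/home/user",
}
print(environment_config("fairly", fake_environ))
```

With the real environment, os.environ would be passed as the second argument.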

Examples

>>> fairly.get_environment_config("fairly")
{'orcid_client_id': 'id', ...}
fairly.get_repositories() → Dict

Returns recognized repositories.

Returns:

Dictionary of the recognized repositories. Keys are repository identifiers (str), values are repository dictionaries (Dict).

Raises:
  • ValueError – If configuration is invalid.

  • AttributeError – If a repository has no client id.

  • AttributeError – If a repository has invalid client id.

Examples

>>> fairly.get_repositories()
{'4tu': {'client_id': 'figshare', 'name': '4TU.ResearchData', 'url': 'https://data.4tu.nl/', ...}, ...}
fairly.get_repository(uid: str) → Dict

Returns repository dictionary of the specified repository.

Parameters:

uid (str) – Repository id or URL address.

Returns:

Repository dictionary if a recognized repository, None otherwise.

Examples

>>> fairly.get_repository("4tu")
{'id': '4tu', 'client_id': 'figshare', 'name': '4TU.ResearchData', 'url': 'https://data.4tu.nl/', ...}
>>> fairly.get_repository("5tu")  # Unrecognized repository, returns None
fairly.init_dataset(path: str, template: str = 'default', create: bool = True) → LocalDataset

Initializes a local dataset.

Parameters:
  • path (str) – Local path of the dataset.

  • template (str) – Template of the dataset (default = 'default').

  • create (bool) – Set True to create the dataset directory if it does not exist (default = True).

Returns:

Local dataset object

Raises:
  • ValueError("Invalid path") – If path is invalid.

  • NotADirectoryError – If path is not a directory path.

  • ValueError("Operation not permitted") – If path is an existing dataset path.

  • ValueError("Invalid template name") – If template name is invalid.

fairly.is_testing() → bool

Returns unit testing state.

Returns:

True if performing unit tests, False otherwise

fairly.max_workers() → int

Returns maximum number of workers for file operations.

fairly.metadata_templates() → List

Returns list of available metadata templates.

Returns:

List of available metadata templates (str).

Examples

>>> fairly.metadata_templates()
['default', 'zenodo', 'figshare']
fairly.notify(file: File, current_size: int, total_size: int = None, current_total_size: int = None) → None

Displays file transfer information.

Parameters:
  • file (File) – File object.

  • current_size (int) – Current size of the file.

  • total_size (int) – Total size of the file (optional).

  • current_total_size (int) – Current total size of the transfer operation (optional).

fairly.resolveDOI(doi: str) → str

Returns URL address to a DOI.

Parameters:

doi (str) – Digital object identifier

Returns:

URL address of the DOI.

Raises:

ValueError("Invalid DOI") – If DOI is invalid.

fairly.set_max_workers(num: int = None, force: bool = False) → int

Sets number of maximum workers for file operations.

Maximum number of workers is limited to MAX_WORKERS, unless force flag is set.

Parameters:
  • num (int) – Maximum number of workers for file operations.

  • force (bool) – Set True to increase the number beyond MAX_WORKERS (default False).

Returns:

Maximum number of workers for file operations.

Raises:

ValueError("Invalid maximum number of workers") – If the number is more than the number of available cores.

fairly.store(id: str, path: str = None, notify: Callable = None, extract: bool = False) → LocalDataset

Stores a remote dataset locally.

Parameters:
  • id (str) – Dataset identifier.

  • path (str) – Local path to store the dataset (optional).

  • notify (Callable) – Notification callback function.

  • extract (bool) – Set True to extract dataset archives (default = False)

Returns:

Local dataset object