fairly.client package
Submodules
fairly.client.djehuty module
- class fairly.client.djehuty.DjehutyClient(repository_id: str = None, **kwargs)[source]
Bases:
Client
4TU.ResearchData uses custom_fields to store the following information:
Contributors
Data Link
Derived From
Format
Geolocation Latitude
Geolocation Longitude
Geolocation
Language
Licence remarks
Organizations
Publisher
Same As
Time coverage
From the developers:
> The use of “Data Link” is inconsistent, so try to avoid using it. In djehuty, we will assign “Data Link” values as “file links” where applicable (so they show up under “files”). I think also “Organizations” and “Publisher” are not entirely consistent and will have gone through manual cleanup once djehuty goes live.
- LOCKED_SLEEP = 5
- LOCKED_TRIES = 5
- PAGE_SIZE = 25
- REGEXP_UUID = re.compile('([a-f\\d]+)(-[a-f\\d]+)+', re.IGNORECASE)
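The REGEXP_UUID pattern is looser than a strict UUID check: it accepts any run of hexadecimal groups joined by at least one hyphen. The sketch below shows that behaviour using only the standard library. The helper name `looks_like_uuid` is hypothetical and not part of fairly, and the use of `fullmatch` (rather than `match` or `search`) is an assumption about how the pattern is applied.

```python
import re

# Pattern copied from DjehutyClient.REGEXP_UUID as listed above.
REGEXP_UUID = re.compile('([a-f\\d]+)(-[a-f\\d]+)+', re.IGNORECASE)


def looks_like_uuid(value: str) -> bool:
    """Return True if the whole string matches REGEXP_UUID.

    Hypothetical helper for illustration; not part of the fairly API.
    """
    return REGEXP_UUID.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("123e4567-e89b-12d3-a456-426614174000", "12345", "ab-cd"):
        print(candidate, looks_like_uuid(candidate))
```

Because short strings such as `ab-cd` also match, callers that need strict UUID validation may prefer `uuid.UUID(value)` from the standard library.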
- property categories: Dict
- classmethod get_config_parameters() Dict [source]
Returns configuration parameters.
- Returns:
Dictionary of configuration parameters. Keys are the parameter names, values are the descriptions.
- get_details(id: Dict) Dict [source]
Returns standard details of the specified dataset.
- Details dictionary:
title (str): Title
url (str): URL address
doi (str): DOI
status (str): Status
size (int): Total size of data files in bytes
created (datetime.datetime): Creation date and time
modified (datetime.datetime): Last modification date and time
- Possible statuses are as follows:
“draft”: Dataset is not published yet.
“public”: Dataset is published and is publicly available.
“embargoed”: Dataset is published, but is under embargo.
“restricted”: Dataset is published, but accessible only under certain conditions.
“closed”: Dataset is published, but accessible only by the owners.
“unknown”: Dataset is in an unknown state.
- Parameters:
id (Dict) – Standard dataset id
- Returns:
Details dictionary of the dataset.
- get_files(id: Dict) List[RemoteFile] [source]
- property licenses: Dict
Retrieves list of available licenses
- License dictionary:
id (int): License identifier
name (str): Name of the license
url (str): URL address of the license
- Returns:
List of license dictionaries
- record_type_lookup = {'conference contribution': 'conferencepaper', 'journal contribution': 'article', 'media': 'video'}
- record_types = {'book': 'Book', 'conference contribution': 'Conference Contribution', 'dataset': 'Dataset', 'figure': 'Figure', 'journal contribution': 'Journal Contribution', 'media': 'Media', 'online resource': 'Online Resource', 'poster': 'Poster', 'preprint': 'Preprint', 'presentation': 'Presentation', 'software': 'Software', 'thesis': 'Thesis'}
fairly.client.invenio module
- class fairly.client.invenio.InvenioClient(repository_id: str = None, **kwargs)[source]
Bases:
Client
- Class Attributes:
PAGE_SIZE (int): Default page size.
KEEP_ALIVE (int): Keep-alive duration in seconds.
- _details
Record details cache.
- Type:
Dict
- KEEP_ALIVE = 10
- PAGE_SIZE = 100
- classmethod get_client(url: str) Client [source]
Creates a repository client from the specified URL address.
- Parameters:
url (str) – URL address of the repository or dataset.
- Returns:
Client object (InvenioClient).
- Raises:
ValueError("Invalid repository") – If repository is not valid.
- classmethod get_config_parameters() Dict [source]
Returns configuration parameters.
- Returns:
Dictionary of configuration parameters. Keys are the parameter names, values are the descriptions.
- get_details(id: Dict) Dict [source]
Returns standard details of the specified dataset.
- Details dictionary:
title (str): Title
url (str): URL address
doi (str): DOI
status (str): Status
size (int): Total size of data files in bytes (optional)
created (datetime.datetime): Creation date and time
modified (datetime.datetime): Last modification date and time
- Possible statuses are as follows:
“draft”: Dataset is not published yet.
“public”: Dataset is published and is publicly available.
“embargoed”: Dataset is published, but is under embargo.
“restricted”: Dataset is published, but accessible only under certain conditions.
“closed”: Dataset is published, but accessible only by the owners.
“error”: Dataset is in an error state.
“unknown”: Dataset is in an unknown state.
- Parameters:
id (Dict) – Standard dataset id.
- Returns:
Details dictionary of the dataset.
- get_files(id: Dict) List[RemoteFile] [source]
Retrieves list of files of the specified dataset.
- Parameters:
id (Dict) – Standard dataset identifier.
- Returns:
List of dataset files (RemoteFile).
- Raises:
ValueError("Operation not permitted") – If files are restricted.
ValueError("Invalid dataset id") – If invalid dataset identifier.
Module contents
- class fairly.client.Client(repository_id: str = None, **kwargs)[source]
Bases:
ABC
- config
Configuration options
- Type:
Dict
- _session
HTTP session object
- Type:
Session
- _datasets
Public dataset cache
- Type:
Dict
- _account_datasets
Account dataset cache
- Type:
List
- Class Attributes:
REGEXP_URL: Regular expression to validate URL address.
REQUEST_FORMAT: Request data format.
CHUNK_SIZE: Chunk size in bytes to transfer data (default = 262144).
- CHUNK_SIZE = 262144
- REGEXP_URL = re.compile('(http(s)?):\\/\\/(www\\.)?[a-z\\d@:%._\\+~#=-]{2,256}\\.[a-z]{2,6}\\b([-a-z\\d@:%_\\+.~#?&//=]*)', re.IGNORECASE)
- REQUEST_FORMAT = 'json'
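As a rough illustration of what REGEXP_URL accepts, the sketch below applies the pattern to a few strings. The helper name `is_valid_url` is hypothetical, the example URLs are placeholders, and matching the whole string with `fullmatch` is an assumption about how the client uses the pattern.

```python
import re

# Pattern copied from Client.REGEXP_URL as listed above.
REGEXP_URL = re.compile(
    '(http(s)?):\\/\\/(www\\.)?[a-z\\d@:%._\\+~#=-]{2,256}\\.[a-z]{2,6}\\b'
    '([-a-z\\d@:%_\\+.~#?&//=]*)',
    re.IGNORECASE,
)


def is_valid_url(url: str) -> bool:
    """Return True if the whole string matches REGEXP_URL.

    Hypothetical helper for illustration; not part of the fairly API.
    """
    return REGEXP_URL.fullmatch(url) is not None


if __name__ == "__main__":
    for candidate in (
        "https://example.org/record/123",
        "http://www.example.com",
        "ftp://example.org/file",
        "not a url",
    ):
        print(candidate, is_valid_url(candidate))
```

Only `http` and `https` schemes pass, and the host must end in a top-level domain of two to six letters.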
- property client_id: str
Client identifier
- create_dataset(metadata=None) RemoteDataset [source]
Creates a dataset with the specified metadata
- Parameters:
metadata – Metadata of the dataset (optional)
- Returns:
Dataset
- Raises:
ValueError("Invalid metadata") –
ValueError("Invalid metadata", validation_result) –
- delete_dataset(id, **kwargs) None [source]
Deletes dataset from the repository.
- Parameters:
id – Dataset identifier.
**kwargs – Other identifier arguments.
- delete_file(dataset, file) None [source]
- download_file(file: RemoteFile, path: str = None, name: str = None, notify: Callable = None) LocalFile [source]
Downloads a remote file.
- Parameters:
file (RemoteFile) – Remote file.
path (str) – Local path of the file (optional).
name (str) – Local name of the file (optional).
notify (Callable) – Notification callback method (optional).
- Returns:
Local file object.
- Raises:
ValueError("No URL address") – If remote file has no URL address.
IOError("Invalid MD5 checksum") – If MD5 checksum is invalid.
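The combination of a chunk size, a notification callback, and an MD5 check suggests a chunked transfer loop. The following is a minimal sketch of that general technique, not fairly's actual implementation. The function `copy_with_md5` and its callback signature (total bytes copied so far) are assumptions made for illustration.

```python
import hashlib
import io
from typing import BinaryIO, Callable, Optional

CHUNK_SIZE = 262144  # Same value as Client.CHUNK_SIZE listed above


def copy_with_md5(
    source: BinaryIO,
    target: BinaryIO,
    expected_md5: Optional[str] = None,
    notify: Optional[Callable[[int], None]] = None,
    chunk_size: int = CHUNK_SIZE,
) -> int:
    """Copy source to target in chunks and verify the MD5 digest.

    Hypothetical helper: the notify signature (bytes copied so far) is an
    assumption, not the callback contract used by fairly.
    """
    md5 = hashlib.md5()
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break  # End of stream
        target.write(chunk)
        md5.update(chunk)  # Hash incrementally, so the file is read only once
        total += len(chunk)
        if notify is not None:
            notify(total)
    if expected_md5 is not None and md5.hexdigest() != expected_md5.lower():
        raise IOError("Invalid MD5 checksum")
    return total


if __name__ == "__main__":
    payload = b"example" * 100000
    buffer = io.BytesIO()
    copied = copy_with_md5(io.BytesIO(payload), buffer, hashlib.md5(payload).hexdigest())
    print(copied, buffer.getvalue() == payload)
```

Hashing each chunk as it is written avoids a second pass over the file and keeps memory use bounded by the chunk size.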
- get_account_datasets(refresh: bool = False) List[RemoteDataset] [source]
- classmethod get_config(**kwargs) Dict [source]
- classmethod get_config_parameters() Dict [source]
Returns configuration parameters.
- Returns:
Dictionary of configuration parameters. Keys are the parameter names, values are the descriptions.
- get_dataset(id=None, refresh: bool = False, **kwargs) RemoteDataset [source]
- get_dataset_id(id=None, **kwargs) Dict [source]
Returns standard dataset identifier
- Parameters:
id – Dataset identifier
**kwargs – Other identifier arguments
- Returns:
Standard dataset identifier
- get_dataset_plain_id(id: Dict) str [source]
Returns plain standard dataset identifier.
- Parameters:
id (Dict) – Standard dataset identifier.
- Returns:
Plain standard dataset identifier.
- abstract get_details(id: Dict) Dict [source]
Returns standard details of the specified dataset.
- Details dictionary:
title (str): Title
url (str): URL address
doi (str): DOI
status (str): Status
size (int): Total size of data files in bytes
created (datetime.datetime): Creation date and time
modified (datetime.datetime): Last modification date and time
- Possible statuses are as follows:
“draft”: Dataset is not published yet.
“public”: Dataset is published and is publicly available.
“embargoed”: Dataset is published, but is under embargo.
“restricted”: Dataset is published, but accessible only under certain conditions.
“closed”: Dataset is published, but accessible only by the owners.
“error”: Dataset is in an error state.
“unknown”: Dataset is in an unknown state.
- Parameters:
id (Dict) – Standard dataset id
- Returns:
Details dictionary of the dataset.
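To make the shape of the details dictionary concrete, the sketch below builds a hand-written sample with the documented keys and classifies it by status. All values in `sample` are made up, and the helper `is_published` is hypothetical. It relies only on the status descriptions above, where "public", "embargoed", "restricted", and "closed" are all described as published.

```python
import datetime
from typing import Dict

# Statuses whose descriptions above begin with "Dataset is published".
PUBLISHED_STATUSES = {"public", "embargoed", "restricted", "closed"}


def is_published(details: Dict) -> bool:
    """Return True if the details dictionary describes a published dataset.

    Hypothetical helper for illustration; not part of the fairly API.
    A missing status is treated as "unknown".
    """
    return details.get("status", "unknown") in PUBLISHED_STATUSES


# Hand-written sample with the documented keys; every value is made up.
sample = {
    "title": "Example dataset",
    "url": "https://example.org/datasets/1",
    "doi": "10.1234/example",
    "status": "embargoed",
    "size": 2048,
    "created": datetime.datetime(2023, 1, 1, 12, 0),
    "modified": datetime.datetime(2023, 1, 2, 12, 0),
}

if __name__ == "__main__":
    print(sample["title"], is_published(sample))
```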
- abstract get_files(id: Dict) List[RemoteFile] [source]
- get_metadata(id: Dict) Metadata [source]
Returns standard metadata of the specified dataset
- Parameters:
id (Dict) – Standard dataset id
- Returns:
Standard metadata
- get_versions(id, refresh: bool = False, **kwargs) List[RemoteDataset] [source]
Returns datasets of all available versions of the specified dataset
- Parameters:
id – Dataset identifier
refresh (bool) – Set True to refresh versions (default = False)
- Returns:
List of datasets of all available versions
- classmethod normalize_value(name: str, val) Any [source]
Normalizes a metadata attribute value.
- Parameters:
name (str) – Attribute name
val – Attribute value
- Returns:
Normalized attribute value
- classmethod parse_id(id: str)[source]
Parses the specified identifier.
- Parameters:
id – Identifier.
- Returns:
Tuple of identifier type and value.
- property repository_id: str
Repository identifier of the client
- save_config(save_environment=False) None [source]
Saves client configuration.
- Parameters:
save_environment – Set True to save environment variables (default False)
- abstract save_metadata(id: Dict, metadata: Metadata) None [source]
Saves metadata of the specified dataset.
- Parameters:
id (Dict) – Standard dataset id.
metadata (Metadata) – Metadata to be saved.
- abstract classmethod supports_folder() bool [source]
Returns whether folders are supported.
- upload_file(dataset, file, notify: Callable = None) RemoteFile [source]