GCS Downloader

A GCSDownloader is an object which manages the connection to an HTTPS-enabled collection and downloads single files over HTTPS.

It primarily features two APIs:

  1. Initialization and use as a context manager

  2. GCSDownloader.read_file() to get a single file by URL

class globus_sdk.experimental.gcs_downloader.GCSDownloader(app, *, https_client=None, transfer_client=None, transport=None)[source]

An object which manages connection and authentication state to enable HTTPS downloads from a specific Globus Connect Server.

The first request to read a file determines the authentication requirements dynamically; subsequent requests reuse that authentication data.

Using a single GCSDownloader to access distinct collections is not supported. A separate downloader should be used for each collection.
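
Because one downloader serves one collection, a batch of URLs can be partitioned before any downloader is created. The following is a minimal sketch, assuming each collection is reachable under its own HTTPS hostname. group_urls_by_host is a hypothetical helper, not part of the SDK, and the example.org URLs are placeholders.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_urls_by_host(urls):
    """Group file URLs by hostname (hypothetical helper, not an SDK function).

    Assumption: each collection is served from its own HTTPS hostname.
    """
    groups = defaultdict(list)
    for url in urls:
        host = urlsplit(url).hostname
        if host is None:
            raise ValueError(f"not an absolute URL: {url!r}")
        groups[host].append(url)
    return dict(groups)


# Placeholder URLs, used only to show the grouping.
urls = [
    "https://collection-a.example.org/data/a.txt",
    "https://collection-b.example.org/data/b.txt",
    "https://collection-a.example.org/data/c.txt",
]
grouped = group_urls_by_host(urls)
```

Each group can then be read inside its own "with GCSDownloader(app) as downloader:" block.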

Downloaders may be used as context managers, in which case they automatically call their close() method on exit:

>>> with GCSDownloader(app) as downloader:
...     print(downloader.read_file(url))

Parameters:
  • app (globus_sdk.GlobusApp) – The GlobusApp used to authenticate calls to this server.

  • https_client (GCSCollectionHTTPSClient | HTTPSClientConstructor | None) – The underlying client used for the file read request. Typically omitted. When not provided, one will be constructed on demand by the downloader. As an alternative to providing a client, a callable factory may be passed here, which will be given the collection_client_id, default_scope_requirements, and base_url and must return a new client.

  • transfer_client (globus_sdk.TransferClient | None) – A client used when detecting collection information. Typically omitted. When not provided, one will be constructed on demand by the downloader.

  • transport (globus_sdk.transport.RequestsTransport | None) – A transport for the downloader, used for authentication sniffing operations. When a client is built by the downloader it will inherit this transport.

close()[source]

Close all resources which are owned by this downloader.
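
When a with block is inconvenient, close() can be called explicitly. The sketch below shows the usual try/finally pattern. FakeDownloader is a hypothetical stand-in so that the example runs offline; in real code the object would be a GCSDownloader built from a GlobusApp.

```python
class FakeDownloader:
    """Offline stand-in for GCSDownloader (hypothetical, for illustration)."""

    def __init__(self):
        self.closed = False

    def read_file(self, file_uri):
        return f"contents of {file_uri}"

    def close(self):
        self.closed = True


downloader = FakeDownloader()  # in real code: GCSDownloader(app)
try:
    text = downloader.read_file("https://collection.example.org/data/a.txt")
finally:
    # Release owned resources even if read_file() raised.
    downloader.close()
```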

read_file(file_uri: str, *, as_text: Literal[True]) → str[source]
read_file(file_uri: str, *, as_text: Literal[False]) → bytes
read_file(file_uri: str) → str

Given a file URI on a GCS Collection, read the data.

Parameters:
  • file_uri – The full URI of the file on the collection which is being downloaded.

  • as_text – When True, the file contents are decoded into a string. Set to False to retrieve data as bytes.

Caution

The file read is done naively as a GET request. This may be unsuitable for very large files.
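
As an illustration of as_text=False, the sketch below fetches a file as bytes and writes it to disk. save_file and FakeDownloader are hypothetical names introduced here, not SDK APIs. The fake only mimics the documented read_file() signature so that the example runs without network access, and its UTF-8 decoding is an assumption made for the stand-in.

```python
import tempfile
from pathlib import Path


def save_file(downloader, file_uri, dest):
    """Fetch file_uri as bytes and write it to dest (hypothetical helper).

    The whole payload is held in memory before it is written, so the
    caution about very large files applies here as well.
    """
    data = downloader.read_file(file_uri, as_text=False)
    Path(dest).write_bytes(data)
    return len(data)


class FakeDownloader:
    """Offline stand-in mimicking the documented read_file() signature."""

    def read_file(self, file_uri, *, as_text=True):
        payload = b"hello, collection\n"
        # Assumption for this stand-in only: text is decoded as UTF-8.
        return payload.decode("utf-8") if as_text else payload


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "file.txt"
    written = save_file(
        FakeDownloader(), "https://collection.example.org/file.txt", target
    )
    saved = target.read_bytes()
```

With a real downloader, the same helper would be called inside the with block that owns the GCSDownloader.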

class globus_sdk.experimental.gcs_downloader.HTTPSClientConstructor(*args, **kwargs)[source]

A protocol which defines the factory type used to customize a GCSDownloader.

class globus_sdk.experimental.gcs_downloader.GCSCollectionHTTPSClient(collection_client_id, default_scope_requirements=(), *, environment=None, base_url=None, app=None, app_scopes=None, authorizer=None, app_name=None, transport=None, retry_config=None)[source]

A dedicated client type for an HTTPS-capable Collection used for file downloads.

Users should generally not instantiate this class directly, but instead rely on GCSDownloader to properly initialize these clients.

Parameters:
  • collection_client_id (str) – The ID of the collection.

  • default_scope_requirements (t.Iterable[globus_sdk.Scope]) – The scopes needed for HTTPS access to the collection. This should contain the https scope for the collection and the data_access scope if applicable.

  • app (globus_sdk.GlobusApp | None) – A GlobusApp which will be used for handling authorization and storing and validating tokens. Passing an app will automatically include a client’s default scopes in the app’s scope requirements unless specific app_scopes are given. If app_name is not given, the app’s app_name will be used. Mutually exclusive with authorizer.

  • app_scopes (list[globus_sdk.scopes.Scope] | None) – Optional list of Scope objects to be added to app’s scope requirements instead of default_scope_requirements. Requires app.

  • authorizer (GlobusAuthorizer | None) – A GlobusAuthorizer which will generate Authorization headers. Mutually exclusive with app.

  • app_name (str | None) – Optional “nice name” for the application. Has no bearing on the semantics of client actions. It is just passed as part of the User-Agent string, and may be useful when debugging issues with the Globus Team. If both app and app_name are given, this value takes priority.

  • base_url (str | None) – The URL for the service. Most client types initialize this value intelligently by default. Set it when inheriting from BaseClient or communicating through a proxy. This value takes precedence over the class attribute of the same name.

  • transport (globus_sdk.transport.RequestsTransport | None) – A RequestsTransport object for sending and retrying requests. By default, one will be constructed by the client.

  • retry_config (globus_sdk.transport.RetryConfig | None) – A RetryConfig object with parameters to control request retry behavior. By default, one will be constructed by the client.

property default_scope_requirements: list[Scope]

Scopes that will automatically be added to this client’s app’s scope_requirements during _finalize_app.

For clients with static scope requirements this can just be a static value. Clients with dynamic requirements should use @property and must return sane results while the Base Client is being initialized.

Example Usage

import globus_sdk
from globus_sdk.experimental.gcs_downloader import GCSDownloader

# SDK Tutorial Client ID - <replace this with your own client>
CLIENT_ID = "61338d24-54d5-408f-a10d-66c06b59f6d2"

# this example is a path on the Globus Tutorial Collections
FILE_URL = (
    "https://m-d3a2c3.collection1.tutorials.globus.org/home/share/godata/file2.txt"
)

with globus_sdk.UserApp("gcs-downloader-demo", client_id=CLIENT_ID) as app:
    with GCSDownloader(app) as downloader:
        print(downloader.read_file(FILE_URL))