GCS Downloader
A GCSDownloader is an object which handles connections to an
HTTPS-enabled collection and single file downloads over HTTPS.
It primarily features two APIs:
- Initialization and use as a context manager
- GCSDownloader.read_file() to get a single file by URL
- class globus_sdk.experimental.gcs_downloader.GCSDownloader(app, *, https_client=None, transfer_client=None, transport=None)
An object which manages connection and authentication state to enable HTTPS downloads from a specific Globus Connect Server.
The initial request to read a file features support for determining authentication requirements dynamically, and subsequent requests will reuse that authentication data.
Using a single GCSDownloader to access distinct collections is not supported. A separate downloader should be used for each collection.
Downloaders may be used as context managers, in which case they automatically call their close() method on exit:
>>> with GCSDownloader(app) as downloader:
...     print(downloader.read_file(url))
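Because one downloader is tied to one collection, a script that reads from several collections has to route each URL to its own downloader. The sketch below shows one possible way to do that. It assumes that each collection is served from its own network location (host and port), which this page does not state. `group_urls_by_host` and `read_all` are hypothetical helper names, and `make_downloader` stands for any zero-argument callable that returns a fresh downloader, such as `lambda: GCSDownloader(app)`.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_urls_by_host(urls):
    """Group file URLs by network location, keeping input order per group.

    Assumption: each collection is served from a distinct host (and port).
    """
    groups = defaultdict(list)
    for url in urls:
        groups[urlsplit(url).netloc].append(url)
    return dict(groups)


def read_all(urls, make_downloader):
    """Read every URL, using one downloader per network location.

    `make_downloader` is a zero-argument callable returning a new
    downloader that supports the context manager protocol.
    """
    contents = {}
    for host_urls in group_urls_by_host(urls).values():
        # Leaving the `with` block triggers the downloader's close().
        with make_downloader() as downloader:
            for url in host_urls:
                contents[url] = downloader.read_file(url)
    return contents
```

Each group gets its own `with` block, so authentication state discovered for one collection is never reused against another.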
- Parameters:
app (globus_sdk.GlobusApp) – The GlobusApp used to authenticate calls to this server.
https_client (GCSCollectionHTTPSClient | HTTPSClientConstructor | None) – The underlying client used for the file read request. Typically omitted. When not provided, one will be constructed on demand by the downloader. As an alternative to providing a client, a callable factory may be passed here, which will be given the collection_client_id, default_scope_requirements, and base_url and must return a new client.
transfer_client (globus_sdk.TransferClient | None) – A client used when detecting collection information. Typically omitted. When not provided, one will be constructed on demand by the downloader.
transport (globus_sdk.transport.RequestsTransport | None) – A transport for the downloader, used for authentication sniffing operations. When a client is built by the downloader it will inherit this transport.
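To make the factory form of `https_client` more concrete, here is a minimal sketch of a callable that could be passed in place of a client. It accepts the three values named above and forwards them to a client class. This page does not say whether the downloader passes those values by keyword or by position, so the parameters are declared to accept either. `make_client_factory` is a hypothetical name, `client_class` is expected to be `GCSCollectionHTTPSClient` (or a subclass), and the `app_name` default is only an illustrative value.

```python
def make_client_factory(app, client_class, app_name="my-downloader"):
    """Build a factory suitable for the `https_client` parameter.

    `app` is captured in a closure because the factory itself only
    receives the collection ID, the scopes, and the base URL.
    """

    def factory(collection_client_id, default_scope_requirements, base_url):
        # Forward the downloader-supplied values and add our own settings.
        return client_class(
            collection_client_id,
            default_scope_requirements=default_scope_requirements,
            base_url=base_url,
            app=app,
            app_name=app_name,
        )

    return factory
```

Under these assumptions, it would be used as `GCSDownloader(app, https_client=make_client_factory(app, GCSCollectionHTTPSClient))`.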
- read_file(file_uri: str, *, as_text: Literal[True]) -> str
- read_file(file_uri: str, *, as_text: Literal[False]) -> bytes
- read_file(file_uri: str) -> str
Given a file URI on a GCS Collection, read the data.
- Parameters:
file_uri – The full URI of the file on the collection which is being downloaded.
as_text – When True, the file contents are decoded into a string. Set to False to retrieve data as bytes.
Caution
The file read is done naively as a GET request. This may be unsuitable for very large files.
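When the file is not text, `as_text=False` returns the raw bytes, which can then be written to disk or decoded with an explicit encoding. The helper below is a small sketch of that pattern. `save_file` is a hypothetical name, and `downloader` is any object that provides the `read_file` method described above. The whole body is still held in memory, so this does not address the caution about very large files.

```python
from pathlib import Path


def save_file(downloader, file_uri, destination):
    """Read a file as bytes, write it to `destination`, return the byte count."""
    # as_text=False skips decoding, so binary content is preserved as-is.
    data = downloader.read_file(file_uri, as_text=False)
    Path(destination).write_bytes(data)
    return len(data)
```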
- class globus_sdk.experimental.gcs_downloader.HTTPSClientConstructor(*args, **kwargs)
A protocol which defines the factory type used to customize a GCSDownloader.
- class globus_sdk.experimental.gcs_downloader.GCSCollectionHTTPSClient(collection_client_id, default_scope_requirements=(), *, environment=None, base_url=None, app=None, app_scopes=None, authorizer=None, app_name=None, transport=None, retry_config=None)
A dedicated client type for an HTTPS-capable Collection used for file downloads.
Users should generally not instantiate this class directly, but instead rely on GCSDownloader to properly initialize these clients.
- Parameters:
collection_client_id (str) – The ID of the collection.
default_scope_requirements (t.Iterable[globus_sdk.Scope]) – The scopes needed for HTTPS access to the collection. This should contain the https scope for the collection and the data_access scope if applicable.
app (globus_sdk.GlobusApp | None) – A GlobusApp which will be used for handling authorization and storing and validating tokens. Passing an app will automatically include a client’s default scopes in the app’s scope requirements unless specific app_scopes are given. If app_name is not given, the app’s app_name will be used. Mutually exclusive with authorizer.
app_scopes (list[globus_sdk.scopes.Scope] | None) – Optional list of Scope objects to be added to app’s scope requirements instead of default_scope_requirements. Requires app.
authorizer (GlobusAuthorizer | None) – A GlobusAuthorizer which will generate Authorization headers. Mutually exclusive with app.
app_name (str | None) – Optional “nice name” for the application. Has no bearing on the semantics of client actions. It is just passed as part of the User-Agent string, and may be useful when debugging issues with the Globus Team. If both app and app_name are given, this value takes priority.
base_url (str | None) – The URL for the service. Most client types initialize this value intelligently by default. Set it when inheriting from BaseClient or communicating through a proxy. This value takes precedence over the class attribute of the same name.
transport (globus_sdk.transport.RequestsTransport | None) – A RequestsTransport object for sending and retrying requests. By default, one will be constructed by the client.
retry_config (globus_sdk.transport.RetryConfig | None) – A RetryConfig object with parameters to control request retry behavior. By default, one will be constructed by the client.
- property default_scope_requirements: list[Scope]
Scopes that will automatically be added to this client’s app’s scope_requirements during _finalize_app.
For clients with static scope requirements this can just be a static value. Clients with dynamic requirements should use @property and must return sane results while the Base Client is being initialized.
Example Usage
import globus_sdk
from globus_sdk.experimental.gcs_downloader import GCSDownloader
# SDK Tutorial Client ID - <replace this with your own client>
CLIENT_ID = "61338d24-54d5-408f-a10d-66c06b59f6d2"
# this example is a path on the Globus Tutorial Collections
FILE_URL = (
    "https://m-d3a2c3.collection1.tutorials.globus.org/home/share/godata/file2.txt"
)
with globus_sdk.UserApp("gcs-downloader-demo", client_id=CLIENT_ID) as app:
    with GCSDownloader(app) as downloader:
        print(downloader.read_file(FILE_URL))