NubladoSpawner
- class rubin.nublado.spawner.NubladoSpawner(*args, **kwargs)
Bases: Spawner
Spawner class that sends requests to the RSP lab controller.
Rather than having JupyterHub spawn labs directly and therefore need Kubernetes permissions to manage every resource that a user’s lab environment may need, the Rubin Science Platform manages all labs in a separate privileged lab controller process. JupyterHub makes RESTful HTTP requests to that service using either its own credentials or the credentials of the user.
See SQR-066 for the full design.
Notes
This class uses a single process-global shared httpx.AsyncClient to make all of its HTTP requests, rather than using one per instantiation of the spawner class. Each user gets their own spawner, so this approach allows all requests to share a connection pool.
This client is created on first use and never shut down. To be strictly correct, it should be closed properly when the JupyterHub process is exiting, but we haven’t yet figured out how to hook into the appropriate part of the JupyterHub lifecycle to do that.
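The create-on-first-use, share-everywhere pattern described in these notes can be sketched roughly as follows. This is a minimal illustration, not the actual NubladoSpawner code: FakeAsyncClient is a hypothetical stand-in for httpx.AsyncClient so that the sketch runs without third-party packages, and SharedClientSpawner and shared_client are invented names.

```python
import asyncio
from typing import Optional


class FakeAsyncClient:
    """Hypothetical stand-in for httpx.AsyncClient (keeps the sketch stdlib-only)."""

    async def get(self, url: str) -> str:
        return f"GET {url}"


class SharedClientSpawner:
    """Hypothetical spawner-like class; not the real NubladoSpawner."""

    # Stored on the class, so every spawner instance in the process shares it.
    _client: Optional[FakeAsyncClient] = None

    @classmethod
    def shared_client(cls) -> FakeAsyncClient:
        # Created on first use and, as the notes above say, never closed.
        if SharedClientSpawner._client is None:
            SharedClientSpawner._client = FakeAsyncClient()
        return SharedClientSpawner._client


# Each user gets their own spawner, but both see the same client object.
first = SharedClientSpawner()
second = SharedClientSpawner()
same_client = first.shared_client() is second.shared_client()
response = asyncio.run(first.shared_client().get("status"))
```

With a real HTTP client, sharing one object this way is what lets all spawners reuse a single connection pool.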
Attributes Summary
admin_token_path: Path to the Gafaelfawr token for JupyterHub itself.
controller_url: Base URL for the Nublado lab controller.
Methods Summary
get_url(): Determine the URL of a running lab.
options_form(spawner): Retrieve the options form for this user from the lab controller.
poll(): Check if the pod is running.
progress(): Monitor the progress of a spawn.
start(): Start the user's pod.
stop(): Delete any running pod for the user.
Attributes Documentation
- admin_token_path
Path to the Gafaelfawr token for JupyterHub itself.
This token is used to authenticate to the lab controller routes that JupyterHub is allowed to call directly, such as getting lab status and deleting a lab.
- controller_url
Base URL for the Nublado lab controller.
All URLs for talking to the Nublado lab controller will be constructed relative to this base URL.
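As a rough illustration of building URLs relative to a base URL, the sketch below joins a route onto a base while normalizing slashes. The helper name controller_route, the base URL, and the route labs/someuser are all hypothetical; the real routes are defined by the lab controller.

```python
def controller_route(base_url: str, route: str) -> str:
    """Join a route onto the base URL without doubling or dropping slashes."""
    return base_url.rstrip("/") + "/" + route.lstrip("/")


# Hypothetical values, for illustration only.
base = "https://rsp.example.org/nublado"
status_url = controller_route(base, "labs/someuser")
same_with_slashes = controller_route(base + "/", "/labs/someuser")
```

Plain string joining is used here rather than urllib.parse.urljoin because urljoin replaces the last path segment of a base URL that has no trailing slash, which would silently drop the nublado component in this example.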
Methods Documentation
- async get_url()
Determine the URL of a running lab.
- Returns:
URL of the lab if we can retrieve it from the lab controller, otherwise the saved URL in the spawner object.
- Return type:
Notes
JupyterHub recommends implementing this if the spawner has some independent way to retrieve the lab URL, since it allows JupyterHub to recover if it was killed in the middle of spawning a lab and that spawn finished successfully while JupyterHub was down. This method is only called if poll returns None.
JupyterHub does not appear to do any error handling of failures of this method, so it should not raise an exception, just fall back on the stored URL and let the probe fail if that lab does not exist.
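The never-raise, fall-back-to-the-stored-URL behavior described in these notes might look something like the following. This is a sketch under assumptions: lab_url_or_fallback and the two fake controller coroutines are hypothetical, and the real method talks to the lab controller over HTTP instead.

```python
import asyncio
from typing import Awaitable, Callable


async def lab_url_or_fallback(
    fetch_url: Callable[[], Awaitable[str]], stored_url: str
) -> str:
    """Return the controller's URL, or the stored URL if the lookup fails."""
    try:
        return await fetch_url()
    except Exception:
        # Swallow the failure on purpose: the caller is assumed not to
        # handle exceptions, so a later probe of the URL reports the problem.
        return stored_url


async def _controller_ok() -> str:
    # Hypothetical successful lookup.
    return "http://lab.example.internal:8888"


async def _controller_down() -> str:
    # Hypothetical failed lookup.
    raise ConnectionError("controller unreachable")


stored = "http://stored.example.internal:8888"
recovered = asyncio.run(lab_url_or_fallback(_controller_ok, stored))
fallback = asyncio.run(lab_url_or_fallback(_controller_down, stored))
```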
- async options_form(spawner)
Retrieve the options form for this user from the lab controller.
- Parameters:
spawner (Spawner) – Another copy of the spawner (not used). It’s not clear why JupyterHub passes this into this method.
- Raises:
ControllerWebError – Raised on failure to talk to the lab controller or a failure response from the lab controller.
InvalidAuthStateError – Raised if there is no token attribute in the user’s authentication state. This should always be provided by rubin.nublado.authenticator.GafaelfawrAuthenticator.
- Return type:
- async poll()
Check if the pod is running.
Pods that are currently being terminated are reported as not running, since we want to allow the user to immediately begin spawning a lab. If they outrace the pod termination, we’ll just join the wait for the lab termination to complete.
- Returns:
If the pod is starting or running, return None. If the pod does not exist, is being terminated, or was successfully terminated, return 0. If the pod exists in a failed state, return 1.
- Return type:
int or None
- Raises:
ControllerWebError – Raised on failure to talk to the lab controller or a failure response from the lab controller.
Notes
In theory, this is supposed to be the exit status of the Jupyter lab process. This isn’t something we know in the classic sense since the lab is a Kubernetes pod. We only know that something failed if the record of the lab is hanging around in a failed state, so use a simple non-zero exit status for that. Otherwise, we have no way to distinguish between a pod that was shut down without error and a pod that was stopped, so use an exit status of 0 in both cases.
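The mapping from lab state to poll result can be sketched as a small pure function. The status strings below are hypothetical placeholders for whatever the lab controller reports, and the sketch follows the rule stated above that a pod being terminated counts as not running.

```python
from typing import Optional


def exit_status_for(lab_status: Optional[str]) -> Optional[int]:
    """Map a lab status to a poll() result.

    A lab_status of None means the controller has no record of the lab.
    """
    if lab_status is None:
        return 0  # no pod exists
    if lab_status in ("terminating", "terminated"):
        return 0  # reported as not running so a new spawn can begin
    if lab_status == "failed":
        return 1  # simple non-zero exit status for a failed lab
    return None  # starting or running: treated as still alive


results = {
    status: exit_status_for(status)
    for status in ("pending", "running", "terminating", "terminated", "failed")
}
```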
- async progress()
Monitor the progress of a spawn.
This method is the internal implementation of the progress API. It provides an iterator of spawn events and then ends when the spawn succeeds or fails.
- Yields:
dict – Dictionary representing the event with fields progress, containing an integer completion percentage, and message, containing a human-readable description of the event.
- Return type:
AsyncIterator[dict[str, int | str]]
Notes
This method must never raise exceptions, since those will be treated as unhandled exceptions by JupyterHub. If anything fails, just stop the iterator. It doesn’t do any HTTP calls itself, just monitors the events created by start.
Uses the internal _start_future attribute to track when the related start method has completed.
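One way to picture an iterator that relays events recorded by another task, stops when that task finishes, and never raises is the sketch below. It is an assumption-laden illustration: watch_progress, the shared events list, the polling interval, and fake_start are all hypothetical and are not taken from the real implementation.

```python
import asyncio
from typing import AsyncIterator, Dict, List, Union

Event = Dict[str, Union[int, str]]


async def watch_progress(
    events: List[Event], start_future: "asyncio.Future"
) -> AsyncIterator[Event]:
    """Yield recorded events until start_future completes; never raise."""
    sent = 0
    try:
        while True:
            # Check completion first so events recorded just before the
            # task finished are still drained below.
            done = start_future.done()
            while sent < len(events):
                yield events[sent]
                sent += 1
            if done:
                return
            await asyncio.sleep(0.01)
    except Exception:
        # If anything fails, just stop the iterator.
        return


async def _demo() -> List[Event]:
    events: List[Event] = []

    async def fake_start() -> str:
        # Hypothetical spawn that records two events, then finishes.
        events.append({"progress": 10, "message": "Lab creation initiated"})
        await asyncio.sleep(0.02)
        events.append({"progress": 90, "message": "Pod running"})
        return "http://lab.example.internal:8888"

    task = asyncio.ensure_future(fake_start())
    return [event async for event in watch_progress(events, task)]


seen = asyncio.run(_demo())
```

The iterator performs no I/O of its own; all of the work that can fail stays in the task it is watching.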
- start()
Start the user’s pod.
Initiates the pod start operation and then waits for the pod to spawn by watching the event stream, converting those events into the format expected by JupyterHub and returned by progress. Returns only when the pod is running and JupyterHub should start waiting for the lab process to start responding.
- Returns:
Running task monitoring the progress of the spawn. This task will be started before it is returned. When the task is complete, it will return the cluster-internal URL of the running Jupyter lab process.
- Return type:
Notes
The actual work is done in _start. This is a tiny wrapper to do bookkeeping on the event stream and record the running task so that progress can notice when the task is complete and return.
It is tempting to only initiate the pod spawn here, return immediately, and then let JupyterHub follow progress via the progress API. However, this is not what JupyterHub is expecting. The entire spawn process must happen before the start method returns for the configured timeouts to work properly; once start has returned, JupyterHub only allows a much shorter timeout for the lab to fully start.
Also, JupyterHub handles exceptions from start and correctly recognizes that the pod has failed to start, but exceptions from progress are treated as uncaught exceptions and cause the UI to break. Therefore, progress must never fail and all operations that may fail must be done in start.
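The wrapper arrangement described in these notes, where a non-async start schedules the real work as a task, records it, and returns it, can be sketched as below. SketchSpawner is a hypothetical stand-in, and its _start body is a placeholder for the actual lab creation and event watching.

```python
import asyncio
from typing import Optional, Tuple


class SketchSpawner:
    """Hypothetical stand-in; not the real NubladoSpawner."""

    def __init__(self) -> None:
        self._start_future: Optional["asyncio.Task[str]"] = None

    def start(self) -> "asyncio.Task[str]":
        # Not a coroutine: schedule the real work, remember the task so a
        # progress iterator can tell when it finishes, and return it.
        self._start_future = asyncio.create_task(self._start())
        return self._start_future

    async def _start(self) -> str:
        # Placeholder for creating the lab and waiting until it is running.
        # Anything that can fail belongs here, not in progress.
        await asyncio.sleep(0)
        return "http://lab.example.internal:8888"


async def _demo() -> Tuple[str, bool]:
    spawner = SketchSpawner()
    task = spawner.start()
    url = await task  # the caller awaits the returned task for the URL
    recorded = spawner._start_future is task and task.done()
    return url, recorded


url, recorded = asyncio.run(_demo())
```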
- async stop()
Delete any running pod for the user.
If the pod does not exist, treat that as success.
- Raises:
ControllerWebError – Raised on failure to talk to the lab controller or a failure response from the lab controller.
- Return type:
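The rule that a missing pod counts as a successful stop can be illustrated with a small status check. This sketch assumes the lab controller signals a missing lab with HTTP 404, which is an assumption rather than something stated here; SketchControllerError is a local stand-in for ControllerWebError, and check_delete_status is a hypothetical helper.

```python
class SketchControllerError(Exception):
    """Local stand-in for ControllerWebError, for illustration only."""


def check_delete_status(status_code: int) -> None:
    """Interpret the response to a hypothetical lab deletion request."""
    if status_code == 404:
        # Assumption: 404 means the lab does not exist, which is the
        # desired end state, so treat it as success.
        return
    if status_code >= 400:
        raise SketchControllerError(f"lab deletion failed: HTTP {status_code}")


check_delete_status(204)  # deleted normally
check_delete_status(404)  # nothing to delete, still success
```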