Client guide¶
This page describes the use of the rubin.nublado.client.NubladoClient
class and its provided testing classes.
Installing the Client¶
The client can be installed from PyPI with:
pip install rubin-nublado-client
Using the Client¶
The NubladoClient
class is designed to make it easy to interact with JupyterHub and JupyterLab as they are configured in the RSP environment.
A particular instance of the client represents a single user in a particular RSP environment.
A user, in this context, means a token (which will have a set of scopes allowing various actions within the RSP) bound to a username.
Both of these will be available in the service you are writing with each request, in the X-Auth-Request-Token
and X-Auth-Request-User
headers on the request.
You should not, in general, need to go to Gafaelfawr to extract any further information about the token. You must, however, be prepared to handle 401 and 403 responses in case the token you have is invalid or expired, or does not grant sufficient scope for the service you want to use.
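As a rough sketch of the two points above, the snippet below pulls the username and token out of the request headers and maps an authentication failure to a readable reason. The Credentials class and both helper functions are hypothetical names invented for this illustration and are not part of the client. The headers are passed in as a plain mapping, and how an HTTP status code reaches your service code is left as an assumption.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from http import HTTPStatus


@dataclass(frozen=True)
class Credentials:
    """Hypothetical holder for the identity delivered with a request."""

    username: str
    token: str


def credentials_from_headers(headers: Mapping[str, str]) -> Credentials:
    """Build credentials from the two headers named in the text above."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    return Credentials(
        username=lowered["x-auth-request-user"],
        token=lowered["x-auth-request-token"],
    )


def explain_auth_failure(status_code: int) -> str:
    """Translate a 401 or 403 status into a short explanation."""
    if status_code == HTTPStatus.UNAUTHORIZED:
        return "token is missing, invalid, or expired"
    if status_code == HTTPStatus.FORBIDDEN:
        return "token lacks a scope required by the service"
    return "not an authentication failure"


creds = credentials_from_headers(
    {
        "X-Auth-Request-User": "some-user",
        "X-Auth-Request-Token": "some-token",
    }
)
print(creds.username)
print(explain_auth_failure(403))
```

Keeping the two failure cases separate lets a service tell its caller whether to obtain a new token (401) or to request additional scopes (403).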
Sequence of Events¶
A typical interaction with the client looks like this:
1. Authenticate to the Hub with the auth_to_hub() method.
2. Determine whether or not you have a running lab with is_lab_stopped().
3. If you need to, spawn a lab with spawn_lab().
4. Wait for the lab to spawn by looping through watch_spawn_progress() until you get a progress message indicating the lab is ready.
5. Authenticate to the Lab with auth_to_lab().
6. Use open_lab_session() as a context manager to interact with your lab.
7. Do whatever it is you wanted to do with the lab; more below.
8. When done, use stop_lab() to shut down the lab, if desired.
The client_test
test in the client test suite steps through this process and may be a useful model.
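The order of the steps above can be sketched against a stand-in object. FakeClient, FakeSession, and FakeProgress below are hypothetical classes written only for this illustration: they borrow the method names from the list, ignore the arguments the real methods take (spawn_lab() normally receives an image specification), and simply record the order in which they are called. Wrapping the session in try/finally so that stop_lab() always runs is a design choice of this sketch, not something the client requires.

```python
import asyncio
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager
from dataclasses import dataclass


@dataclass
class FakeProgress:
    """Hypothetical progress message with only a readiness flag."""

    ready: bool


class FakeSession:
    """Hypothetical lab session that records calls instead of running code."""

    def __init__(self, calls: list[str]) -> None:
        self._calls = calls

    async def run_python(self, code: str) -> str:
        self._calls.append("run_python")
        return "ok"


class FakeClient:
    """Stand-in client; only the method names come from the steps above."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def auth_to_hub(self) -> None:
        self.calls.append("auth_to_hub")

    async def is_lab_stopped(self) -> bool:
        self.calls.append("is_lab_stopped")
        return True  # Pretend no lab is running yet.

    async def spawn_lab(self) -> None:
        self.calls.append("spawn_lab")

    async def watch_spawn_progress(self) -> AsyncIterator[FakeProgress]:
        self.calls.append("watch_spawn_progress")
        yield FakeProgress(ready=False)
        yield FakeProgress(ready=True)

    async def auth_to_lab(self) -> None:
        self.calls.append("auth_to_lab")

    @asynccontextmanager
    async def open_lab_session(self) -> AsyncIterator[FakeSession]:
        self.calls.append("open_lab_session")
        yield FakeSession(self.calls)

    async def stop_lab(self) -> None:
        self.calls.append("stop_lab")


async def lifecycle(client: FakeClient) -> str:
    """Walk through the eight steps in order."""
    await client.auth_to_hub()
    if await client.is_lab_stopped():
        await client.spawn_lab()
        async for message in client.watch_spawn_progress():
            if message.ready:
                break
    await client.auth_to_lab()
    try:
        async with client.open_lab_session() as session:
            return await session.run_python("print('hi')")
    finally:
        # Shut the lab down even if the work inside the session failed.
        await client.stop_lab()


client = FakeClient()
result = asyncio.run(lifecycle(client))
print(client.calls)
```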
Interacting With The Lab¶
NubladoClient provides three methods of interacting with a spawned lab. These are methods on the JupyterLabSession object you will have available inside the session context manager. They are:
- run_python(). This runs a string representing arbitrary Python code and returns streamed results (which is to say, stdout, within each cell). Of course, this code will run in the Lab environment specified by the session, which will be bound to a particular kernel (usually LSST within the RSP), and will have access to whatever is installed in that kernel.
- run_notebook(). This runs each cell of a supplied notebook using run_python(), accumulating the results and ultimately returning them as a list of cell outputs.
- run_notebook_via_rsp_extension(). This executes a notebook via the /rubin/execution endpoint of the RSP Jupyter Extensions. Times Square and Noteburst use this method. The extensions run within the user lab, and the execution extension, in turn, uses nbconvert to execute notebooks and return their rendered form. If you need to execute a notebook and capture output that did not go to stdout (for instance, the JavaScript created by a Bokeh call, which will ultimately run in your browser), this is at present the way to do it.
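To make the relationship between run_python() and run_notebook() concrete, here is a minimal local sketch of the "run each cell and accumulate the output" idea. It does not use the client at all: make_local_runner() and run_cells() are hypothetical helpers, the code is executed in the current interpreter rather than in a Lab kernel, and collecting one output string per code cell is an assumption of the sketch.

```python
import contextlib
import io
from collections.abc import Callable


def make_local_runner() -> Callable[[str], str]:
    """Return a stand-in for run_python() that shares state across calls."""
    namespace: dict[str, object] = {}

    def run_python(code: str) -> str:
        # Capture whatever the code prints, which mimics stdout streaming.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
        return buffer.getvalue()

    return run_python


def run_cells(notebook: dict, run_python: Callable[[str], str]) -> list[str]:
    """Run every code cell in order and collect one output per cell."""
    outputs: list[str] = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        outputs.append(run_python(source))
    return outputs


notebook = {
    "cells": [
        {"cell_type": "markdown", "source": "# Title"},
        {
            "cell_type": "code",
            "source": ["greeting = 'Hello, world!'\n", "print(greeting)"],
        },
        {"cell_type": "code", "source": "print(greeting.upper())"},
    ]
}
outputs = run_cells(notebook, make_local_runner())
print(outputs)
```

The shared namespace matters: the second code cell only works because the first one defined greeting, just as later notebook cells depend on earlier ones within a single kernel.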
Client Use Examples¶
The following uses the client to determine whether a user Lab is running and to start it if necessary. It does not run any code within the Lab:
"""Ensure there's a running lab for the user."""

import asyncio

import structlog
from rubin.nublado.client import NubladoClient
from rubin.nublado.client.models import (
    NubladoImageByClass,
    NubladoImageClass,
    NubladoImageSize,
    User,
)

LAB_SPAWN_TIMEOUT = 90


async def ensure_lab() -> None:
    """Start a Lab if one is not present."""
    client = NubladoClient(
        user=User(username="some-user", token="some-token"),
        base_url="https://data.example.org",
    )
    await client.auth_to_hub()
    stopped = await client.is_lab_stopped()
    if stopped:
        image = NubladoImageByClass(
            image_class=NubladoImageClass.RECOMMENDED,
            size=NubladoImageSize.Medium,
        )
        await client.spawn_lab(image)
        progress = client.watch_spawn_progress()
        async with asyncio.timeout(LAB_SPAWN_TIMEOUT):
            async for message in progress:
                if message.ready:
                    break


asyncio.run(ensure_lab())
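The example above lets a timeout propagate as an exception and does not say what happens if the progress stream ends without a ready message. One possible way to handle both cases is sketched below, using a fake progress stream so that it runs without an RSP. The Progress class, fake_progress(), and wait_until_ready() are hypothetical names for this illustration; whether the real stream can end without reporting readiness is an assumption.

```python
import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass


@dataclass
class Progress:
    """Hypothetical progress message; only `ready` mirrors the example."""

    ready: bool


async def fake_progress(delay: float, steps: int) -> AsyncIterator[Progress]:
    """Emit `steps` messages, `delay` seconds apart; the last one is ready."""
    for step in range(steps):
        await asyncio.sleep(delay)
        yield Progress(ready=(step == steps - 1))


async def wait_until_ready(
    progress: AsyncIterator[Progress], timeout: float
) -> bool:
    """Return True once a ready message arrives, False otherwise."""

    async def consume() -> bool:
        async for message in progress:
            if message.ready:
                return True
        # The stream ended without ever reporting readiness.
        return False

    try:
        return await asyncio.wait_for(consume(), timeout)
    except asyncio.TimeoutError:
        return False


fast = asyncio.run(wait_until_ready(fake_progress(0.01, 3), timeout=1.0))
slow = asyncio.run(wait_until_ready(fake_progress(0.2, 3), timeout=0.05))
print(fast, slow)
```

Returning a boolean keeps the caller's decision explicit: a service can retry, report an error, or stop the half-spawned lab when the result is False.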
The next example assumes that you have already done the above (that is, you know the user already has a running Lab) and that you, for some reason, want to run FizzBuzz for n=1 through 100:
"""Run Fizzbuzz in the RSP"""

import asyncio

from rubin.nublado.client import NubladoClient
from rubin.nublado.client.models import User

client = NubladoClient(
    user=User(username="some-user", token="some-token"),
    base_url="https://data.example.org",
)

FIZZBUZZ = """
i=1
accum=""
while (i<=100):
    if i>1:
        accum += ", "
    if (i%15 == 0):
        accum += "Fizz Buzz"
    elif (i%5 == 0):
        accum += "Buzz"
    elif (i%3 == 0):
        accum += "Fizz"
    else:
        accum += str(i)
    i += 1
print(accum)
"""


async def run_fizzbuzz(client: NubladoClient) -> str:
    await client.auth_to_hub()
    await client.auth_to_lab()
    async with client.open_lab_session() as lab_session:
        output = await lab_session.run_python(FIZZBUZZ)
    return output


output = asyncio.run(run_fizzbuzz(client=client))
print(output)
This will display the following:
1, 2, Fizz, 4, Buzz, Fizz, 7, 8, Fizz, Buzz, 11, Fizz, 13, 14, Fizz Buzz, 16, 17, Fizz, 19, Buzz, Fizz, 22, 23, Fizz, Buzz, 26, Fizz, 28, 29, Fizz Buzz, 31, 32, Fizz, 34, Buzz, Fizz, 37, 38, Fizz, Buzz, 41, Fizz, 43, 44, Fizz Buzz, 46, 47, Fizz, 49, Buzz, Fizz, 52, 53, Fizz, Buzz, 56, Fizz, 58, 59, Fizz Buzz, 61, 62, Fizz, 64, Buzz, Fizz, 67, 68, Fizz, Buzz, 71, Fizz, 73, 74, Fizz Buzz, 76, 77, Fizz, 79, Buzz, Fizz, 82, 83, Fizz, Buzz, 86, Fizz, 88, 89, Fizz Buzz, 91, 92, Fizz, 94, Buzz, Fizz, 97, 98, Fizz, Buzz
For the next two examples, we will assume that you have a notebook called HelloGoodbye.ipynb
in your home directory. This notebook contains two cells. The first cell’s code is:
print("Hello, world!")
and the second cell’s code is:
print("Goodbye, world!")
The following will then run the notebook via each method, compare their outputs, and, if they are the same, print each output line prefixed with its line number, a colon, and a space:
import asyncio
import json
from dataclasses import dataclass
from pathlib import Path

from rubin.nublado.client import NubladoClient
from rubin.nublado.client.models import NotebookExecutionResult, User


@dataclass
class NBResults:
    session_output: list[str]
    extension_output: NotebookExecutionResult


client = NubladoClient(
    user=User(username="some-user", token="some-token"),
    base_url="https://data.example.org",
)
notebook = Path("HelloGoodbye.ipynb")


async def run_notebook_both_ways(
    client: NubladoClient, notebook: Path
) -> NBResults:
    await client.auth_to_hub()
    await client.auth_to_lab()
    async with client.open_lab_session() as lab_session:
        session_output = await lab_session.run_notebook(notebook)
        extension_output = await lab_session.run_notebook_via_rsp_extension(
            path=notebook
        )
    return NBResults(
        session_output=session_output, extension_output=extension_output
    )


output = asyncio.run(run_notebook_both_ways(client=client, notebook=notebook))
obj = json.loads(output.extension_output.notebook)
cells = obj["cells"]

# Check that the output is the same from both methods. We have to do a
# lot of work to pull the streaming output out of the cell.
outputs_from_extension: list[str] = []
for cell in cells:
    if (
        "cell_type" in cell
        and cell["cell_type"] == "code"
        and "outputs" in cell
        and cell["outputs"]
    ):
        cell_outputs = cell["outputs"]
        for outp in cell_outputs:
            if (
                "output_type" in outp
                and outp["output_type"] == "stream"
                and "name" in outp
                and outp["name"] == "stdout"
                and "text" in outp
                and outp["text"]
            ):
                text_list = outp["text"]
                for text in text_list:
                    if text:
                        outputs_from_extension.append(text)

if outputs_from_extension == output.session_output:
    for count, line in enumerate(output.session_output):
        print(f"{count+1}: {line.strip()}")
This yields:
1: Hello, world!
2: Goodbye, world!
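The extraction loop in the example above can be condensed with dict.get(). The sketch below is one possible alternative, run here against a small hand-written notebook rather than real extension output. stdout_texts() is a hypothetical helper name. It also accepts stream text stored as a single string, which the notebook format permits alongside a list of strings; whether the execution extension ever returns that form is an assumption.

```python
import json


def stdout_texts(notebook_json: str) -> list[str]:
    """Collect the stdout stream text of every code cell, in order."""
    notebook = json.loads(notebook_json)
    texts: list[str] = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            if output.get("output_type") != "stream":
                continue
            if output.get("name") != "stdout":
                continue
            text = output.get("text", "")
            # Stream text may be one string or a list of strings.
            chunks = [text] if isinstance(text, str) else list(text)
            texts.extend(chunk for chunk in chunks if chunk)
    return texts


# Hand-written sample standing in for the notebook returned by the extension.
sample = json.dumps(
    {
        "cells": [
            {
                "cell_type": "code",
                "outputs": [
                    {
                        "output_type": "stream",
                        "name": "stdout",
                        "text": ["Hello, world!\n"],
                    }
                ],
            },
            {
                "cell_type": "code",
                "outputs": [
                    {
                        "output_type": "stream",
                        "name": "stderr",
                        "text": "a warning\n",
                    },
                    {
                        "output_type": "stream",
                        "name": "stdout",
                        "text": "Goodbye, world!\n",
                    },
                ],
            },
        ]
    }
)
texts = stdout_texts(sample)
for count, line in enumerate(texts):
    print(f"{count + 1}: {line.strip()}")
```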
Mocks and Testing¶
In the module rubin.nublado.client.testing
you will find the MockJupyter
class.
This provides a simulation of the RSP Nublado Hub/Proxy/Controller environment, as well as a partial simulation of the Labs it spawns.
Use it to test your service meaningfully without having to test against a live RSP or spin up your own RSP for the purpose.
Although there are quite a few additional classes within the module, MockJupyter
should be the only one you need directly, except to set up the test fixture.
Creating the Jupyter Mock Test Fixture¶
The rubin.nublado.client.testing.MockJupyter class is fundamentally built on respx (a library for mocking httpx requests), with a websocket emulator patched into it.
It depends on two other fixtures: environment_url is a string representing the base URL of the RSP environment, and test_filesystem is a pathlib.Path representing the home directory of the user the NubladoClient is running as. These collectively look like:
from collections.abc import AsyncIterator, Iterator
from contextlib import asynccontextmanager
from pathlib import Path
from tempfile import TemporaryDirectory
from unittest.mock import patch

import pytest
import respx
import websockets
from rubin.nublado.client.testing import (
    MockJupyter,
    MockJupyterWebSocket,
    mock_jupyter,
    mock_jupyter_websocket,
)


@pytest.fixture
def environment_url() -> str:
    return "https://data.example.org"


@pytest.fixture
def test_filesystem() -> Iterator[Path]:
    with TemporaryDirectory() as td:
        # Do whatever you need to do in order to set up the home
        # directory contents here
        yield Path(td)


@pytest.fixture
def jupyter(
    respx_mock: respx.Router,
    environment_url: str,
    test_filesystem: Path,
) -> Iterator[MockJupyter]:
    """Mock out JupyterHub and Jupyter labs."""
    jupyter_mock = mock_jupyter(
        respx_mock,
        base_url=environment_url,
        user_dir=test_filesystem,
    )

    # respx has no mechanism to mock aconnect_ws, so we have to do it
    # ourselves.
    @asynccontextmanager
    async def mock_connect(
        url: str,
        extra_headers: dict[str, str],
        max_size: int | None,
        open_timeout: int,
    ) -> AsyncIterator[MockJupyterWebSocket]:
        yield mock_jupyter_websocket(url, extra_headers, jupyter_mock)

    # patch.object replaces the "connect" attribute of the websockets module.
    with patch.object(websockets, "connect") as mock:
        mock.side_effect = mock_connect
        yield jupyter_mock
Once you’ve done all that, all you will need to do is supply the test
fixture jupyter
to your unit tests along with a client to communicate with it.
The client is much simpler.
The only special things you need to do with the NubladoClient are to configure it with the same environment URL your mock Jupyter has, and to give it the username and token that, in real life, would arrive in the X-Auth-Request-User and X-Auth-Request-Token headers via GafaelfawrIngress.
It will make all its usual HTTP calls, which will be intercepted by the jupyter
test fixture and responded to appropriately.
Mocking Payloads¶
The Python code being used as a client payload is expected, in the wild, to run within an RSP kernel; usually the LSST
kernel, which is extremely heavyweight and has a great many features not found in a vanilla Python installation.
The MockJupyter class contains a pair of methods that enable the user to register code or notebook contents with the mock. If the mock then sees those contents as an execution payload, it replies with the registered results rather than trying to actually execute them.
These methods are register_python_result() and register_extension_result().
The first is used for mocking run_python() and run_notebook(), and the second for mocking run_notebook_via_rsp_extension().
For any case involving Python that uses modules outside the standard library, use the register
methods to pre-load appropriate replies for that code.
These are generally the only two methods of MockJupyter
that the service developer should use directly.
All tests should then interact with the mock Jupyter service through NubladoClient, possibly with execution output mocked via registration.
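The registration idea can be illustrated with a few lines of plain Python. This is a conceptual sketch only and is not how MockJupyter is implemented: CannedResults, its register() and run() methods, and execute_locally() are hypothetical names, and the real register_python_result() and register_extension_result() signatures should be taken from the API documentation. The payload that imports lsst.daf.butler is an arbitrary stand-in for code that would fail outside an RSP kernel.

```python
import contextlib
import io
from collections.abc import Callable


def execute_locally(code: str) -> str:
    """Fallback: run the payload in this interpreter and capture stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue()


class CannedResults:
    """Reply with a registered result when the payload is recognized."""

    def __init__(self, fallback: Callable[[str], str]) -> None:
        self._fallback = fallback
        self._registered: dict[str, str] = {}

    def register(self, code: str, result: str) -> None:
        """Associate an exact payload string with the reply to give for it."""
        self._registered[code] = result

    def run(self, code: str) -> str:
        # A registered payload is never executed, so it may freely import
        # modules that exist only inside a real kernel.
        if code in self._registered:
            return self._registered[code]
        return self._fallback(code)


HEAVY_PAYLOAD = "import lsst.daf.butler\nprint('butler ok')"

mock = CannedResults(fallback=execute_locally)
mock.register(HEAVY_PAYLOAD, "butler ok\n")

registered_reply = mock.run(HEAVY_PAYLOAD)
executed_reply = mock.run("print(2 + 2)")
print(registered_reply, executed_reply)
```

Matching on the exact payload string keeps the sketch simple. It also means a test must register precisely the code its service will send.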