JupyterHub configuration¶
The JupyterHub service is installed using Zero to JupyterHub as a sub-chart, so the values.yaml file for the nublado application contains many settings that should not need to be changed.
Documented here are the settings that may need to be changed for different Phalanx environments.
Settings that start with jupyterhub are Zero to JupyterHub configuration settings and are passed to the sub-chart.
Settings with other prefixes are Phalanx-specific.
Database¶
By default, JupyterHub uses the internal PostgreSQL server deployed by Phalanx to store its session database. However, use of that database server is not recommended for anything other than testing. Production deployments should instead provide an infrastructure database and configure JupyterHub to use it.
hub.internalDatabase
Set this to false when using an infrastructure database. The default is true, indicating that the Phalanx-internal database service should be used.
jupyterhub.hub.db.url
Set this to the URL of the PostgreSQL database that should be used for the session database. The default is to use the Phalanx-internal database service. Use the value postgresql://nublado@cloud-sql-proxy.nublado/nublado when using Cloud SQL (see Cloud SQL).
jupyterhub.hub.db.upgrade
Set this to true to enable automatic database schema upgrades after JupyterHub has been upgraded. The default is false out of caution, so JupyterHub will fail to start if a database schema upgrade is needed.
If you wish to be conservative, you can enable it only when you’re intentionally upgrading JupyterHub and then disable it again after the upgrade. This helps avoid accidentally applying a major JupyterHub upgrade without being aware that you’re doing so, since most major upgrades come with schema changes.
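As a rough sketch, the three settings above might be combined as follows in a values override for the nublado application when pointing JupyterHub at an infrastructure database. The host name db.example.org is a hypothetical placeholder and not something Phalanx provides, and how the database password is supplied is not covered here.

```yaml
# Sketch only: use an infrastructure PostgreSQL server instead of the
# Phalanx-internal one. The host name below is a hypothetical placeholder.
hub:
  internalDatabase: false

jupyterhub:
  hub:
    db:
      # URL of the session database on the infrastructure server.
      url: "postgresql://nublado@db.example.org/nublado"
      # Leave false except while intentionally upgrading JupyterHub.
      upgrade: false
```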
Cloud SQL¶
When running a Phalanx environment on Google Kubernetes Engine, using Cloud SQL for the Nublado JupyterHub session database is strongly recommended.
When using Cloud SQL, Nublado always uses workload identity via the Cloud SQL Auth Proxy to gain access to the database.
Configuring Cloud SQL therefore requires creating a Google service account with the cloudsql.client role and binding it to the Kubernetes service account cloud-sql-proxy in the nublado namespace of the Phalanx environment.
Then, set the following configuration settings:
cloudsql.enabled
Set this to true when using Cloud SQL.
cloudsql.instanceConnectionName
Connection name of the Cloud SQL instance that hosts the JupyterHub session database. This is shown in the GCP console after you have created the Cloud SQL database.
cloudsql.serviceAccount
Name of the Google service account configured as described above. This service account must have an IAM binding to the Kubernetes service account cloud-sql-proxy in the nublado namespace of the Phalanx environment. See workload identity for more information.
Also set jupyterhub.hub.db.url to postgresql://nublado@cloud-sql-proxy.nublado/nublado as described in Database.
The last portion of that URL (nublado) names the Cloud SQL database used for the session database.
Naming that database nublado is recommended, but it can be named anything you choose as long as the URL is consistent.
Using a separate database solely for the JupyterHub session database is strongly recommended.
See the Google documentation for more information about Cloud SQL and the Cloud SQL Auth Proxy.
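Taken together, a Cloud SQL configuration might look roughly like the following values override. The project, region, instance, and service account names are hypothetical placeholders to be replaced with the values from your own GCP project. Setting hub.internalDatabase to false here is an assumption based on the Database section, which says to disable it when using an infrastructure database.

```yaml
# Sketch only: all GCP names below are hypothetical placeholders.
cloudsql:
  enabled: true
  # Shown in the GCP console; the usual form is project:region:instance.
  instanceConnectionName: "my-project:us-central1:my-instance"
  # Google service account with the cloudsql.client role, bound to the
  # cloud-sql-proxy Kubernetes service account in the nublado namespace.
  serviceAccount: "nublado@my-project.iam.gserviceaccount.com"

hub:
  # Assumption: Cloud SQL counts as an infrastructure database.
  internalDatabase: false

jupyterhub:
  hub:
    db:
      # The final path component names the Cloud SQL database.
      url: "postgresql://nublado@cloud-sql-proxy.nublado/nublado"
```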
The following additional settings are supported for configuring how the Cloud SQL Auth Proxy pod is deployed in Kubernetes. You will not normally need to set them.
cloudsql.affinity
Affinity rules for the Cloud SQL Auth Proxy pod.
cloudsql.nodeSelector
Node selector rules for the Cloud SQL Auth Proxy pod.
cloudsql.podAnnotations
Additional annotations to add to the Cloud SQL Auth Proxy pod.
cloudsql.resources
Resource limits and requests for the Cloud SQL Auth Proxy pod. The defaults are chosen based on observed metrics from the JupyterHub running on Google Kubernetes Engine with a light user load.
cloudsql.tolerations
Toleration rules for the Cloud SQL Auth Proxy pod.
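If you do need to adjust how the proxy pod is scheduled, the settings take the standard Kubernetes structures for each field. The following is an illustrative sketch; the node label, taint, annotation, and resource figures are all hypothetical and are not the chart defaults.

```yaml
# Sketch only: every value below is a hypothetical example.
cloudsql:
  nodeSelector:
    example.org/pool: "services"
  tolerations:
    - key: "example.org/dedicated"
      operator: "Equal"
      value: "services"
      effect: "NoSchedule"
  podAnnotations:
    example.org/owner: "nublado"
  resources:
    requests:
      cpu: "10m"
      memory: "32Mi"
    limits:
      cpu: "100m"
      memory: "64Mi"
```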
The following additional settings control what version of the Cloud SQL Auth Proxy is used. By default, the latest stable release is used. You will not normally need to change any of these settings.
cloudsql.image.repository
Docker repository from which to get the Cloud SQL Auth Proxy image.
cloudsql.image.pullPolicy
Pull policy for the Cloud SQL Auth Proxy image. The default is IfNotPresent.
cloudsql.image.tag
Tag of the Cloud SQL Auth Proxy image.
Automatic lab shutdown¶
JupyterHub supports automatically shutting down labs after either a period of idle time or after a maximum age. This is controlled by the following settings.
jupyterhub.cull.enabled
Set to false if you want to disable shutting down labs automatically.
jupyterhub.cull.timeout
Idle timeout in seconds. If a lab has been idle for longer than this length of time, it will be automatically shut down. The default is 2592000 (30 days).
jupyterhub.cull.maxAge
Maximum age of a lab in seconds. Any lab that has been running for longer than this period of time will be automatically shut down whether it is active or not. The default is 5184000 (60 days).
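For example, an environment that wants more aggressive culling than the defaults might use something like the following. The specific values are illustrative only: 604800 seconds is 7 days and 1209600 seconds is 14 days.

```yaml
# Sketch only: shut down labs idle for 7 days, and any lab older than
# 14 days regardless of activity. Both values are in seconds.
jupyterhub:
  cull:
    enabled: true
    timeout: 604800   # 7 * 86400
    maxAge: 1209600   # 14 * 86400
```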
Path prefix¶
jupyterhub.hub.baseUrl
The path prefix to use for the user interface to JupyterHub. The default is /nb. You probably do not want to change this unless you are trying to run multiple instances of Nublado in the same Phalanx environment for some reason.
Image¶
jupyterhub.hub.image.name
Docker repository for the JupyterHub image to use. The default is to use the custom JupyterHub image built by Nublado.
jupyterhub.hub.image.tag
Tag of the JupyterHub image to use. You may need to override this setting when testing unreleased images.
Due to limitations in Helm’s handling of sub-charts, this version, unlike the version of other components such as the controller, does not automatically default to the appVersion of the nublado chart. It therefore must be updated in values.yaml whenever a new version of Nublado is released. This is normally done as part of the release process.
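When testing an unreleased image, the override might look like the following sketch. The tag shown is a hypothetical placeholder; substitute the tag of the image you actually built.

```yaml
# Sketch only: the tag is a hypothetical placeholder for a test build.
jupyterhub:
  hub:
    image:
      tag: "0.0.0-test"
```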
Timeouts¶
hub.timeout.startup
How long to wait in seconds for the JupyterLab process to start responding to network requests after the lab pod has started. Empirically, this sometimes takes longer than 60 seconds for sciplat-lab images for reasons that we do not currently understand. The default is 90 seconds.
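If labs in a given environment regularly take longer than the default to start responding, the timeout can be raised. The value of 120 seconds below is only an illustration, not a recommendation.

```yaml
# Sketch only: allow the JupyterLab process up to 120 seconds to start
# responding after the lab pod has started.
hub:
  timeout:
    startup: 120
```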
Phalanx internals¶
secrets.templateSecrets
Set this to true if the Phalanx environment has been converted to the new secrets management system. See the Phalanx documentation for more information.