Set up a dedicated nodepool for a hub on a shared cluster
Important
On AWS, all clusters have dedicated nodepools for each hub.
Some hubs on shared clusters require dedicated nodepools, for a few reasons:
Helpful to pre-warm during events, as we can scale a single nodepool up/down without worrying about effects from other hubs on the same cluster.
(In the future) Helpful with cost isolation, as we can track how much a nodepool is costing us.
Set up a new nodepool in terraform, via the <cluster-name>.tfvars file for the cluster. Add the new nodepool to notebook_nodes:
    notebook_nodes = {
      "<community-name>" : {
        min: 0,
        max: 100,
        machine_type: "<machine-type>",
        labels: {
          "2i2c.org/community": "<community-name>"
        },
        taints: [{
          key: "2i2c.org/community",
          value: "<community-name>",
          effect: "NO_SCHEDULE",
        }],
        gpu: {
          enabled: false,
          type: "",
          count: 0,
        },
        resource_labels: {
          "community": "<community-name>",
        },
      }
    }
This sets up a new nodepool with:
1. Kubernetes labels, so we can tell the scheduler that user pods of this hub should go to this nodepool.
2. Kubernetes taints, so user pods of other hubs will not be scheduled on this nodepool.
3. GCP resource labels (unrelated to Kubernetes labels!) that help us track costs. The key name here is different from (1) and (2) because it must start with a letter and cannot contain /.
Once done, run terraform apply appropriately to bring this nodepool up.
Configure the hub’s helm values to use this nodepool, and this nodepool only.
    jupyterhub:
      singleuser:
        nodeSelector:
          2i2c.org/community: <community-name>
        extraTolerations:
          - key: 2i2c.org/community
            operator: Equal
            value: <community-name>
            effect: NoSchedule
Note
If this is a daskhub, nest these under a basehub key.
This tells JupyterHub to place user pods from this hub on the nodepool we just created!
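For a daskhub, the nesting mentioned in the note might look like the following. This is only a sketch: it assumes the same values as the example above are moved, unchanged, one level down under a basehub key, and <community-name> remains a placeholder to fill in.

```yaml
# Sketch for a daskhub: the same settings, nested under `basehub`.
# <community-name> is a placeholder, as in the example above.
basehub:
  jupyterhub:
    singleuser:
      nodeSelector:
        2i2c.org/community: <community-name>
      extraTolerations:
        - key: 2i2c.org/community
          operator: Equal
          value: <community-name>
          effect: NoSchedule
```

Note that the toleration's effect is spelled NoSchedule in the helm values, while the matching taint in the terraform config is spelled NO_SCHEDULE.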
Node type and minimum nodepool size considerations
When setting up a dedicated nodepool for a hub, particularly a hub supporting an event, it’s important to consider the node type and the minimum nodepool size used. As there will likely be only a minimal number of users until the event starts, it’s helpful to keep the minimum nodepool size at 0 until at least a week before the start of the event. A smaller node type is also advised until a week before the event.