# SSH into nodes
Sometimes, you need to SSH directly into a Kubernetes node to troubleshoot an issue. This document describes how to do that on various cloud providers.
## Google Cloud Platform (GCP)

1. Make sure you are authenticated with `gcloud`, and have set the project we are operating on, so `gcloud` knows where to look:

   ```bash
   gcloud config set project <name-of-project>
   ```

   You can find the name of the project in the `cluster.yaml` file for the cluster.

2. Find the name of the node you want to log in to, usually via `kubectl get node`. You can also find out which node a specific pod is on with `kubectl get pod -o wide`.

3. SSH into the node with `gcloud compute ssh <node-name>`. This will set you up with a user who has `sudo` permissions on the node, so you can poke around!
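For scripting, the GCP steps can be sketched as a small helper that just assembles the two commands. This is a minimal illustration, not part of our tooling; `my-project` and `my-node` are placeholder names:

```python
# Sketch: assemble the gcloud commands for the GCP flow above.
# Run the printed commands yourself once you are authenticated with gcloud.

def gcp_ssh_commands(project: str, node_name: str) -> list[str]:
    """Return the gcloud commands to set the project and SSH into a node."""
    return [
        f"gcloud config set project {project}",
        f"gcloud compute ssh {node_name}",
    ]

for command in gcp_ssh_commands("my-project", "my-node"):
    print(command)
```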
## AWS

1. Make sure you are logged in to the `aws` command line tool, and authenticated as yourself so you have access to the AWS organization under which this cluster lives. You can validate that with `aws sts get-caller-identity` - the output should include your personal username, not that of the hub deployer!

2. You also need the AWS Session Manager plugin installed.

3. Get the instance id of the node. Unlike with GCP, on AWS the instance id is not the same as the node name reported by `kubectl get node` or `kubectl get pod -o wide`. The instance id is on the Kubernetes node object as a label named `alpha.eksctl.io/instance-id`. You can get the entire object's definition with `kubectl get node <node-name> -o yaml` and pick out the `alpha.eksctl.io/instance-id` label from there. The instance id is of the form `i-<hexadecimal string>`.

4. Get the region of the node. From the output you got in step 3, look at the label `topology.kubernetes.io/region`. For us, it's often `us-west-2` (as that is where a lot of scientific data is stored).

5. You can now SSH in with:

   ```bash
   aws ssm start-session --target <instance-id> --region <region>
   ```

6. This will put you on the node with the `sh` shell, which is missing a lot of the features we expect from interactive shells today. You can get onto `bash` by running `bash`.

7. You will be a user with full `sudo` access, so you can troubleshoot to your heart's content.
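The instance-id and region lookup can also be scripted: fetch the node object as JSON (`kubectl get node <node-name> -o json` instead of `-o yaml`), pull out the two labels, and assemble the `aws ssm` command. Here is a minimal sketch using a hard-coded node object in place of real `kubectl` output; the instance id value is illustrative:

```python
import json

# Stand-in for `kubectl get node <node-name> -o json` output,
# trimmed to the two labels this flow needs; values are illustrative.
node_json = """
{
  "metadata": {
    "labels": {
      "alpha.eksctl.io/instance-id": "i-0123456789abcdef0",
      "topology.kubernetes.io/region": "us-west-2"
    }
  }
}
"""

node = json.loads(node_json)
labels = node["metadata"]["labels"]

# The labels described in the steps above.
instance_id = labels["alpha.eksctl.io/instance-id"]
region = labels["topology.kubernetes.io/region"]

# Assemble the session command from the final step.
command = f"aws ssm start-session --target {instance_id} --region {region}"
print(command)
```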