Resizing Prometheus’ disk
As Prometheus collects and stores more metrics data, we may need to resize the disk that backs it.
On GCP clusters, the storage classes permit auto-expansion by default, so defining a new persistent volume size in the support chart values and redeploying the chart should suffice. This may not be the case on other cloud providers. The steps below walk you through resizing the disk.
Resizing the disk
# Set the KUBE_EDITOR env var to point to a text editor you're comfortable with
export KUBE_EDITOR="/usr/bin/nano"
# Set the name of the cluster to work against
export CLUSTER_NAME=...
# Authenticate against the cluster
deployer use-cluster-credentials $CLUSTER_NAME
Set the desired size of the Prometheus server persistent volume in the relevant support.values.yaml file.
prometheus:
  server:
    persistentVolume:
      size: <desired-size>
Check the reclaim policy on the persistent volume.
# List all the PVs. They are not namespaced.
kubectl get pv
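If the cluster has many PVs, it can help to look up the one bound to the Prometheus PVC directly. The following is a minimal sketch, assuming the PVC is named support-prometheus-server in the support namespace (as in the later steps). The function names are hypothetical helpers, not part of the deployer.

```shell
# Hypothetical helper: print the name of the PV bound to the Prometheus server PVC.
prometheus_pv_name() {
  kubectl -n support get pvc support-prometheus-server \
    -o jsonpath='{.spec.volumeName}'
}

# Hypothetical helper: print the reclaim policy (Retain or Delete) of the named PV.
pv_reclaim_policy() {
  kubectl get pv "$1" -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
}

# Usage: pv_reclaim_policy "$(prometheus_pv_name)"
```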
Edit the persistent volume’s reclaim policy to be Retain if it is not already. This will prevent us from losing the data Prometheus has already collected.
kubectl edit pv <pv-name>
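As a non-interactive alternative to the editor, the same change can be sketched with kubectl patch. This is an illustration rather than the documented procedure. The helper names are hypothetical, and the PV name is whatever you identified in the previous step.

```shell
# Hypothetical helper: print the JSON patch that sets a PV's reclaim policy.
reclaim_policy_patch() {
  printf '{"spec":{"persistentVolumeReclaimPolicy":"%s"}}' "$1"
}

# Hypothetical helper: set the named PV to Retain, then print the policy
# that the cluster now reports for it.
set_pv_retain() {
  pv_name="$1"
  kubectl patch pv "$pv_name" --patch "$(reclaim_policy_patch Retain)"
  kubectl get pv "$pv_name" -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
}

# Usage: set_pv_retain <pv-name>
```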
Check the value of ALLOWVOLUMEEXPANSION for the default storage class, identified by (default) next to its name.
kubectl get storageclass
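To pull out just the default storage class and its expansion setting, the table output can be filtered. This is a rough sketch that assumes the usual kubectl column layout, where ALLOWVOLUMEEXPANSION is the second-to-last column and AGE is the last. The function name is a hypothetical helper.

```shell
# Hypothetical helper: read `kubectl get storageclass` output on stdin and
# print the name and ALLOWVOLUMEEXPANSION value of the row marked (default).
default_storage_class_expansion() {
  awk '/\(default\)/ { print $1, $(NF - 1) }'
}

# Usage: kubectl get storageclass | default_storage_class_expansion
```

If the column layout differs on your kubectl version, read the value from the full table instead.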
Set ALLOWVOLUMEEXPANSION to true if it is not. This will allow the persistent volumes to be dynamically resized.
kubectl patch storageclass <storage-class-name> --patch '{"allowVolumeExpansion": true}'
Note
At this point, you could try to redeploy the support chart and see if it succeeds. If it doesn’t, continue with the steps below.
Delete the persistent volume claim for the Prometheus server. Persistent volume claims cannot be patched, so we will need to recreate it.
# List all PVCs in the support namespace
kubectl -n support get pvc
# Delete the Prometheus server PVC
kubectl -n support delete pvc support-prometheus-server
In another terminal with the CLUSTER_NAME variable set, redeploy the support chart. It should fail with the PVC in a Pending state.
deployer deploy-support $CLUSTER_NAME
Edit the persistent volume so that spec.claimRef carries the same UID and resource version as the newly created PVC.
# Get the UID and resource version of the PVC
kubectl -n support get pvc support-prometheus-server -o yaml
# Edit the PV to reference these values under `spec.claimRef`
kubectl edit pv <pv-name>
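The manual copy-and-paste of the UID and resource version could also be scripted. The following is a sketch only, assuming a merge patch on spec.claimRef has the same effect as the manual edit. The helper names are hypothetical, and it has not been verified against every cloud provider.

```shell
# Hypothetical helper: build the patch that points spec.claimRef at a PVC,
# given the PVC's UID and resource version.
claim_ref_patch() {
  printf '{"spec":{"claimRef":{"uid":"%s","resourceVersion":"%s"}}}' "$1" "$2"
}

# Hypothetical helper: read the UID and resource version from the new PVC
# and patch them into the named PV.
rebind_pv() {
  pv_name="$1"
  pvc_uid="$(kubectl -n support get pvc support-prometheus-server \
    -o jsonpath='{.metadata.uid}')"
  pvc_rv="$(kubectl -n support get pvc support-prometheus-server \
    -o jsonpath='{.metadata.resourceVersion}')"
  kubectl patch pv "$pv_name" --patch "$(claim_ref_patch "$pvc_uid" "$pvc_rv")"
}

# Usage: rebind_pv <pv-name>
```

Building the patch in a separate function makes it easy to print and inspect the JSON before anything is sent to the cluster.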
Delete the Prometheus server pod and check that it comes back up.
kubectl -n support delete pod support-prometheus-server-<hash>
kubectl -n support get pods --watch
Redeploy the support chart again and this time it should succeed.
deployer deploy-support $CLUSTER_NAME
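Once the deploy succeeds, it may be worth confirming that the PVC reports the new size. This is a minimal sketch using the PVC name from the steps above. The function name is a hypothetical helper.

```shell
# Hypothetical helper: show the size requested by the PVC next to the
# capacity currently reported for it.
prometheus_pvc_sizes() {
  kubectl -n support get pvc support-prometheus-server \
    -o custom-columns=REQUESTED:.spec.resources.requests.storage,CAPACITY:.status.capacity.storage
}

# Usage: prometheus_pvc_sizes
```

CAPACITY may lag behind REQUESTED for a short while if the resize is still in progress.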