Persistent storage for a k3s cluster

Kubernetes pods are ephemeral by design. Their replicas can be scaled up and down. They come and go. They can move around to various nodes in the system. When we want to store persistent data, this becomes an issue. Storing data locally on a node is possible, but tying a pod to a node removes many of the benefits of running on Kubernetes. Fortunately, Kubernetes provides a persistent storage system that lets us keep the ephemeral nature of our pods but still hang on to our precious data. We're going to set that up for our cluster in this article.

Materials needed

To follow along with this article you will need a running k3s cluster. You will also need a working NFS server. I have a separate article on how to set up a Raspberry Pi-based NFS server. Check that out first if you don't already have an NFS server. The files we create in this article can be downloaded here: https://gitlab.com/carpie/k3s_storage/-/archive/master/k3s_storage-master.zip

Kubernetes storage options

Kubernetes has many, many storage options available. So how do we decide? Well, many of the options are provided by cloud services. Those are probably not what we're after when building an edge cluster. There are other options that use a node's file system, but those tie our pods to specific nodes or have other limitations we don't want. After poking through the options, a classic NFS server seems like the best fit for our cluster. It's simple, reliable, easy to set up and configure, and it runs well on a Raspberry Pi. So that's what we'll use in this article.

Persistent storage in Kubernetes

Let's talk about how Kubernetes specifies persistent storage. The configuration is split into two parts: PersistentVolumes, or PVs, and PersistentVolumeClaims, or PVCs. Kubernetes can bind PVs to claims automatically. In other words, we can configure many PVs with specific access modes and capacities, then make a claim with a PVC, and Kubernetes will select a PV that meets those criteria and bind the two together. I actually set up my storage system this way originally: I had a bunch of generically named PVs exported from my NFS server and let Kubernetes pick for me when I made a claim. Over time, though, I migrated to explicitly created PVs with names specific to their purpose, which makes things like selective backups easier. I find binding PVs and PVCs at creation time more intuitive in a system where I'm in control of both the NFS server and the Kubernetes cluster. So, that is the method we'll use in this article.
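
Purely for illustration, a PV in that older, generically named style might look something like the sketch below. The nfs-pv-001 name and /opt/nfs/pv001 path are made up for this example; only the server IP is my real NFS server.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-001
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    path: /opt/nfs/pv001
    server: 192.168.0.3
  persistentVolumeReclaimPolicy: Retain

With no claimRef, Kubernetes is free to hand this PV to the first PVC that asks for a compatible access mode and a capacity it can satisfy.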

Adding persistent storage to our docker registry

Ok, enough talk! Let's build something. The first immediate need for persistent storage is the docker registry we created a few articles ago. We've been using an emptyDir volume, which only holds on to our docker images for as long as that particular registry pod exists; delete or reschedule the pod and the images are gone. Let's change that to real persistent storage.
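
If you still have the registry.yaml from that article handy, the volumes section we'll be replacing should look roughly like this (I'm reconstructing it from memory here; the volume name storage is the important part):

      volumes:
      - name: storage
        emptyDir: {}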

NFS server configuration

We'll start at the NFS server. We first need to log in to the server. My server IP is 192.168.0.3, so I'll just ssh into that.

ssh [email protected]

Let's make a directory for our registry on the NFS server.

mkdir -p /opt/nfs/registry

Now, we'll put an entry in /etc/exports for it.
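
Editing /etc/exports requires root, so open it with sudo in your editor of choice (nano is just my assumption here):

sudo nano /etc/exports

Then add the entry for our registry export: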

/opt/nfs/registry 192.168.0.48/29(rw,sync,no_subtree_check,no_root_squash)

We have to add an extra option, no_root_squash, because many of the docker images we use run as root in the container. Without this option, those images would not be able to write to the file system. We're using CIDR format here to limit access to this export to the IPs our cluster Pis use. Specifying a subnet of 192.168.0.48 with 29 mask bits limits access to this export to nodes with IPs of 192.168.0.49 to 192.168.0.54, which covers our nodes' IP range and gives us a little room for expansion. We didn't want to use a * entry here, as that would have left this export read/write to any host on the local network. If we wanted to limit it to just the nodes we have, we could have listed each node individually, each with its own options, as shown below. If you are interested in CIDR format, you can use a tool like subnet-calculator to plan out your IP address ranges.
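
For reference, an export listing individual hosts looks like this: each host gets its own parenthesized options, separated by spaces. The two IPs below are just the first couple of addresses from the range above; yours will depend on your cluster.

/opt/nfs/registry 192.168.0.50(rw,sync,no_subtree_check,no_root_squash) 192.168.0.51(rw,sync,no_subtree_check,no_root_squash)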

Ok, let's save that and quit. Now we tell NFS our exports have changed.

sudo exportfs -ar
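
If you want to double-check what actually got exported, and with which options, exportfs can also print the current export table:

sudo exportfs -v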

That's all we need to do on the NFS side. For future directories, we can just make the directory, duplicate the export line we just entered and change the path. That's it.
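
For example, if we later wanted an export for a media server (the /opt/nfs/media path here is purely hypothetical), the whole routine would be: make the directory,

mkdir -p /opt/nfs/media

add a matching line to /etc/exports,

/opt/nfs/media 192.168.0.48/29(rw,sync,no_subtree_check,no_root_squash)

and re-export:

sudo exportfs -ar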

Creating the PV and PVC for the registry

Now let's create the PV and PVC for our registry. Let's create a file named registry-pv.yaml. First the PV.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: docker-registry-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    path: /opt/nfs/registry
    server: 192.168.0.3
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    namespace: default
    name: docker-registry-pvc
---

We name the PV docker-registry-pv.

capacity is a required entry. It is used mostly when Kubernetes is matching claims to volumes for us, which we're not relying on here. Also, NFS won't enforce the capacity limit, so it doesn't matter much what we put. We'll just use 1Gi.

For accessModes we will put ReadWriteOnce because we want read/write access and the volume will only ever be mounted by the single node running our registry. The other modes, if you are curious, are ReadWriteMany and ReadOnlyMany.

For the NFS specific portion, we tell Kubernetes the full path of our export and of course the server IP.

We set persistentVolumeReclaimPolicy to Retain. The other option is Delete. This tells Kubernetes what to do with the underlying storage once the claim bound to this PV is deleted. For NFS, I don't think either option actually touches the data on the server, but retaining our data is what we want, so we'll be explicit about it.

Finally we use claimRef to tell Kubernetes that we only want it to assign this PV to a PVC specifically named docker-registry-pvc which we will create in a moment. Without this section, Kubernetes would give this PV to the first PVC that requested the same access mode and was under the capacity limit.

Now let's create the PVC. We add it to the same file just below the PV record.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: docker-registry-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

We name it docker-registry-pvc. We specify a matching access mode. And we request 1 gig of capacity.

Ok, save and quit. Let's apply the yaml file.

kubectl apply -f registry-pv.yaml

Now we can check both the PV and the PVC.

$ kubectl get pv
NAME                 CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                         STORAGECLASS   REASON   AGE
docker-registry-pv   1Gi        RWO            Retain           Bound    default/docker-registry-pvc                           18s

$ kubectl get pvc
NAME                  STATUS   VOLUME               CAPACITY   ACCESS MODES   STORAGECLASS   AGE
docker-registry-pvc   Bound    docker-registry-pv   1Gi        RWO                           28s
$

Notice that they are bound to each other. Now we can use the PVC name as a volume in any pod configuration file.
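
Just to illustrate that last point, here's a throwaway pod that mounts the claim. This pod is not part of our setup; the pvc-test name and the /data mount path are made up for the example.

apiVersion: v1
kind: Pod
metadata:
  name: pvc-test
spec:
  containers:
  - name: shell
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: docker-registry-pvc

Anything the container writes under /data would land on the NFS export.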

Updating docker registry to use the PVC

The next step is to swap out that emptyDir volume for our newly created PVC. Let's edit the registry.yaml we created in the docker registry article. The swapped section should look like this:

      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: docker-registry-pvc

The volume is still named storage, so to our docker registry image, nothing has changed. But underneath, instead of writing to a directory on the host, the pod will mount the NFS export and the registry will read and write there.

Ok, save and quit that. And reapply our configuration to update our registry deployment.

kubectl apply -f registry.yaml
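
If you want to watch the registry pod get recreated with the new volume, something like the following works. I'm assuming the deployment is named docker-registry; adjust for whatever yours is called.

kubectl rollout status deployment/docker-registry
kubectl get pods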

That's it!

Testing it out

Now let's make sure it's working. If we ssh into our NFS server and do an ls -LR /opt/nfs/registry, we will see it's empty. Let's push something to the registry.
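
In command form, that check is just:

ssh [email protected]
ls -LR /opt/nfs/registry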

We'll use the bride API from a previous article. Here are the steps to build and push it, repeated from that article. Remember, you may have different registry names!

git clone https://gitlab.com/carpie/bride-api
cd bride-api
cp /usr/bin/qemu-arm-static .
docker build -f Dockerfile.manual -t docker.carpie.net/bride-api:v1.0.0 .
docker push docker.carpie.net/bride-api:v1.0.0

Just a simple review of the steps. We get the source, copy qemu, build the image, and push it to the registry.

Now, back on the NFS server, if we repeat the ls -LR /opt/nfs/registry, we will see that the docker registry has written data (lots of data!) to the directory. It's working! Now our docker images will be persisted for as long as we want, no matter which node the registry runs on.
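
As one more sanity check, you can also ask the registry itself what it has stored via its v2 API, assuming your registry is reachable over HTTPS at docker.carpie.net like mine is:

curl https://docker.carpie.net/v2/_catalog

That should return a small JSON document with bride-api listed in the repositories array.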

The end...

With storage covered we've come to the end of this instructional series. Let's take a look back at what we've built. We have a fully functional multi-node Kubernetes cluster. We can automatically obtain authentic TLS certificates for encryption. We can create custom application images, from the convenience of our PC, and deploy them to our own local docker registry. We have the ability to persist data using our very own NFS server. We are now armed with all the tools we need to deploy nearly anything on our cluster! No small feat!

This is the end of the instructional series, but I still have a few articles I'd like to write about things I have deployed on my cluster. Also, my cluster is due an upgrade, so there may be an article about that if things have changed significantly. Finally, I'd love to hear about your cluster: things you've deployed, modifications you've made, ideas for how I could make my cluster better, and so on. Please share in the comments here or on the video version on YouTube!

With that, I hope you've enjoyed this series. That's all for this article. Thanks for reading!