Introduction
The deployment of a private container registry inside Kubernetes has been on my to-do list for a while now. DigitalOcean’s 2021 Kubernetes Challenge presented the perfect opportunity to explore this task using the popular open source registry Harbor.
Why Harbor?
Private container registries like Harbor offer many advantages over public ones (such as Docker Hub), especially for developers and/or organisations running applications (or multiple apps) at scale.
Some of Harbor’s benefits include:
- Security and vulnerability analysis of images
- Mitigating the impact of Docker Hub rate limits
- Identity integration and roles-based access control
- An extensible API and web UI
- Replication across many registries, including Harbor
Harbor is also an official Cloud Native Computing Foundation (CNCF) project. This, along with strong community engagement, makes Harbor a solid private registry choice.
What does this tutorial cover?
This guide covers the use of Terraform to automate and streamline the process of deploying a highly available Harbor registry on a DigitalOcean (DO) Kubernetes cluster.
Prerequisites
Before you begin this guide you’ll need the following:
- DigitalOcean Cloud Account & Personal Access Token (with Read/Write permissions)
- Spaces Access Keys
- A Pre-provisioned DigitalOcean Kubernetes Cluster (DOKS) [version ≥ 1.10].
- Terraform ≥ v0.15
- Beginner to intermediate knowledge of Kubernetes & Terraform
Deployment Plan
Architecture
The architecture for this high availability deployment has Harbor’s stateless components deployed as replicated pods on a K8s cluster. The storage layer (PostgreSQL, Redis & Object Storage) is provisioned as managed resources external to the cluster, but in the same region and VPC network.
It is possible to deploy Postgres & Redis (via Helm Charts) on the same DOKS cluster as Harbor. However, since DigitalOcean offers both in the form of managed services, the high availability of the storage and caching layer can be further abstracted outside the cluster and avoid the complexities that come with managing high availability databases on Kubernetes.
Additionally, a cloud load balancer and ingress controller can be deployed to enable external access to the registry. See the next tutorial for details on how to achieve this.
Automation
The Terraform module will automate the provisioning of Harbor’s requisite resources on DigitalOcean’s platform.
This includes the following:
- A Managed PostgreSQL & Redis Cluster
- A Cloud Firewall for the above databases (optional)
- Creation of the empty databases Harbor requires
- A dedicated Spaces bucket for Harbor
Once these resources are provisioned, the module will deploy Harbor on the cluster.
The module does not install an ingress controller. However, you can combine this module with one that installs your ingress controller of choice (e.g. Traefik) to enable external access to Harbor. This scenario is covered in the next tutorial.
Step 1 - Clone the Example Repository
Clone the example Terraform configuration repository: https://github.com/colinwilson/example-terraform-modules/tree/terraform-digitalocean-doks-harbor
Then switch to the `existing_doks_cluster` directory.
Step 2 - Set the Required Input Variables
The module’s default configuration requires only four inputs. Substitute the dummy values in the `terraform.tfvars` file with your DigitalOcean Personal Access Token, Spaces Access Key ID & Secret Key, and the name of your DOKS cluster:
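A minimal `terraform.tfvars` might look like the sketch below. Note the variable names here are illustrative assumptions — check the module’s `variables.tf` for the exact names it expects:

```hcl
# terraform.tfvars — replace the dummy values with your own credentials.
# Variable names are illustrative; confirm them against the module's variables.tf.
do_token             = "dop_v1_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" # DO Personal Access Token
spaces_access_id     = "XXXXXXXXXXXXXXXXXXXX"                    # Spaces Access Key ID
spaces_secret_key    = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" # Spaces Secret Key
doks_cluster_name    = "my-doks-cluster"                         # Name of your DOKS cluster
```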
Step 3 - Initialize the Terraform Configuration & Provision Harbor and its Resources
Still in the `example-terraform-modules/existing_doks_cluster` directory, run `terraform init` to initialize your configuration.
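For example:

```shell
cd example-terraform-modules/existing_doks_cluster
terraform init
```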
Terraform will proceed to download the required provider plugins.
Example `terraform init` output.
Now run `terraform apply` to apply your configuration and deploy the Harbor registry.
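```shell
terraform apply
```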
Respond to the prompt with `yes` to apply the changes and begin provisioning all resources.
Example `terraform apply` output.
Once the managed Postgres, Redis and Object Storage (Spaces) resources are provisioned, Harbor will be deployed on the DOKS cluster inside the `harbor` namespace.
You can confirm Harbor has been successfully deployed using `kubectl`:
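For example, listing the pods in the `harbor` namespace (exact pod names will vary with your release name) should show all of Harbor’s components in a `Running` state:

```shell
kubectl get pods -n harbor
```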
You can also view the provisioned resources via the DigitalOcean console:
Postgres & Redis Databases provisioned by the Harbor Terraform module.
Spaces bucket provisioned by the Harbor Terraform module.
Step 4 - Accessing the Harbor Registry Web UI
The module’s default configuration exposes Harbor via a Kubernetes service with an assigned ClusterIP. You can use `kubectl`’s `port-forward` feature to forward the Harbor service to a local port and access the UI.
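Something like the following should work — the service name `harbor` and target port `443` are assumptions based on the Harbor Helm chart’s defaults, so adjust them if your service differs (check with `kubectl get svc -n harbor`):

```shell
# Forward local port 8443 to the Harbor service's HTTPS port
kubectl -n harbor port-forward svc/harbor 8443:443
```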
You can now open a browser, navigate to https://127.0.0.1:8443/ and log in using the default username `admin` and password `Harbor12345`:
Harbor User Interface (UI) Login.
Step 5 - Configuring the Docker Client to Access Harbor
By default, Docker does not trust registries with self-signed certificates, so the Docker daemon needs to be configured to trust Harbor’s CA certificate.
First, retrieve the CA certificate from your Harbor deployment. You can either use `kubectl` to retrieve it from the secret associated with Harbor’s `nginx` pod:
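For instance — the secret name `harbor-nginx` and the `ca.crt` key are assumptions based on the Harbor Helm chart’s defaults, so verify them with `kubectl get secrets -n harbor` first:

```shell
# Extract the CA cert from the nginx secret and decode it to a local file
kubectl -n harbor get secret harbor-nginx \
  -o jsonpath='{.data.ca\.crt}' | base64 -d > harbor-ca.crt
```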
Or download it via the Harbor UI. After logging in, click on the default project, ‘library’, then click the ‘REGISTRY CERTIFICATE’ button to download the CA cert:
Now, using the Harbor registry’s domain name, create a directory for the certificate on the machine you plan to run `docker login` from (`harbor.local` is the default domain name configured by the module for the Harbor registry):
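Docker looks for per-registry certificates under `/etc/docker/certs.d/<registry-hostname>/`:

```shell
sudo mkdir -p /etc/docker/certs.d/harbor.local
```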
Then copy the `harbor-ca.crt` file to this location:
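```shell
# Docker treats files ending in .crt in this directory as trusted CA certificates
sudo cp harbor-ca.crt /etc/docker/certs.d/harbor.local/ca.crt
```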
You can now log in to Harbor using the `docker login` command:
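```shell
# Use the default credentials (admin / Harbor12345) when prompted
docker login harbor.local
```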
Step 6 - Push an Image to the Harbor Registry
Having logged in to Harbor, use the following commands to pull an nginx image from Docker Hub and then push it to the Harbor registry:
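For example, the following pulls the image, re-tags it for Harbor’s default `library` project, and pushes it:

```shell
docker pull nginx:latest
docker tag nginx:latest harbor.local/library/nginx:latest
docker push harbor.local/library/nginx:latest
```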
You can see in Harbor’s UI that the image was successfully pushed:
And if you check DigitalOcean’s dashboard you can see that the provisioned Spaces bucket is being utilised:
Caveats & Mitigations
While exploring the configuration necessary for this high availability deployment I did encounter a couple of minor issues.
Three core components utilised by Harbor for storage in an HA environment are PostgreSQL, Redis, and PVCs or Object Storage.
PVCs vs Object Storage
Harbor requires the ReadWriteMany (RWX) access mode if PVCs are to be used for image, chart and job log storage. Currently, DigitalOcean’s CSI driver (which leverages DO’s block storage) does not support RWX, so a dedicated Spaces bucket (DigitalOcean’s S3-compatible object storage product) is configured instead.
Connecting to Managed Redis
Redis wasn’t initially designed to be consumed outside a secure private network, so, like most cloud providers, DigitalOcean makes its managed Redis product accessible only via a TLS-secured connection. And since Harbor does not (currently) support secure connections to Redis, the module deploys a lightweight socat container as a DaemonSet to proxy the connection between Harbor and the managed Redis cluster.
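Conceptually, each socat pod runs something roughly equivalent to the command below — the hostname and port are placeholders, and depending on how certificate verification is set up you may need socat’s `verify=0` option:

```shell
# Listen for plain-text Redis connections locally and forward each one
# over TLS to the managed Redis cluster (host/port are placeholders).
socat TCP4-LISTEN:6379,reuseaddr,fork \
  OPENSSL:your-redis-cluster.db.ondigitalocean.com:25061
```

Harbor is then pointed at the local listener, which handles the TLS hop transparently.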
Summary
So you now have a highly available Harbor Registry running on a DigitalOcean Kubernetes cluster. In the next tutorial, I’ll cover how to combine this module with another that deploys the Traefik proxy and a valid TLS cert. This enables external access to Harbor without the need to re-configure your Docker daemon.
As always if you spot any mistakes in this guide or have any suggestions for improvement please do comment below.