Harbor is a CNCF-hosted, open source, trusted cloud-native Docker registry that can store, sign, and scan image content. Harbor extends the Docker registry with common enterprise features such as security and identity/rights management. It also supports replicating images between registries and provides more advanced capabilities such as user management, access control, and activity auditing. Recent versions have also added support for hosting Helm repositories.

private registry

Harbor's core function is to add a layer of permission protection on top of the Docker registry. To do this, commands such as docker login, pull, and push must be intercepted so that permission checks can run before the operation is performed. Docker registry v2 already supports this: it integrates a security-authentication hook that delegates authentication to an external service.

Harbor Authentication Principles

We said above that Docker registry v2 delegates security authentication to an external service, but how exactly? Let's type docker login https://registry.qikqiak.com on the command line to walk through the authentication process.

  • The docker client receives the docker login command from the user and translates it into a call to the Engine API's RegistryLogin method.
  • RegistryLogin calls the registry service's auth endpoint over HTTP.
  • Since we are using a v2 registry here, the loginV2 method is called, which issues a request to the /v2/ endpoint to authenticate.
  • This first request carries no token, so authentication fails with a 401 error, and the address of the authentication server is returned in the response header.
  • The registry client then sends an authentication request to that server; the request header carries the Base64-encoded username and password.
  • The authentication server extracts the username and password from the header and validates them against the actual authentication backend, for example by querying a user database or by checking against an LDAP service.
  • On success, a token is returned; the client repeats the request to the registry service with this token, the registry verifies it, and the request succeeds with status code 200.
  • The docker client receives the 200 status code, meaning the operation succeeded, and prints Login Succeeded on the console. At this point the whole login process is complete; the whole flow can be illustrated by the following flowchart.

flowchart

To make this login flow work, two key questions must be answered: how does the registry service know the address of the authentication service, and how does the registry recognize the token generated by the authentication service we provide?

The first question is easy to solve: the registry service has a configuration file in which the authentication service address can be specified, via the following section.

......
auth:
  token:
    realm: token-realm
    service: token-service
    issuer: registry-token-issuer
    rootcertbundle: /root/certs/bundle
......

The realm specifies the address of the authentication service; we will see this same configuration in Harbor below.

For the configuration of the registry, you can refer to the official documentation: https://docs.docker.com/registry/configuration/

The second question: how can the registry recognize the token we return? We must generate the token in our authentication server exactly the way the registry expects, not in an arbitrary format. So how do we generate it? The Docker registry source shows that the token is implemented as a JWT (JSON Web Token), so we generate a JWT accordingly.

If you are familiar with golang, you can clone Harbor’s code to see it. Harbor uses beego, a web development framework, so the source code is not particularly difficult to read. We can easily see how Harbor implements the authentication service part we explained above.

Harbor

Installation

Harbor involves quite a few components, so we will use Helm to install a highly available Harbor, which also matches how production environments are deployed. Before installing the HA version, we need the following prerequisites.

  • Kubernetes cluster version 1.10+
  • Helm version 2.8.0+
  • Highly available Ingress controller
  • Highly available PostgreSQL 9.6+ (Harbor does not do database HA deployments)
  • Highly available Redis service (not handled by Harbor)
  • PVCs that can be shared across nodes or external object stores

Most of Harbor's components are stateless, so we can simply increase the Pod replica counts to spread components across as many nodes as possible. At the storage layer, we need to provide our own highly available PostgreSQL and Redis clusters for application data, plus PVCs or object storage for images and Helm Charts.

load balance

First add the Chart repository address.

# add the Chart repository
$ helm repo add harbor https://helm.goharbor.io
# update
$ helm repo update
# pull version 1.9.2 and untar it
$ helm pull harbor/harbor --untar --version 1.9.2

Many parameters can be configured when installing Harbor; see the harbor-helm project for the full list. During installation we can specify parameters with --set or edit values.yaml directly.

  • Ingress: configured via expose.ingress.hosts.core and expose.ingress.hosts.notary.
  • External URL: configured via externalURL.
  • External PostgreSQL: set database.type to external and fill in the database.external section. We need to create 3 empty databases manually: Harbor core, Notary server, and Notary signer; Harbor creates the table structures automatically at startup.
  • External Redis: set redis.type to external and fill in the redis.external section. Harbor 2.1.0 introduced Redis Sentinel support, enabled by configuring sentinel_master_set; the host address can then be set to <host_sentinel1>:<port_sentinel1>,<host_sentinel2>:<port_sentinel2>,<host_sentinel3>:<port_sentinel3>. See https://community.pivotal.io/s/article/How-to-setup-HAProxy-and-Redis-Sentinel-for-automatic-failover-between-Redis-Master-and-Slave-servers for how to put an HAProxy in front of Redis to expose a single entry point.
  • Storage: by default a default StorageClass is required in the Kubernetes cluster to automatically provision PVs for images, charts, and task logs. To specify a StorageClass explicitly, use persistence.persistentVolumeClaim.registry.storageClass, persistence.persistentVolumeClaim.chartmuseum.storageClass, and persistence.persistentVolumeClaim.jobservice.storageClass. For HA, also set accessMode to ReadWriteMany so PVs can be shared across nodes. Existing PVCs can be reused via existingClaim. If no PVC can be shared across nodes, use external storage for images and Charts (azure, gcs, s3, swift, and oss are supported) and store task logs in the database: set persistence.imageChartStorage.type to the backend you want, fill in the corresponding section, and set jobservice.jobLogger to database.
  • Replicas: set portal.replicas, core.replicas, jobservice.replicas, registry.replicas, chartmuseum.replicas, notary.server.replicas, and notary.signer.replicas to n (n >= 2).

For example, here we use harbor.k8s.local as our primary domain and provide storage through an nfs-client StorageClass. Since we installed GitLab with its own postgresql and redis instances, we can configure Harbor to use these two external databases, which reduces resource usage (we can assume both databases are deployed in HA mode). To use external databases, however, you need to create the databases manually. For the GitLab PostgreSQL, go into the Pod and create harbor, notary_server, and notary_signer.

$ kubectl get pods -n kube-ops -l name=postgresql
NAME                          READY   STATUS    RESTARTS   AGE
postgresql-75b8447fb5-th6bw   1/1     Running   1          2d
$ kubectl exec -it postgresql-75b8447fb5-th6bw /bin/bash -n kube-ops
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
root@postgresql-75b8447fb5-th6bw:/var/lib/postgresql# sudo su - postgres
postgres@postgresql-75b8447fb5-th6bw:~$ psql
psql (12.3 (Ubuntu 12.3-1.pgdg18.04+1))
Type "help" for help.

postgres=# CREATE DATABASE harbor OWNER postgres;
CREATE DATABASE
postgres=# GRANT ALL PRIVILEGES ON DATABASE harbor to postgres;
GRANT
postgres=# GRANT ALL PRIVILEGES ON DATABASE harbor to gitlab;
GRANT
# TODO: create the other two databases the same way: notary_server, notary_signer
......
postgres-# \q  # quit

Once the databases are ready, Harbor can be installed with our own custom values file; the complete file is shown below.

# values-prod.yaml
externalURL: https://harbor.k8s.local
harborAdminPassword: Harbor12345
logLevel: debug

expose:
  type: ingress
  tls:
    enabled: true
  ingress:
    className: nginx  # specify the ingress class
    hosts:
      core: harbor.k8s.local
      notary: notary.k8s.local

persistence:
  enabled: true
  resourcePolicy: "keep"
  persistentVolumeClaim:
    registry:
      # For HA, components with multiple replicas need a backend that supports ReadWriteMany
      # We use NFS here; NFS is not recommended for production
      storageClass: "nfs-client"
      # For HA, multi-replica components need ReadWriteMany; the default is ReadWriteOnce
      accessMode: ReadWriteMany
      size: 5Gi
    chartmuseum:
      storageClass: "nfs-client"
      accessMode: ReadWriteMany
      size: 5Gi
    jobservice:
      storageClass: "nfs-client"
      accessMode: ReadWriteMany
      size: 1Gi
    trivy:
      storageClass: "nfs-client"
      accessMode: ReadWriteMany
      size: 2Gi

database:
  type: external
  external:
    host: "postgresql.kube-ops.svc.cluster.local"
    port: "5432"
    username: "gitlab"
    password: "passw0rd"
    coreDatabase: "harbor"
    notaryServerDatabase: "notary_server"
    notarySignerDatabase: "notary_signer"

redis:
  type: external
  external:
    addr: "redis.kube-ops.svc.cluster.local:6379"

# One replica by default; for HA, simply set replicas >= 2
portal:
  replicas: 1
core:
  replicas: 1
jobservice:
  replicas: 1
registry:
  replicas: 1
chartmuseum:
  replicas: 1
trivy:
  replicas: 1
notary:
  server:
    replicas: 1
  signer:
    replicas: 1

This configuration overrides the default values in Harbor's Chart package; we can now install directly.

$ cd harbor
$ helm upgrade --install harbor . -f values-prod.yaml -n kube-ops
Release "harbor" does not exist. Installing it now.
NAME: harbor
LAST DEPLOYED: Thu Jul  7 17:31:43 2022
NAMESPACE: kube-ops
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Please wait for several minutes for Harbor deployment to complete.
Then you should be able to visit the Harbor portal at https://harbor.k8s.local
For more details, please visit https://github.com/goharbor/harbor

Under normal circumstances the installation completes successfully after a short wait.

$ helm ls -n kube-ops
NAME    NAMESPACE       REVISION        UPDATED                                 STATUS          CHART           APP VERSION
harbor  kube-ops        1               2022-07-07 17:31:43.083547 +0800 CST    deployed        harbor-1.9.2    2.5.2
$ kubectl get pods -n kube-ops -l app=harbor
NAME                                    READY   STATUS    RESTARTS      AGE
harbor-chartmuseum-544ddbcb64-nvk7w     1/1     Running   0             30m
harbor-core-7fd9964685-lqqw2            1/1     Running   0             30m
harbor-jobservice-6dbd89c59-vvzx5       1/1     Running   0             30m
harbor-notary-server-764b8859bf-82f5q   1/1     Running   0             30m
harbor-notary-signer-869d9bf585-kbwwg   1/1     Running   0             30m
harbor-portal-74db6bb688-2w79p          1/1     Running   0             35m
harbor-registry-695db89bfd-v9wwt        2/2     Running   0             30m
harbor-trivy-0                          1/1     Running   0             35m

Once the installation is complete, we can resolve the domain name harbor.k8s.local to the Ingress Controller traffic entry point and then access it in the browser via that domain name.

$ kubectl get ingress -n kube-ops
NAME                                      CLASS         HOSTS                                               ADDRESS         PORTS     AGE
harbor-ingress                            nginx         harbor.k8s.local                                                    80, 443   12s
harbor-ingress-notary                     nginx         notary.k8s.local       

The username is the default admin, and the password is the Harbor12345 configured above. Note that you should access Harbor over https (http redirects to https by default); otherwise login may fail with a username or password error.

Harbor

Once you have logged in, you can access Harbor’s Dashboard page.

Harbor

We can see that there are many features. By default there is a project named library, which is publicly accessible. Inside a project there is also Helm Chart package management (you can upload charts manually here), and you can apply other per-project configuration to images.

Push images

Next, let’s test how to use the Harbor image repository in containerd.

First we need to configure the private image repository into containerd by modifying the containerd configuration file /etc/containerd/config.toml.

[plugins."io.containerd.grpc.v1.cri".registry]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
      endpoint = ["https://bqr1dr1n.mirror.aliyuncs.com"]
  [plugins."io.containerd.grpc.v1.cri".registry.configs]
    [plugins."io.containerd.grpc.v1.cri".registry.configs."harbor.k8s.local".tls]
      insecure_skip_verify = true
    [plugins."io.containerd.grpc.v1.cri".registry.configs."harbor.k8s.local".auth]
      username = "admin"
      password = "Harbor12345"

Add the configuration for harbor.k8s.local under plugins."io.containerd.grpc.v1.cri".registry.configs: insecure_skip_verify = true skips certificate verification, and the plugins."io.containerd.grpc.v1.cri".registry.configs."harbor.k8s.local".auth section configures the username and password for the Harbor image registry.

Restart containerd after the configuration is complete.

$ systemctl restart containerd

Now we use nerdctl to log in.

$ nerdctl login -u admin harbor.k8s.local
Enter Password:
ERRO[0004] failed to call tryLoginWithRegHost            error="failed to call rh.Client.Do: Get \"https://harbor.k8s.local/v2/\": x509: certificate signed by unknown authority" i=0
FATA[0004] failed to call rh.Client.Do: Get "https://harbor.k8s.local/v2/": x509: certificate signed by unknown authority

You can see that it still reports a certificate-related error; adding the --insecure-registry flag solves the problem.

$ nerdctl login -u admin --insecure-registry harbor.k8s.local
Enter Password:
WARN[0004] skipping verifying HTTPS certs for "harbor.k8s.local"
WARNING: Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

Next, pull an arbitrary image.

$ nerdctl pull busybox:1.35.0
docker.io/library/busybox:1.35.0:                                                 resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:8c40df61d40166f5791f44b3d90b77b4c7f59ed39a992fd9046886d3126ffa68:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:8cde9b8065696b65d7b7ffaefbab0262d47a5a9852bfd849799559d296d2e0cd: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:d8c0f97fc6a6ac400e43342e67d06222b27cecdb076cbf8a87f3a2a25effe81c:   done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:fc0cda0e09ab32c72c61d272bb409da4e2f73165c7bf584226880c9b85438e63:    done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 83.7s

Then retag the image with its address on Harbor.

$ nerdctl tag busybox:1.35.0 harbor.k8s.local/library/busybox:1.35.0

Then execute the push command to push the image to Harbor.

$ nerdctl push --insecure-registry harbor.k8s.local/library/busybox:1.35.0
INFO[0000] pushing as a reduced-platform image (application/vnd.docker.distribution.manifest.list.v2+json, sha256:29fe0126b13c3ea2641ca42c450fa69583d212dbd9b7b623814977b5b0945726)
WARN[0000] skipping verifying HTTPS certs for "harbor.k8s.local"
index-sha256:29fe0126b13c3ea2641ca42c450fa69583d212dbd9b7b623814977b5b0945726:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:8cde9b8065696b65d7b7ffaefbab0262d47a5a9852bfd849799559d296d2e0cd: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:d8c0f97fc6a6ac400e43342e67d06222b27cecdb076cbf8a87f3a2a25effe81c:   done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 6.9 s      

Once the push is complete, we can see the information about this image on the Portal page.

Portal page

The image push succeeds, and you can also test the pull.

$ nerdctl rmi harbor.k8s.local/library/busybox:1.35.0
Untagged: harbor.k8s.local/library/busybox:1.35.0@sha256:8c40df61d40166f5791f44b3d90b77b4c7f59ed39a992fd9046886d3126ffa68
Deleted: sha256:cf4ac4fc01444f1324571ceb0d4f175604a8341119d9bb42bc4b2cb431a7f3a5
$ nerdctl rmi busybox:1.35.0
Untagged: docker.io/library/busybox:1.35.0@sha256:8c40df61d40166f5791f44b3d90b77b4c7f59ed39a992fd9046886d3126ffa68
Deleted: sha256:cf4ac4fc01444f1324571ceb0d4f175604a8341119d9bb42bc4b2cb431a7f3a5
$ nerdctl pull --insecure-registry  harbor.k8s.local/library/busybox:1.35.0
WARN[0000] skipping verifying HTTPS certs for "harbor.k8s.local"
harbor.k8s.local/library/busybox:1.35.0:                                          resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:29fe0126b13c3ea2641ca42c450fa69583d212dbd9b7b623814977b5b0945726:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:8cde9b8065696b65d7b7ffaefbab0262d47a5a9852bfd849799559d296d2e0cd: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:d8c0f97fc6a6ac400e43342e67d06222b27cecdb076cbf8a87f3a2a25effe81c:   done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:fc0cda0e09ab32c72c61d272bb409da4e2f73165c7bf584226880c9b85438e63:    done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.7 s                                                                    total:  2.2 Ki (3.2 KiB/s)

$ nerdctl images
REPOSITORY                          TAG       IMAGE ID        CREATED           PLATFORM       SIZE         BLOB SIZE
harbor.k8s.local/library/busybox    1.35.0    29fe0126b13c    17 seconds ago    linux/amd64    1.2 MiB      757.7 KiB

Note, however, that when using containerd directly, e.g. accessing the Harbor registry via the ctr command, the registry configuration above does not take effect (it only applies to the CRI plugin), so you will still hit certificate errors and need to skip certificate verification or specify the certificate path on the command line.

# Option 1: pass -k to skip certificate verification
$ ctr images pull --user admin:Harbor12345 -k harbor.k8s.local/library/busybox:1.35.0

# Option 2: specify the CA certificate / Harbor certificate file path
$ ctr images pull --user admin:Harbor12345 --tlscacert ca.crt harbor.k8s.local/library/busybox:1.35.0

However, if you use crictl directly, it works, since crictl goes through the CRI plugin, which does read the configuration above.

$ crictl pull harbor.k8s.local/library/busybox@sha256:29fe0126b13c3ea2641ca42c450fa69583d212dbd9b7b623814977b5b0945726
Image is up to date for sha256:d8c0f97fc6a6ac400e43342e67d06222b27cecdb076cbf8a87f3a2a25effe81c

To use the registry in Kubernetes, you need to add the Harbor authentication information to the cluster in the form of a Secret.

$ kubectl create secret docker-registry harbor-auth --docker-server=https://harbor.k8s.local --docker-username=admin --docker-password=Harbor12345 --docker-email=info@ydzs.io -n default

Then we use the private image repository above to create a Pod.

# test-harbor.yaml
apiVersion: v1
kind: Pod
metadata:
  name: harbor-registry-test
spec:
  containers:
  - name: test
    image: harbor.k8s.local/library/busybox:1.35.0
    args:
    - sleep
    - "3600"
  imagePullSecrets:
  - name: harbor-auth

Once created, check whether the Pod pulls the image successfully.

$ kubectl describe pod harbor-registry-test
Name:         harbor-registry-test
Namespace:    default
Priority:     0
Node:         node1/192.168.0.107
Start Time:   Thu, 07 Jul 2022 18:52:39 +0800
# ......
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  10s   default-scheduler  Successfully assigned default/harbor-registry-test to node1
  Normal  Pulling    10s   kubelet            Pulling image "harbor.k8s.local/library/busybox:1.35.0"
  Normal  Pulled     5s    kubelet            Successfully pulled image "harbor.k8s.local/library/busybox:1.35.0" in 4.670528883s
  Normal  Created    5s    kubelet            Created container test
  Normal  Started    5s    kubelet            Started container test

This proves that our private image registry was built successfully. You can try creating a private project, then creating a new user and using that user to pull/push images. Harbor also has other features, such as image replication and Helm Chart package hosting, which you can test yourself to feel the difference between Harbor and the stock registry.