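Managing Kubernetes Resources with Velero

Background

In development and test environments, it is common to find that a YAML file has been modified or that someone has deleted a Service. When that happens, the workload has to be restored right away so the environment stays usable. In production, backups matter even more, because any single mistaken operation can affect the business.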

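I. Introduction

Velero is an open-source tool from VMware for backing up and migrating Kubernetes clusters. It handles routine cluster backups so that resources can be restored quickly when problems like the ones above occur. It also supports cluster migration, moving Kubernetes resources from one cluster to another. Velero stores cluster resources in object storage; by default it can use object storage from AWS, Azure, or GCP. This article uses the AWS plugin to store backups in Tencent Cloud COS, which is S3-compatible.

Features

  • Cluster backup
  • Backup scheduling
  • Backup hooks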

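II. Prerequisites

  • kubectl is installed

  • A Kubernetes cluster is up and running

  • The velero client is downloaded

    • macOS: install with Homebrew: brew install velero

    • Linux: download the tarball from the GitHub Releases page and extract it: tar -xvf <RELEASE-TARBALL-NAME>.tar.gz

    • Windows: install with Chocolatey: choco install velero

    • Any platform: download the archive for your platform from the Releases page and extract the velero binary into a directory on your PATH.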

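III. Deployment

This walkthrough uses Tencent Cloud COS as the example.

1. Create a bucket (the COS object storage service must be enabled first)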

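2. Create the credentials file

To find the AK (access key) and SK (secret key) used here, log in to Tencent Cloud and go to [Access Management] - [Access Keys] - [API Key Management].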

vim credentials-velero
[default]
aws_access_key_id = xxx
aws_secret_access_key = xxx
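
As an alternative to editing the file by hand, the credentials file can be generated from environment variables so the keys never sit in shell history. The following is a minimal sketch, not part of the original setup: the function name write_velero_credentials and the variable names COS_SECRET_ID and COS_SECRET_KEY are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch: write the credentials file in the [default] profile format shown above.
# COS_SECRET_ID / COS_SECRET_KEY are hypothetical variable names (assumption).
write_velero_credentials() {
    target="${1:-./credentials-velero}"
    if [ -z "${COS_SECRET_ID:-}" ] || [ -z "${COS_SECRET_KEY:-}" ]; then
        echo "COS_SECRET_ID and COS_SECRET_KEY must be set" >&2
        return 1
    fi
    # The subshell keeps the umask change local; a newly created file
    # ends up readable and writable by its owner only.
    (
        umask 077
        printf '[default]\naws_access_key_id = %s\naws_secret_access_key = %s\n' \
            "$COS_SECRET_ID" "$COS_SECRET_KEY" > "$target"
    )
}
```

With both variables exported, running write_velero_credentials ./credentials-velero would produce the file that the --secret-file option of the install step expects.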

3. Write the install script

vim velero-install.sh
velero install \
    --provider aws \
    --bucket bucket-name \
    --prefix "backup" \
    --namespace velero \
    --secret-file ./credentials-velero \
    --velero-pod-cpu-request 200m \
    --velero-pod-mem-request 200Mi \
    --velero-pod-cpu-limit 200m \
    --velero-pod-mem-limit 200Mi \
    --use-volume-snapshots=false \
    --use-restic \
    --restic-pod-cpu-request 200m \
    --restic-pod-mem-request 200Mi \
    --restic-pod-cpu-limit 200m \
    --restic-pod-mem-limit 200Mi \
    --plugins velero/velero-plugin-for-aws:v1.1.0 \
    --backup-location-config region=<REGION>,s3ForcePathStyle="false",s3Url=http://cos.<REGION>.myqcloud.com
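
The --backup-location-config value repeats the region twice, once as the region key and once inside the COS endpoint URL, so it is easy to update one and forget the other. A small helper can build the string from a single argument. This is a sketch under assumptions: the function name cos_backup_location_config is made up for illustration, and the output omits the quotes around false because the shell would have removed them from the original command line anyway.

```shell
#!/bin/sh
# Sketch: build the --backup-location-config value for Tencent Cloud COS
# from one region argument. The function name is hypothetical.
cos_backup_location_config() {
    region="${1:-}"
    if [ -z "$region" ]; then
        echo "usage: cos_backup_location_config <region>" >&2
        return 1
    fi
    # Same three keys as in the install script: region, s3ForcePathStyle, s3Url.
    printf 'region=%s,s3ForcePathStyle=false,s3Url=http://cos.%s.myqcloud.com\n' \
        "$region" "$region"
}
```

For example, with ap-guangzhou as an illustrative region (not taken from the original), the install script could pass --backup-location-config "$(cos_backup_location_config ap-guangzhou)" instead of a hand-written string.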

4. Run the installation

./velero-install.sh
CustomResourceDefinition/backups.velero.io: attempting to create resource
CustomResourceDefinition/backups.velero.io: created
CustomResourceDefinition/backupstoragelocations.velero.io: attempting to create resource
CustomResourceDefinition/backupstoragelocations.velero.io: created
CustomResourceDefinition/deletebackuprequests.velero.io: attempting to create resource
CustomResourceDefinition/deletebackuprequests.velero.io: created
CustomResourceDefinition/downloadrequests.velero.io: attempting to create resource
CustomResourceDefinition/downloadrequests.velero.io: created
CustomResourceDefinition/podvolumebackups.velero.io: attempting to create resource
CustomResourceDefinition/podvolumebackups.velero.io: created
CustomResourceDefinition/podvolumerestores.velero.io: attempting to create resource
CustomResourceDefinition/podvolumerestores.velero.io: created
CustomResourceDefinition/resticrepositories.velero.io: attempting to create resource
CustomResourceDefinition/resticrepositories.velero.io: created
CustomResourceDefinition/restores.velero.io: attempting to create resource
CustomResourceDefinition/restores.velero.io: created
CustomResourceDefinition/schedules.velero.io: attempting to create resource
CustomResourceDefinition/schedules.velero.io: created
CustomResourceDefinition/serverstatusrequests.velero.io: attempting to create resource
CustomResourceDefinition/serverstatusrequests.velero.io: created
CustomResourceDefinition/volumesnapshotlocations.velero.io: attempting to create resource
CustomResourceDefinition/volumesnapshotlocations.velero.io: created
Waiting for resources to be ready in cluster...
Namespace/velero: attempting to create resource
Namespace/velero: created
ClusterRoleBinding/velero: attempting to create resource
ClusterRoleBinding/velero: created
ServiceAccount/velero: attempting to create resource
ServiceAccount/velero: created
Secret/cloud-credentials: attempting to create resource
Secret/cloud-credentials: created
BackupStorageLocation/default: attempting to create resource
BackupStorageLocation/default: created
Deployment/velero: attempting to create resource
Deployment/velero: created
DaemonSet/restic: attempting to create resource
DaemonSet/restic: created
Velero is installed! ⛵ Use 'kubectl logs deployment/velero -n velero' to view the status.

5. Check the pods

kubectl get po -n velero
NAME                     READY   STATUS    RESTARTS   AGE
restic-f5jk8             1/1     Running   0          8s
restic-kk7xj             1/1     Running   0          29s
restic-qddwg             1/1     Running   0          38s
restic-vn9gk             1/1     Running   0          35s
velero-8fb9578d5-xkqbv   1/1     Running   0          39s

IV. Usage

1. Create a backup

velero create backup cloud-industry --include-namespaces cloud-industry
Backup request "cloud-industry" submitted successfully.
Run `velero backup describe cloud-industry` or `velero backup logs cloud-industry` for more details.

2. View backup details

velero backup describe cloud-industry
Name:         cloud-industry
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.18.2
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=18

Phase:  Completed

Errors:    0
Warnings:  0

Namespaces:
  Included:  cloud-industry
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Storage Location:  default

Velero-Native Snapshot PVs:  auto

TTL:  720h0m0s

Hooks:  <none>

Backup Format Version:  1

Started:    2020-09-09 22:32:47 +0800 CST
Completed:  2020-09-09 22:33:19 +0800 CST

Expiration:  2020-10-09 22:32:47 +0800 CST

Total items to be backed up:  2305
Items backed up:              2305

Velero-Native Snapshots: <none included>

3. List backups

  • With the velero command
velero backup get
NAME             STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
cloud-industry   Completed   0        0          2020-09-09 22:32:47 +0800 CST   29d       default            <none>
  • With s3cmd
s3cmd ls s3://bucket-name-1257948216/backup/backups/cloud-industry/
2020-09-09 14:33           29  s3://bucket-name-1257948216/backup/backups/cloud-industry/cloud-industry-csi-volumesnapshotcontents.json.gz
2020-09-09 14:33           29  s3://bucket-name-1257948216/backup/backups/cloud-industry/cloud-industry-csi-volumesnapshots.json.gz
2020-09-09 14:33        55047  s3://bucket-name-1257948216/backup/backups/cloud-industry/cloud-industry-logs.gz
2020-09-09 14:33           29  s3://bucket-name-1257948216/backup/backups/cloud-industry/cloud-industry-podvolumebackups.json.gz
2020-09-09 14:33        20424  s3://bucket-name-1257948216/backup/backups/cloud-industry/cloud-industry-resource-list.json.gz
2020-09-09 14:33           29  s3://bucket-name-1257948216/backup/backups/cloud-industry/cloud-industry-volumesnapshots.json.gz
2020-09-09 14:33       368225  s3://bucket-name-1257948216/backup/backups/cloud-industry/cloud-industry.tar.gz
2020-09-09 14:33         1108  s3://bucket-name-1257948216/backup/backups/cloud-industry/velero-backup.json

4. Restore

velero restore create cloud-industry --from-backup cloud-industry
Restore request "cloud-industry" submitted successfully.
Run `velero restore describe cloud-industry` or `velero restore logs cloud-industry` for more details.

V. Advanced Usage

The steps above cover manual backup and restore. Next, we use schedules to automate backups.

1. Exclude namespaces from backup

kubectl label -n kube-system namespace/kube-system velero.io/exclude-from-backup=true
namespace/kube-system labeled
kubectl label -n kube-public namespace/kube-public velero.io/exclude-from-backup=true
namespace/kube-public labeled
kubectl label -n default namespace/default velero.io/exclude-from-backup=true
namespace/default labeled

2. Set up a scheduled backup with schedule

# Back up daily at 02:00 and keep each backup for 72 hours (3 days)
velero create schedule auto-backup --schedule="0 2 * * *" --ttl 72h
Schedule "auto-backup" created successfully.
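
The --ttl value above is written in hours, so a retention period stated in days has to be converted first (3 days becomes 72h, and the 720h0m0s shown in the backup details earlier corresponds to 30 days). A minimal sketch of that arithmetic follows; the helper name days_to_ttl is hypothetical and not part of velero.

```shell
#!/bin/sh
# Sketch: convert a retention period in whole days into an hour-based
# value suitable for --ttl. The helper name is hypothetical.
days_to_ttl() {
    days="${1:-}"
    case "$days" in
        # Reject empty input, non-digits, and leading zeros
        # (a leading zero would be read as octal by shell arithmetic).
        ''|*[!0-9]*|0?*)
            echo "usage: days_to_ttl <whole number of days>" >&2
            return 1
            ;;
    esac
    printf '%sh\n' "$((days * 24))"
}
```

It could then be used inline, for example: velero create schedule auto-backup --schedule="0 2 * * *" --ttl "$(days_to_ttl 3)".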

3. Other scheduled backup examples

# Back up daily at 01:00
velero create schedule <SCHEDULE NAME> --schedule="0 1 * * *"
# Back up daily at 02:00 and keep each backup for 72 hours
velero create schedule <SCHEDULE NAME> --schedule="0 2 * * *" --ttl 72h
# Back up every 6 hours
velero create schedule <SCHEDULE NAME> --schedule="@every 6h"
# Back up the web namespace every 24 hours
velero create schedule <SCHEDULE NAME> --schedule="@every 24h" --include-namespaces web

Note

Backups created by a schedule are named <SCHEDULE NAME>-<TIMESTAMP>. To restore from one, run: velero restore create --from-backup <SCHEDULE NAME>-<TIMESTAMP>
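
Because each scheduled backup carries its own timestamp, restoring "the latest one" means finding the newest name first. The sketch below picks it from a list of backup names read on standard input. It rests on two assumptions that should be checked against your own backups: that <TIMESTAMP> is a 14-digit value of the form YYYYMMDDhhmmss (so that sorting the names as text also sorts them by time), and that the schedule name contains no regular-expression metacharacters. The function name latest_scheduled_backup is hypothetical.

```shell
#!/bin/sh
# Sketch: print the newest backup name belonging to one schedule.
# Reads backup names from stdin, one per line. Function name is hypothetical.
latest_scheduled_backup() {
    schedule="${1:-}"
    if [ -z "$schedule" ]; then
        echo "usage: latest_scheduled_backup <schedule name>" >&2
        return 1
    fi
    # Keep only names shaped like <schedule>-<14 digits> (assumed timestamp
    # format), sort them as text, and keep the last (newest) one.
    grep -E "^${schedule}-[0-9]{14}\$" | sort | tail -n 1
}
```

One possible way to feed it is velero backup get | awk 'NR>1 {print $1}' | latest_scheduled_backup auto-backup, which strips the header row and keeps the NAME column; the result can then be passed to velero restore create --from-backup.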


4. List scheduled backups

velero schedule get
NAME          STATUS    CREATED                         SCHEDULE    BACKUP TTL   LAST BACKUP   SELECTOR
auto-backup   Enabled   2020-09-09 23:12:09 +0800 CST   0 2 * * *   72h0m0s      1m ago        <none>
