Rook Ceph Deployment

Deploying Rook Ceph in a Kubernetes environment.

Installing rook-ceph

Environment:

Ubuntu 20.04.3 LTS
Docker 20.10.9
docker-compose 1.29.2
Ceph 16.2.6 Pacific (stable)
Ceph Dashboard credentials: admin / password

Deployment layout

IP              Components
192.168.22.161 docker/docker-compose/harbor
192.168.22.162 rancher
192.168.22.163 master
192.168.22.164 rancher node
192.168.22.165 rancher node
192.168.22.166 rancher node
192.168.22.167 node
192.168.22.168 node
192.168.22.169 node
git clone --single-branch --branch release-1.7 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph/
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
# verify the rook-ceph-operator is in the `Running` state before proceeding 
kubectl -n rook-ceph get pod
# create the cluster
kubectl create -f cluster.yaml
kubectl -n rook-ceph get pod # image pulls may fail at first; they are retried automatically, so give it some time. If errors persist, see the official troubleshooting guide linked below.


kubectl create -f dashboard-external-https.yaml # expose the dashboard management port via a NodePort service


kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo # get the dashboard password
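To reach the dashboard, look up the NodePort assigned to the external service. The service name below is the one created by the Rook example manifest; verify it with `kubectl -n rook-ceph get svc` if yours differs:

```shell
# print the NodePort of the externally exposed dashboard service
kubectl -n rook-ceph get svc rook-ceph-mgr-dashboard-external-https \
  -o jsonpath='{.spec.ports[0].nodePort}' && echo
# then open https://<any-node-ip>:<node-port> and log in as admin
```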

cd rook/cluster/examples/kubernetes/ceph/csi/rbd
kubectl create -f storageclass.yaml
kubectl create -f snapshotclass.yaml
kubectl get sc # list StorageClasses
kubectl patch storageclasses rook-ceph-block -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' # set as the default StorageClass
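As a quick sanity check of dynamic provisioning, a small PVC can be created against the new class (the claim name here is made up for the test):

```yaml
# test-pvc.yaml -- hypothetical verification claim, delete after checking
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 1Gi
```

Apply it with `kubectl apply -f test-pvc.yaml` and confirm it reaches the Bound state via `kubectl get pvc`.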

Installing the Rook Toolbox

The Rook Toolbox is a container running in the rook-ceph namespace that provides the Ceph CLI for administration tasks. It is quite handy and installing it is recommended. Run the following from rook/cluster/examples/kubernetes/ceph/:

kubectl create -f toolbox.yaml

You can then enter the Rook Toolbox container with:

kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash

Then run `ceph status` to check the cluster state. A healthy cluster shows output similar to:

# ceph status
  cluster:
    id:     82c597dd-bcf3-4304-86b1-fabe89d31f1b
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 16h)
    mgr: a(active, since 16h)
    osd: 3 osds: 3 up (since 16h), 3 in (since 16h)

  data:
    pools:   2 pools, 160 pgs
    objects: 1 objects, 19 B
    usage:   20 MiB used, 87 GiB / 87 GiB avail
    pgs:     160 active+clean

Make sure health is HEALTH_OK. If it is not, track down and fix the cause before continuing; see the official troubleshooting guide linked below.
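When the status is anything other than HEALTH_OK, these standard Ceph CLI commands, run inside the toolbox, usually narrow the problem down:

```shell
ceph health detail   # explanation for each warning/error
ceph osd tree        # all OSDs should be "up"
ceph osd status      # per-OSD usage and state
ceph mon stat        # monitor quorum
```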

Configuring external-snapshotter

git clone https://github.com/kubernetes-csi/external-snapshotter/

Change into the CRD directory:

~/external-snapshotter/client/config/crd$ ls
snapshot.storage.k8s.io_volumesnapshotclasses.yaml  snapshot.storage.k8s.io_volumesnapshotcontents.yaml  snapshot.storage.k8s.io_volumesnapshots.yaml

Create the VolumeSnapshot CRDs

kubectl create -f snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl create -f snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl create -f snapshot.storage.k8s.io_volumesnapshots.yaml

Then install the snapshot controller's RBAC:

~/external-snapshotter/deploy/kubernetes/snapshot-controller$ ls
rbac-snapshot-controller.yaml  setup-snapshot-controller.yaml
kubectl create -f rbac-snapshot-controller.yaml

Edit setup-snapshot-controller.yaml and switch the gcr.io image to the Aliyun mirror:

# This YAML file shows how to deploy the snapshot controller

# The snapshot controller implements the control loop for CSI snapshot functionality.
# It should be installed as part of the base Kubernetes distribution in an appropriate
# namespace for components implementing base system functionality. For installing with
# Vanilla Kubernetes, kube-system makes sense for the namespace.

---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: snapshot-controller
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: snapshot-controller
  # the snapshot controller won't be marked as ready if the v1 CRDs are unavailable
  # in #504 the snapshot-controller will exit after around 7.5 seconds if it
  # can't find the v1 CRDs so this value should be greater than that
  minReadySeconds: 15
  strategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: snapshot-controller
    spec:
      serviceAccount: snapshot-controller
      containers:
        - name: snapshot-controller
          #image: k8s.gcr.io/sig-storage/snapshot-controller:v4.2.1
          image: registry.aliyuncs.com/google_containers/snapshot-controller:v4.2.1
          args:
            - "--v=5"
            - "--leader-election=true"
          imagePullPolicy: IfNotPresent

Deploying the snapshot-controller

kubectl create -f setup-snapshot-controller.yaml
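With the controller running, a snapshot of an existing PVC can be requested with a manifest like the following (the snapshot name is a placeholder; csi-rbdplugin-snapclass is the class created earlier from the Rook rbd examples):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: rbd-pvc-snapshot          # hypothetical name
spec:
  volumeSnapshotClassName: csi-rbdplugin-snapclass
  source:
    persistentVolumeClaimName: my-pvc   # replace with an existing PVC
```

After applying, `kubectl get volumesnapshot` should show READYTOUSE as true once the snapshot completes.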

Verifying with kubestr

wget https://github.com/kastenhq/kubestr/releases/download/v0.4.23/kubestr-v0.4.23-linux-amd64.tar.gz
tar zxf kubestr-v0.4.23-linux-amd64.tar.gz
~$ ./kubestr

**************************************
  _  ___   _ ___ ___ ___ _____ ___
  | |/ / | | | _ ) __/ __|_   _| _ \
  | ' <| |_| | _ \ _|\__ \ | | |   /
  |_|\_\\___/|___/___|___/ |_| |_|_\

Explore your Kubernetes storage options
**************************************
Kubernetes Version Check:
  Valid kubernetes version (v1.22.0)  -  OK

RBAC Check:
  Kubernetes RBAC is enabled  -  OK

Aggregated Layer Check:
  The Kubernetes Aggregated Layer is enabled  -  OK

Available Storage Provisioners:

  rook-ceph.rbd.csi.ceph.com:
    Missing CSIDriver Object. Required by some provisioners.
    This is a CSI driver!
    (The following info may not be up to date. Please check with the provider for more information.)
    Provider:            Ceph RBD
    Website:             https://github.com/ceph/ceph-csi
    Description:         A Container Storage Interface (CSI)  Driver for Ceph RBD
    Additional Features: Raw Block, Snapshot, Expansion, Topology, Cloning

    Storage Classes:
      * rook-ceph-block
    Volume Snapshot Classes:
      * csi-rbdplugin-snapclass

    To perform a FIO test, run-
      ./kubestr fio -s <storage class>

    To test CSI snapshot/restore functionality, run-
      ./kubestr csicheck -s <storage class> -v <volume snapshot class>
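Following the hints at the end of that output, the two checks for this cluster would be (class and snapshot class names taken from the listing above):

```shell
./kubestr fio -s rook-ceph-block                                  # storage performance test
./kubestr csicheck -s rook-ceph-block -v csi-rbdplugin-snapclass  # snapshot/restore round-trip
```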

Appendix

# rescan all SCSI hosts so a newly added disk is detected without a reboot
# (the redirect must run as root, hence tee; `sudo echo ... >` would fail because the redirect runs as the invoking user)
for i in `ls /sys/class/scsi_host/`; do echo "- - -" | sudo tee /sys/class/scsi_host/$i/scan; done
# remove a disk without a reboot -- double-check the target device before running
echo 1 | sudo tee /sys/bus/scsi/drivers/sd/32\:0\:1\:0/delete

References:

Deploying Rook-Ceph distributed storage in Kubernetes: a quick walkthrough

Getting started with Rook, cloud-native storage orchestration (bilingual)

Ceph Dashboard

Official troubleshooting guide

Updated 2021-10-25