k8s StatefulSet

Introduction to StatefulSet

Deployment cannot cover every application-orchestration scenario, because from its point of view all Pods of an application are identical: they have no ordering, and it does not matter which host they run on. When more are needed, the Deployment creates new Pods from the Pod template; when they are no longer needed, it simply "kills" any one of them.
In real-world scenarios, however, not every application fits this model. Examples include master-slave and active-standby relationships, as well as data-storage applications: each instance usually keeps a copy of its data on local disk, and once an instance is killed, even if it is rebuilt, the mapping between the instance and its data is lost and the application fails.
Applications whose instances are not interchangeable, or that depend on one another, are called "stateful applications" (Stateful Application).
To support stateful applications, Kubernetes extends the Deployment concept with a new controller: StatefulSet.
StatefulSet is designed to solve the problem of stateful services (whereas Deployments and ReplicaSets are designed for stateless services). Its use cases include:
  • Stable persistent storage: a Pod can still reach the same persisted data after being rescheduled, implemented with PVCs
  • Stable network identity: a Pod keeps the same PodName and HostName after being rescheduled, implemented with a Headless Service (a Service without a Cluster IP)
  • Ordered deployment and scaling: Pods are ordered and are created in the defined order (from 0 to N-1; every earlier Pod must be Running and Ready before the next one starts), handled by the StatefulSet controller's ordered Pod management
  • Ordered scale-down and deletion (from N-1 down to 0)
A StatefulSet is made up of the following parts (a minimal skeleton is sketched after this list):
  • A Headless Service that defines the network identity (DNS domain)
  • volumeClaimTemplates that create the PersistentVolumeClaims (which bind to PersistentVolumes)
  • The StatefulSet itself, which defines the application
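For reference, a minimal skeleton of these three parts might look like the following. This is only an illustrative sketch, separate from the Consul example below: the names web/nginx are made up, it uses the current apps/v1 API, and it assumes a default StorageClass (or pre-created matching PVs) is available for the claims.
apiVersion: v1
kind: Service
metadata:
  name: web                    # headless Service that gives each Pod a stable DNS name
spec:
  clusterIP: None              # "None" is what makes the Service headless
  selector:
    app: web
  ports:
    - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web             # must match the headless Service above
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:        # one PVC per Pod: data-web-0, data-web-1, ...
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi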

Consul cluster example

Create the NFS PVs

# vim nfs_pv.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs1.10.0.0.11
  labels:
    type: nfs
    pv: consul
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: "/docker/nfs1"
    server: 10.0.0.11
    readOnly: false
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs2.10.0.0.11
  labels:
    type: nfs
    pv: consul
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: "/docker/nfs2"
    server: 10.0.0.11
    readOnly: false
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs3.10.0.0.11
  labels:
    type: nfs
    pv: consul
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: "/docker/nfs3"
    server: 10.0.0.11
    readOnly: false
# kubectl apply -f nfs_pv.yml
persistentvolume/nfs1.10.0.0.11 created
persistentvolume/nfs2.10.0.0.11 created
persistentvolume/nfs3.10.0.0.11 created
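Before creating any claims, you can check that the three volumes are registered and show an Available status (the label selector reuses the pv=consul label defined above):
# kubectl get pv -l pv=consul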

consul_svc.yml

# vim consul_svc.yml
apiVersion: v1
kind: Service
metadata:  
  name: consul  
  namespace: keyida
  labels:    
    name: consul
spec:
  type: ClusterIP
  clusterIP: None              # headless: no Cluster IP, enables the consul-0.consul... per-Pod DNS records
  ports:    
    - name: http      
      port: 8500      
      targetPort: 8500    
    - name: https      
      port: 8443      
      targetPort: 8443    
    - name: rpc      
      port: 8400      
      targetPort: 8400    
    - name: serflan-tcp      
      protocol: "TCP"      
      port: 8301      
      targetPort: 8301    
    - name: serflan-udp      
      protocol: "UDP"      
      port: 8301      
      targetPort: 8301    
    - name: serfwan-tcp      
      protocol: "TCP"      
      port: 8302      
      targetPort: 8302    
    - name: serfwan-udp      
      protocol: "UDP"      
      port: 8302      
      targetPort: 8302    
    - name: server      
      port: 8300      
      targetPort: 8300    
    - name: consuldns      
      port: 8600      
      targetPort: 8600  
  selector:    
    app: consul
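Once this Service is applied (done below with kubectl apply -f .), kubectl should report None in the CLUSTER-IP column, confirming that it is headless:
# kubectl get svc consul -n keyida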

consul.yml

# vim consul.yml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: consul
  namespace: keyida
spec:
  serviceName: consul
  replicas: 3
  template:
    metadata:
      labels:
        app: consul
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - consul
              topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 10
      containers:
      - name: consul
        image: 10.0.0.11:81/library/consul:1.5.3
        imagePullPolicy: IfNotPresent
        args:
             - "agent"                  
             - "-server"                
             - "-bootstrap-expect=3"    # 组成集群需要的数量
             - "-ui"
             - "-data-dir=/consul/data" # consul持久化数据存储位置
             - "-bind=0.0.0.0"
             - "-client=0.0.0.0"
             - "-advertise=$(PODIP)"
             - "-retry-join=consul-0.consul.$(NAMESPACE).svc.cluster.local"   # dns规则
             - "-retry-join=consul-1.consul.$(NAMESPACE).svc.cluster.local"
             - "-retry-join=consul-2.consul.$(NAMESPACE).svc.cluster.local"
             - "-domain=cluster.local"
             - "-disable-host-node-id"
        volumeMounts:
            - name: consul               # the PVC created from volumeClaimTemplates below; holds /consul/data
              mountPath: /consul/data
            - name: host-time            # hostPath volume defined below; keeps container time in sync with the node
              mountPath: /etc/localtime
        env:
            - name: PODIP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
        ports:
            - containerPort: 8500
              name: ui-port
            - containerPort: 8400
              name: alt-port
            - containerPort: 53
              name: udp-port
            - containerPort: 8443
              name: https-port
            - containerPort: 8080
              name: http-port
            - containerPort: 8301
              name: serflan
            - containerPort: 8302
              name: serfwan
            - containerPort: 8600
              name: consuldns
            - containerPort: 8300
              name: server
      volumes:
      - name: host-time
        hostPath:
          path: /etc/localtime
  volumeClaimTemplates:
  - metadata:
      name: consul
      namespace: keyida
    spec:
      selector:
        matchLabels:
          pv: consul
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 2Gi
-bootstrap-expect is the minimum number of server nodes required to form the cluster; in production this is usually 3.
-retry-join takes DNS names. Because the Pod name prefix is consul and there are three replicas, the Pods are named consul-0, consul-1 and consul-2. Even if a node fails, the Pod that comes back keeps the same fixed name, so whatever IP the new Pod gets, the DNS name still resolves to it correctly.
-data-dir is where consul persists its data. Server nodes must persist their data, and this data-dir is where it is stored.
volumeClaimTemplates automatically creates a separate PVC for each Pod, so every Pod stores its own data on its own volume.
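To sanity-check the DNS names used by -retry-join, you can try resolving one of them from inside a running Pod. This assumes the consul image ships a shell with nslookup (the official Alpine-based image does); any Pod in the cluster with DNS tools would work as well:
# kubectl exec -n keyida consul-0 -- nslookup consul-1.consul.keyida.svc.cluster.local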

consul_ui.yml

# vim consul_ui.yml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: consul-ui
  namespace: keyida
  annotations:
    kubernetes.io/ingress.class: traefik
spec:
  rules:
  - host: consul.ui.com    
    http:
      paths:
      - backend:
          serviceName: consul 
          servicePort: 8500
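Once the Ingress is applied, the Consul UI should be reachable through Traefik for requests carrying the Host header consul.ui.com. A quick check that avoids editing /etc/hosts (the address below is a placeholder for your Traefik entry point):
# curl -H "Host: consul.ui.com" http://<traefik-ingress-ip>/ui/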

Create the cluster

# kubectl apply -f .
statefulset.apps/consul created
service/consul created
ingress.extensions/consul-ui created
Check the status
# kubectl get pod -n keyida
NAME       READY   STATUS    RESTARTS   AGE
consul-0   1/1     Running   0          3m42s
consul-1   1/1     Running   0          3m33s
consul-2   1/1     Running   0          3m21s
# kubectl get pvc -n keyida
NAME              STATUS   VOLUME           CAPACITY   ACCESS MODES   STORAGECLASS   AGE
consul-consul-0   Bound    nfs1.10.0.0.11   5Gi        RWX                           3m51s 
consul-consul-1   Bound    nfs2.10.0.0.11   5Gi        RWX                           3m39s
consul-consul-2   Bound    nfs3.10.0.0.11   5Gi        RWX                           3m27s
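You can also ask consul itself whether the three servers have found each other and elected a leader; consul members and consul operator raft list-peers are standard consul CLI commands available inside the container:
# kubectl exec -n keyida consul-0 -- consul members
# kubectl exec -n keyida consul-0 -- consul operator raft list-peers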

Node failure

If a node in the cluster fails but the surviving nodes still meet the quorum (this three-node consul cluster needs at least two surviving server nodes for arbitration), the cluster keeps working. Because the StatefulSet specifies 3 replicas, Kubernetes guarantees that 3 replicas are running: as soon as it sees fewer running replicas than specified, it starts new Pods according to the scheduling policy until the running count matches the spec again. Since the consul deployment uses the retry-join parameter, a newly started node automatically retries joining the cluster (the traditional approach joins by IP). The same applies if the failed node was the leader: as long as the number of surviving nodes still meets the quorum (quorum here means the minimum number of consul server nodes needed to form a cluster, not the StatefulSet replica count), consul elects a new leader according to its own policy.
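A quick way to observe this behaviour is to delete one Pod and watch the StatefulSet recreate it with the same name (and re-attach the same PVC), after which it rejoins the cluster via -retry-join:
# kubectl delete pod consul-1 -n keyida
# kubectl get pod -n keyida -w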