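We have a PHP application running on Kubernetes, where each pod consists of two dedicated containers: NGINX and PHP-FPM.

When scaling the application down, we ran into 502 errors: when a pod is terminating, the containers inside it fail to close their existing connections correctly.

In this post, let's take a closer look at the pod termination process in general, and at the NGINX and PHP-FPM containers in particular.

The tests in this post were run on AWS Elastic Kubernetes Service (EKS) using the Yandex.Tank load testing tool.

The Ingress is created with the AWS ALB Ingress Controller, which automatically provisions an AWS Application Load Balancer.

Docker is used as the container runtime on the Kubernetes worker nodes.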

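Pod lifecycle: pod termination

First, let's look at how a pod is terminated.

A pod is essentially a set of processes running on a Kubernetes worker node, and those processes are controlled by standard IPC (Inter-Process Communication) signals.

To give a pod the chance to finish its work cleanly, the container runtime first sends a SIGTERM signal (graceful termination) to the PID 1 process in each container (see docker stop). At the same time, the cluster starts a countdown, and once the grace period expires, a SIGKILL signal is sent to kill the pod outright.

The SIGTERM signal can be overridden in the container image with the STOPSIGNAL instruction.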

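The complete pod deletion flow is as follows (quoted from the official documentation):

1. A user triggers the deletion with kubectl delete or kubectl scale deployment, and the cluster starts the grace period countdown (30 seconds by default).
2. The API server updates the pod's status from Running to Terminating (see Container states). The kubelet on the worker node where the pod runs sees the status change and starts the termination process:
   a. If any container in the pod defines a preStop hook, the kubelet runs it. If the hook is still running when the 30-second grace period expires, the grace period is extended by 2 seconds. The grace period itself can be set with terminationGracePeriodSeconds.
   b. When the preStop hook has finished, the kubelet tells the Docker runtime to stop all containers in the pod. The Docker daemon sends SIGTERM to the PID 1 process in each container. The containers receive the signal in random order.
3. At the same time as the graceful shutdown begins, kube-controller-manager removes the pod from the endpoints (see Kubernetes – Endpoints), and the Service stops sending traffic to this pod.
4. When the grace period expires, the kubelet forces the containers to stop: Docker sends SIGKILL to all processes in all containers of the pod. The processes get no chance to finish their work and are terminated immediately.
5. The kubelet asks the API server to delete the pod.
6. The API server deletes the pod's record from etcd.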
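
The timing rules above can be condensed into a small shell sketch. It is only a simplified model, not what the kubelet actually runs: the function name termination_timeline and its two arguments (grace period and preStop duration, in seconds) are made up for illustration, and it assumes that a hook which outlives the grace period is cut off at the deadline.

```shell
#!/bin/sh
# Simplified model of the termination timeline described above.
# Hypothetical helper: termination_timeline GRACE_SECONDS PRESTOP_SECONDS
# Prints the offsets (seconds after the delete request) at which
# SIGTERM and SIGKILL would be sent.
termination_timeline() {
  grace=$1
  prestop=$2
  if [ "$prestop" -lt "$grace" ]; then
    # The hook finishes in time: SIGTERM follows it, SIGKILL comes at the deadline.
    sigterm_at=$prestop
    sigkill_at=$grace
  else
    # The hook is still running at the deadline: the grace period is
    # extended once by 2 seconds before SIGKILL.
    sigterm_at=$grace
    sigkill_at=$((grace + 2))
  fi
  echo "sigterm_at=${sigterm_at}s sigkill_at=${sigkill_at}s"
}

termination_timeline 30 0    # no preStop hook:        sigterm_at=0s sigkill_at=30s
termination_timeline 30 5    # preStop sleeps for 5 s: sigterm_at=5s sigkill_at=30s
termination_timeline 30 60   # hook outlives the grace period: sigterm_at=30s sigkill_at=32s
```

The second call shows why a preStop hook matters later in this post: every second the hook runs is a second less for the container to shut down gracefully before SIGKILL.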

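There are two problems here:

1. NGINX and PHP-FPM treat SIGTERM as a request for a fast shutdown: they terminate immediately and close the connections they hold instead of finishing the requests in progress (see Controlling nginx and php-fpm(8) - Linux man page).
2. Steps 2 and 3, sending SIGTERM and removing the endpoint, happen at the same time. In practice, the Ingress Controller may not pick up the endpoint change that quickly, so the Ingress can keep sending traffic to a pod that is already being killed, which produces 502 errors.

For example, if a connection request reaches NGINX while its master process is performing a fast shutdown, NGINX simply drops it and the client receives a 502 error; see Avoiding dropped connections in nginx containers with “STOPSIGNAL SIGQUIT”.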

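NGINX `STOPSIGNAL` and 502

Now that we have the general picture, let's reproduce the first problem.

The example below is based on the article mentioned above and is deployed to a Kubernetes cluster.

Prepare a Dockerfile: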

FROM nginx

RUN echo 'server {\n\
    listen 80 default_server;\n\
    location / {\n\
      proxy_pass      http://httpbin.org/delay/10;\n\
    }\n\
}' > /etc/nginx/conf.d/default.conf

CMD ["nginx", "-g", "daemon off;"]

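Here NGINX proxies every request to http://httpbin.org/delay/10, which responds after a 10-second delay, to imitate a PHP backend.

Build the image and push it to the registry:

$ docker build -t setevoy/nginx-sigterm .
$ docker push setevoy/nginx-sigterm

Now deploy this image as a Deployment with 10 replicas.

The manifest below also contains the Namespace, Service, and Ingress. It is not repeated in the later tests; only the parts that change are shown.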

---
apiVersion: v1
kind: Namespace
metadata:
  name: test-namespace
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-deployment
  namespace: test-namespace
  labels:
    app: test
spec:
  replicas: 10
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
      - name: web
        image: setevoy/nginx-sigterm
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        readinessProbe:
          tcpSocket:
            port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: test-svc
  namespace: test-namespace
spec:
  type: NodePort
  selector:
    app: test
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
---
# Note: the extensions/v1beta1 Ingress API was removed in Kubernetes 1.22;
# newer clusters need networking.k8s.io/v1, which uses a different backend syntax.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test-ingress
  namespace: test-namespace
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}]'
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: test-svc
          servicePort: 80

Deploy it:

$ kubectl apply -f test-deployment.yaml
namespace/test-namespace created
deployment.apps/test-deployment created
service/test-svc created
ingress.extensions/test-ingress created

Check the Ingress:

$ curl -I aadca942-testnamespace-tes-5874-698012771.us-east-2.elb.amazonaws.com
HTTP/1.1 200 OK

Ten pods are now running:

$ kubectl -n test-namespace get pod
NAME                              READY   STATUS    RESTARTS   AGE
test-deployment-ccb7ff8b6-2d6gn   1/1     Running   0          26s
test-deployment-ccb7ff8b6-4scxc   1/1     Running   0          35s
test-deployment-ccb7ff8b6-8b2cj   1/1     Running   0          35s
test-deployment-ccb7ff8b6-bvzgz   1/1     Running   0          35s
test-deployment-ccb7ff8b6-db6jj   1/1     Running   0          35s
test-deployment-ccb7ff8b6-h9zsm   1/1     Running   0          20s
test-deployment-ccb7ff8b6-n5rhz   1/1     Running   0          23s
test-deployment-ccb7ff8b6-smpjd   1/1     Running   0          23s
test-deployment-ccb7ff8b6-x5dc2   1/1     Running   0          35s
test-deployment-ccb7ff8b6-zlqxs   1/1     Running   0          25s

Prepare load.yaml for Yandex.Tank:

phantom:
  address: aadca942-testnamespace-tes-5874-698012771.us-east-2.elb.amazonaws.com
  header_http: "1.1"
  headers:
    - "[Host: aadca942-testnamespace-tes-5874-698012771.us-east-2.elb.amazonaws.com]"
  uris:
    - /
  load_profile:
    load_type: rps
    schedule: const(100,30m)
  ssl: false
console:
  enabled: true
telegraf:
  enabled: false
  package: yandextank.plugins.Telegraf
  config: monitoring.xml

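With the schedule const(100,30m), the tank sends a constant 100 requests per second for 30 minutes to the pods behind the Ingress.

Start the test. So far everything is fine.

Now scale the Deployment down to a single replica: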

$ kubectl -n test-namespace scale deploy test-deployment --replicas=1
deployment.apps/test-deployment scaled

The pods switch to the Terminating status:

$ kubectl -n test-namespace get pod
NAME                              READY   STATUS        RESTARTS   AGE
test-deployment-647ddf455-67gv8   1/1     Terminating   0          4m15s
test-deployment-647ddf455-6wmcq   1/1     Terminating   0          4m15s
test-deployment-647ddf455-cjvj6   1/1     Terminating   0          4m15s
test-deployment-647ddf455-dh7pc   1/1     Terminating   0          4m15s
test-deployment-647ddf455-dvh7g   1/1     Terminating   0          4m15s
test-deployment-647ddf455-gpwc6   1/1     Terminating   0          4m15s
test-deployment-647ddf455-nbgkn   1/1     Terminating   0          4m15s
test-deployment-647ddf455-tm27p   1/1     Running       0          26m
...

At this point we start getting 502 errors.
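
If the Yandex.Tank console is not at hand, the same effect can be observed by counting HTTP status codes from a plain curl loop during the scale-down. The sketch below is one way to do that; tally_codes is a made-up helper name, the sample data is invented, and ALB_HOST in the comment is a placeholder for the load balancer hostname.

```shell
#!/bin/sh
# Hypothetical helper: reads one HTTP status code per line on stdin and
# prints "code=count" for each distinct code, sorted by code.
tally_codes() {
  sort | uniq -c | awk '{ print $2 "=" $1 }'
}

# Offline demonstration with made-up sample data:
printf '200\n200\n502\n200\n502\n' | tally_codes
# 200=3
# 502=2

# Against a live load balancer (ALB_HOST is a placeholder), something like:
#   for i in $(seq 1 60); do
#     curl -s -o /dev/null -w '%{http_code}\n' "http://$ALB_HOST/"
#   done | tally_codes
```

Any non-zero count for 502 while pods are in the Terminating status points at the problem described here.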

Now let's update the Dockerfile and add STOPSIGNAL SIGQUIT:

FROM nginx

RUN echo 'server {\n\
    listen 80 default_server;\n\
    location / {\n\
      proxy_pass      http://httpbin.org/delay/10;\n\
    }\n\
}' > /etc/nginx/conf.d/default.conf

STOPSIGNAL SIGQUIT

CMD ["nginx", "-g", "daemon off;"]
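
To double-check which signal docker stop will deliver, it can help to look for a STOPSIGNAL line in a Dockerfile and fall back to SIGTERM, Docker's default, when there is none. The helper below is a deliberately naive sketch (dockerfile_stopsignal is a made-up name, and it only inspects the first word of each line):

```shell
#!/bin/sh
# Hypothetical helper: prints the stop signal declared by a Dockerfile read
# from stdin, or SIGTERM (Docker's default) when there is no STOPSIGNAL line.
# If STOPSIGNAL appears more than once, the last one is reported.
dockerfile_stopsignal() {
  awk 'toupper($1) == "STOPSIGNAL" { sig = $2 }
       END { if (sig == "") print "SIGTERM"; else print sig }'
}

# The first Dockerfile from this post (no STOPSIGNAL):
printf 'FROM nginx\nCMD ["nginx", "-g", "daemon off;"]\n' | dockerfile_stopsignal
# SIGTERM

# The updated Dockerfile:
printf 'FROM nginx\nSTOPSIGNAL SIGQUIT\nCMD ["nginx", "-g", "daemon off;"]\n' | dockerfile_stopsignal
# SIGQUIT
```

For an image that is already built, docker inspect --format '{{.Config.StopSignal}}' followed by the image name should show the configured signal.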

Build and push the image:

$ docker build -t setevoy/nginx-sigquit .
$ docker push setevoy/nginx-sigquit

Update the image in the Deployment:

...
    spec:
      containers:
      - name: web
        image: setevoy/nginx-sigquit
        ports:
        - containerPort: 80
...

Redeploy and run the test again.

Scale the Deployment down once more:

$ kubectl -n test-namespace scale deploy test-deployment --replicas=1
deployment.apps/test-deployment scaled

This time there are no errors.

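Traffic, `preStop`, and `sleep`

However, if we repeat the test several times, we still get 502 errors now and then.

This time we are most likely hitting the second problem: the endpoints update and the stop signal are triggered at the same time.

Let's add a preStop hook that runs sleep. After the cluster receives the request to stop the pod, the kubelet then waits 5 seconds before the stop signal is sent to the container, which leaves some headroom for the endpoints and the Ingress to be updated.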

...
    spec:
      containers:
      - name: web
        image: setevoy/nginx-sigquit
        ports:
        - containerPort: 80
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sleep","5"]
...

Run the test again: this time there are no more errors.
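
Keep in mind that the grace period countdown starts when the deletion is requested, so the time spent in the preStop sleep comes out of the same 30 seconds. A rough sanity check, assuming the slowest request takes about as long as the 10-second httpbin delay used in this test (the helper name grace_budget_ok is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: grace_budget_ok GRACE PRESTOP_SLEEP LONGEST_REQUEST
# Succeeds when the preStop sleep plus the slowest request still fit
# inside the grace period (all values in seconds).
grace_budget_ok() {
  grace=$1
  prestop_sleep=$2
  longest_request=$3
  [ $((prestop_sleep + longest_request)) -le "$grace" ]
}

# Values from this test: 30 s default grace period, 5 s sleep, 10 s backend delay.
if grace_budget_ok 30 5 10; then
  echo "fits: graceful shutdown can finish before SIGKILL"
else
  echo "too tight: raise terminationGracePeriodSeconds"
fi
```

With a longer sleep or slower requests, the sum can exceed the default, in which case terminationGracePeriodSeconds in the pod spec needs to be raised accordingly.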

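Our PHP-FPM container was not affected by this issue, because its original image already sets STOPSIGNAL SIGQUIT.

Other possible solutions

Of course, I tried a few other approaches while debugging.

See the references at the end of this post for details; here I only describe them briefly.

`preStop` and `nginx -s quit`

One of them is to send the QUIT signal to NGINX from the preStop hook: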

lifecycle:
  preStop:
    exec:
      command:
      - /usr/sbin/nginx
      - -s
      - quit

Or:

...
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              # send SIGQUIT to PID 1 (the NGINX master process) with the shell's built-in kill
              - kill -s QUIT 1
...

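Neither helped. The idea looks sound (send QUIT to the NGINX process so that it shuts down gracefully before the kubelet/Docker sends TERM), but for some reason it did not work. A plausible cause is that the hook returns as soon as the signal has been sent, so the stop signal follows immediately while requests are still in flight.

You can use strace to check whether NGINX really receives the QUIT signal.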

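NGINX + PHP-FPM, `supervisord`, and `stopsignal`

Our application runs two containers in one pod, but I also tried running NGINX and PHP-FPM in a single container, for example with trafex/alpine-nginx-php7.

I used that image and set the stopsignal option to QUIT for both NGINX and PHP-FPM in supervisor.conf. The idea looks right, but it did not work either.

Feel free to try it yourself.

PHP-FPM and `process_control_timeout`

Graceful shutdown in Kubernetes is not always trivial and the Stack Overflow question Nginx / PHP FPM graceful stop (SIGQUIT): not so graceful both mention that the FPM master process being killed before its child processes can also cause 502 errors.

That is not the issue discussed here, but process_control_timeout is worth a look.

NGINX, HTTP, and keep-alive sessions

It is also a good idea to add the [Connection: close] header, so that the client closes the connection as soon as a request completes, which reduces the number of 502 errors.

Still, this cannot fully prevent the errors caused by NGINX receiving SIGTERM while it is processing a request.

See HTTP persistent connection.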

References

Graceful shutdown in Kubernetes is not always trivial (translation on Habr)
Gracefully Shutting Down Pods in a Kubernetes Cluster: the nginx -s quit in preStop solution, plus a good description of the issue with traffic being sent to terminated pods
Kubernetes best practices: terminating with grace
Termination of Pods
Kubernetes’ dirty endpoint secret and Ingress
Avoiding dropped connections in nginx containers with “STOPSIGNAL SIGQUIT”: this is where I found our solution, plus an idea of how to reproduce the issue

Originally published at RTFM: Linux, DevOps and system administration.
