Nginx Reverse Proxy for the RabbitMQ Message Queue

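For HTTP, everyone knows nginx can act as a reverse proxy. But what about stateful, long-lived connections such as RabbitMQ's? It can be configured, but the main concern is how timeouts affect those long-lived connections.

Let's first look at the configuration: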

$ cat  nginx.conf
#user  nobody;
worker_processes  1;

#error_log  /var/logs/error.log;
#error_log  /var/logs/error.log  notice;
error_log  /var/logs/error.log  info;

pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

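stream {
    # RabbitMQ backend; xxxxxx is a placeholder for the broker host
    upstream rabbitmq {
        server xxxxxx:5672;
    }
    server {
        listen 5672;
        # max time allowed to establish the TCP connection to the upstream
        proxy_connect_timeout 1s;
        # max idle time between two successive reads or writes on either side
        proxy_timeout 3s;
        proxy_pass rabbitmq;
    }
}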

As you can see, this uses a block called stream, which nginx added (in version 1.9.0) for TCP proxying. http://nglua.com/reads/15.html has an explanation and a source-code analysis of it:

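nginx is widely recognized for its excellent performance as an HTTP server, but it is not only an HTTP server; it is also a mail proxy server. Now a new member, TCP, has joined the family. In the wording of the official nginx site it is actually called stream. My guess as to why it is presented to everyone as TCP is as follows:
a: It meets a proxying need, and that need is TCP proxying, yet in the source code it is named stream, and "tcp" is practically nowhere to be seen.
b: There is indeed a third-party module (ngx_tcp_proxy_module) that does the same thing. That fine module appeared quite a bit earlier; anyone interested can find it on GitHub.
c: "tcp" rolls off the tongue more easily than "stream". That may sound trivial, but that is how it is.
Why emphasize stream? stream sits at the same level as http. http has its own proxy feature, and the core feature of stream is currently proxying as well. But stream is expected to become richer and more powerful, so it bears repeating that nginx has this thing called stream, because that is what provides TCP proxying.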

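https://www.zybuluo.com/orangl… also has some related notes.

Our configuration involves two timeout-related directives, proxy_connect_timeout and proxy_timeout. https://my.oschina.net/xsh1208… says proxy_connect_timeout cannot exceed 75 seconds, although neither of the two official documents, https://nginx.org/en/docs/stre… and https://nginx.org/en/docs/stre…, mentions such a limit.

I first wrote a Python RabbitMQ client and tried reading and writing data. It connected fine every time, but testing showed that once the connection sat idle for more than about a second, it was dropped. Note that the idle cutoff is governed by proxy_timeout (3s in the config above); proxy_connect_timeout only bounds the initial connection to the upstream. The consumer output looked like this: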

 [x] Received 'Hello World!'
 [x] Received 'Hello World!'
 [x] Received 'Hello World!'
 [x] Received 'Hello World!'
 [x] Received 'Hello World!'
 [x] Received 'Hello World!'
 [x] Received 'Hello World!'
Traceback (most recent call last):
  File "mq-receive.py", line 19, in <module>
    channel.start_consuming()
  File "/usr/lib/python2.6/site-packages/pika-0.10.0-py2.6.egg/pika/adapters/blocking_connection.py", line 1681, in start_consuming
    self.connection.process_data_events(time_limit=None)
  File "/usr/lib/python2.6/site-packages/pika-0.10.0-py2.6.egg/pika/adapters/blocking_connection.py", line 647, in process_data_events
    self._flush_output(common_terminator)
  File "/usr/lib/python2.6/site-packages/pika-0.10.0-py2.6.egg/pika/adapters/blocking_connection.py", line 426, in _flush_output
    raise exceptions.ConnectionClosed()
pika.exceptions.ConnectionClosed
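
To see the idle cutoff without involving pika at all, a raw TCP probe can help. The following is only a minimal sketch using the Python standard library: the function name idle_lifetime and its parameters are made up for illustration, and it assumes nothing else on the path closes a silent connection first (a broker may itself drop a connection that never starts the AMQP handshake, so long measurements should be read with caution).

```python
import socket
import time


def idle_lifetime(host, port, max_wait_s=30.0):
    """Open a TCP connection, send nothing, and return the number of seconds
    until the peer closes it, or None if it is still open after max_wait_s."""
    sock = socket.create_connection((host, port), timeout=5.0)
    start = time.time()
    try:
        while True:
            remaining = max_wait_s - (time.time() - start)
            if remaining <= 0:
                return None  # still open when we gave up
            sock.settimeout(remaining)
            try:
                data = sock.recv(4096)
            except socket.timeout:
                return None  # nothing happened within max_wait_s
            except socket.error:
                return time.time() - start  # a reset also counts as closed
            if not data:
                return time.time() - start  # orderly close (EOF) from the peer
            # Any bytes received are ignored; keep waiting for the close.
    finally:
        sock.close()
```

Pointed at the proxy's listen port (for example idle_lifetime("127.0.0.1", 5672), where the address is an assumption), the result would be expected to land near the configured proxy_timeout when nginx is the one cutting the connection.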

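So I considered adjusting nginx's timeout settings. The crude fix is of course to never time out, but timeouts at the access layer should not be that lax, and the official defaults for these directives are themselves on the order of minutes. Going with the defaults raises another question, though: does RabbitMQ itself have a heartbeat? If it does not, a connection is likely to be killed whenever its queue goes quiet.

https://www.rabbitmq.com/heart… has the answer: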

Detecting Dead TCP Connections with Heartbeats

Introduction

Network can fail in many ways, sometimes pretty subtle (e.g. high ratio packet loss). Disrupted TCP connections take a moderately long time (about 11 minutes with default configuration on Linux, for example) to be detected by the operating system. AMQP 0-9-1 offers a heartbeat feature to ensure that the application layer promptly finds out about disrupted connections (and also completely unresponsive peers). Heartbeats also defend against certain network equipment which may terminate “idle” TCP connections.

Heartbeat Timeout Interval

The heartbeat timeout value defines after what period of time the peer TCP connection should be considered dead by RabbitMQ and client libraries. This value is negotiated between the client and RabbitMQ server at the time of connection. The client must be configured to request heartbeats. In RabbitMQ versions 3.0 and higher, the broker will attempt to negotiate heartbeats by default (although the client can still veto them). The timeout is in seconds, and default value is 60 (580 prior to release 3.5.5).

Heartbeat frames are sent about every timeout / 2 seconds. After two missed heartbeats, the peer is considered to be unreachable. Different clients manifest this differently but the TCP connection will be closed. When a client detects that RabbitMQ node is unreachable due to a heartbeat, it needs to re-connect.

Heartbeats can be disabled by setting the timeout interval to 0.

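RabbitMQ officially negotiates heartbeats by default from version 3.0 onward. We are running 3.4.1, so heartbeats can be assumed to be available. Since 3.4.1 predates 3.5.5, the default timeout is 580 seconds, which means a heartbeat frame goes out roughly every 290 seconds. That is comfortably inside nginx's default proxy_timeout of 10 minutes, provided the client does not veto heartbeats.

So the nginx timeout configuration can now be changed as follows: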
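
The arithmetic behind that conclusion can be written down as a small check. This is only a sketch of the rule quoted above (frames roughly every timeout / 2 seconds, 0 disables heartbeats); the function names are hypothetical, and it assumes the proxy resets its idle timer whenever a heartbeat frame passes through.

```python
def heartbeat_interval(heartbeat_timeout_s):
    """Approximate seconds between heartbeat frames (timeout / 2),
    or None when heartbeats are disabled (timeout of 0)."""
    if heartbeat_timeout_s <= 0:
        return None
    return heartbeat_timeout_s / 2.0


def survives_proxy_idle(heartbeat_timeout_s, proxy_timeout_s):
    """True if heartbeat frames arrive often enough that the proxy's idle
    timeout should never fire on an otherwise quiet connection."""
    interval = heartbeat_interval(heartbeat_timeout_s)
    return interval is not None and interval < proxy_timeout_s


if __name__ == "__main__":
    # (negotiated heartbeat timeout, proxy idle timeout), both in seconds
    for hb, proxy in [(580, 3), (580, 600), (60, 600), (0, 600)]:
        print(hb, proxy, survives_proxy_idle(hb, proxy))
```

With the original proxy_timeout of 3s no realistic heartbeat setting helps, which matches the disconnects seen earlier; with a 600 second idle timeout, both the old default of 580 and the newer default of 60 pass the check.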

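stream {
    upstream rabbitmq {
        server xxxx:5672;
    }
    server {
        listen 5672;
        # only bounds connection setup to the upstream, not idle time
        proxy_connect_timeout 600s;
        # commented out so the idle timeout falls back to the nginx default (10m)
        #proxy_timeout 3s;
        proxy_pass rabbitmq;
    }
}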

Then run the simulated sender and receiver again; the connection should now stay up over the long term.
