EMQx Production Cluster Deployment
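Note: the older 2.x releases were named emqtt and the newer 4.x releases are named emqx. The main configuration has changed little, so the configuration in this article uses emq as the name throughout.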
1. Host-based cluster deployment
1.1 Download and install
# Note: pick the binary package that matches your OS version
# https://www.emqx.com/zh/downloads?product=broker
sudo mkdir -p /opt/apps && cd /opt/apps
sudo curl -OL https://www.emqx.com/zh/downloads/broker/2.3.11/emqttd-centos7-v2.3.11.zip
sudo unzip emqttd-*.zip
sudo ln -snf /opt/apps/emqttd-* /usr/lib/emq-current
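1.2 Basic configuration
- Configure external log and data directories
sudo mkdir -p /mnt/disk1/emq/data /mnt/disk1/log/emq
sudo cp -r /usr/lib/emq-current/data/* /mnt/disk1/emq/data/ # [Important] Erlang/OTP database files; without them the broker may fail to start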
- Change RUNNER_LOG_DIR=$RUNNER_ROOT_DIR/logs to RUNNER_LOG_DIR=/mnt/disk1/log/emq
- Change RUNNER_DATA_DIR=$RUNNER_ROOT_DIR/data to RUNNER_DATA_DIR=/mnt/disk1/emq/data
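The two substitutions above can be scripted. The following is a minimal sketch, assuming GNU sed and a file in which both assignments start at the beginning of a line. The function name set_runner_dirs is hypothetical, and the file path is passed in by the caller because the text does not name the file. The demo runs against a scratch file only.

```shell
# Sketch: point RUNNER_LOG_DIR / RUNNER_DATA_DIR at external directories.
# Assumption: the target file holds both assignments at the start of a line.
set_runner_dirs() (
  file="$1"; log_dir="$2"; data_dir="$3"
  # GNU sed in-place edit; '|' is the delimiter because the values contain '/'
  sed -i \
    -e "s|^RUNNER_LOG_DIR=.*|RUNNER_LOG_DIR=${log_dir}|" \
    -e "s|^RUNNER_DATA_DIR=.*|RUNNER_DATA_DIR=${data_dir}|" \
    "$file"
)

# Demo on a scratch file, so nothing under the install directory is touched
demo_file="$(mktemp)"
printf '%s\n' 'RUNNER_LOG_DIR=$RUNNER_ROOT_DIR/logs' \
              'RUNNER_DATA_DIR=$RUNNER_ROOT_DIR/data' > "$demo_file"
set_runner_dirs "$demo_file" /mnt/disk1/log/emq /mnt/disk1/emq/data
cat "$demo_file"
```

On a real node the first argument would be whichever script defines these variables in your installation.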
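1.3 Runtime configuration
- Configure the cluster strategy, node name, log directories, and so on in $EMQ_HOME/etc/emq.conf, using node 1 as an example (EMQ FQDN node):
## Cluster settings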
cluster.name = emqcl
## Default: manual
cluster.discovery = static # [Important] the default is manual mode
## Node list of the cluster.
cluster.static.seeds = emq@10.111.0.111,emq@10.111.0.112 # IP addresses are used here for simplicity; FQDNs also work
## Default: emqx@127.0.0.1
node.name = emq@10.111.0.111 # [Important]
## Logging settings
## Crash dump log file.
node.crash_dump = /mnt/disk1/log/emq/crash.dump
## Sets the log dir.
log.dir = /mnt/disk1/log/emq
## The file where error logs will be written to.
log.error.file = /mnt/disk1/log/emq/error.log
## The file for crash log.
log.crash.file = /mnt/disk1/log/emq/crash.log
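Only node.name differs between nodes; the cluster name, discovery mode, and seed list are the same everywhere. The sketch below prints the cluster-related lines for one node, using the keys and values from the example above. The helper name render_cluster_conf is hypothetical, and the output is meant to be reviewed and merged into emq.conf by hand.

```shell
# render_cluster_conf SELF_IP SEED_IP...
# Prints the per-node cluster settings; node names follow the emq@<ip> pattern
# used in this article.
render_cluster_conf() {
  self_ip="$1"; shift
  seeds=""
  for ip in "$@"; do
    # Join seeds with commas, without a leading comma
    seeds="${seeds:+${seeds},}emq@${ip}"
  done
  printf 'cluster.name = emqcl\n'
  printf 'cluster.discovery = static\n'
  printf 'cluster.static.seeds = %s\n' "$seeds"
  printf 'node.name = emq@%s\n' "$self_ip"
}

# Settings for node 2 of the two-node example
render_cluster_conf 10.111.0.112 10.111.0.111 10.111.0.112
```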
1.4 Service configuration
Configure emq.service
sudo tee /etc/systemd/system/emq.service >/dev/null <<-'EOF'
#!/bin/bash
# Copyright (c) 2017 ~ 2025, the original author wangl.sir individual Inc,
# All rights reserved. Contact us wanglsir<wangl@gmail.com, 983708408@qq.com>
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# @see: https://docs.emqx.cn/broker/v2.0/
# @see: https://blogs.wl4g.com/archives/705
[Unit]
Description=EMQ Server
After=network.target
[Install]
WantedBy=multi-user.target
[Service]
Type=forking
User=root
Group=root
ExecStart=/usr/lib/emq-current/bin/emqttd start
ExecReload=/bin/kill -s HUP $MAINPID
StandardOutput=journal
StandardError=journal
# Do not set these limits above the system maximum (e.g. /etc/security/limits.conf), otherwise systemd
# reports: 'Failed at step LIMITS spawning: Operation not permitted'.
LimitNOFILE=65535
LimitNPROC=65535
LimitCORE=infinity
TimeoutStartSec=10
TimeoutSec=600
Restart=always
PermissionsStartOnly=true
RuntimeDirectoryMode=755
PrivateTmp=false
EOF
systemctl daemon-reload
systemctl enable emq
systemctl restart emq
systemctl status emq
journalctl -afu emq
1.6 Restart the cluster
- Copy the installation and data directories to each of the other nodes, then start the nodes one by one.
ssh collect-node2 "sudo mkdir -p /mnt/disk1/ /opt/apps"
scp -r /mnt/disk1/emq/ collect-node2:/mnt/disk1/
scp -r /opt/apps/emqtt* collect-node2:/opt/apps/
ssh collect-node2 "sudo ln -snf /opt/apps/emqtt* /usr/lib/emq-current"
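With more than one target node, the copy steps can be wrapped in a loop. The sketch below defaults to a dry run that only prints the commands, so it can be reviewed before anything is executed. The helper names run and distribute, the DRY_RUN switch, and the unpacked directory name in EMQ_DIR are all assumptions for illustration, not part of the original setup.

```shell
DRY_RUN="${DRY_RUN:-1}"        # 1 = print commands only, 0 = execute them
EMQ_DIR="/opt/apps/emqttd"     # assumed name of the unpacked directory

# Print the command in dry-run mode, otherwise run it
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# distribute NODE... : copy data and binaries to each node and create the symlink
distribute() {
  for node in "$@"; do
    run ssh "$node" "sudo mkdir -p /mnt/disk1 /opt/apps"
    run scp -r /mnt/disk1/emq "$node:/mnt/disk1/"
    run scp -r "$EMQ_DIR" "$node:/opt/apps/"
    run ssh "$node" "sudo ln -snf $EMQ_DIR /usr/lib/emq-current"
  done
}

distribute collect-node2
```

Setting DRY_RUN=0 executes the same commands. Remember that node.name in emq.conf still has to be changed on each node after copying.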
1.7 Verify the deployment
ss -nlpt | grep beam
LISTEN 0 1024 0.0.0.0:1883 0.0.0.0:* users:(("beam.smp",pid=197060,fd=27))
LISTEN 0 1024 127.0.0.1:18083 0.0.0.0:* users:(("beam.smp",pid=197060,fd=32))
LISTEN 0 512 127.0.0.1:11883 0.0.0.0:* users:(("beam.smp",pid=197060,fd=26))
LISTEN 0 512 127.0.0.1:8081 0.0.0.0:* users:(("beam.smp",pid=197060,fd=31))
LISTEN 0 128 0.0.0.0:4370 0.0.0.0:* users:(("beam.smp",pid=197060,fd=17))
LISTEN 0 1024 0.0.0.0:8883 0.0.0.0:* users:(("beam.smp",pid=197060,fd=29))
LISTEN 0 1024 0.0.0.0:8083 0.0.0.0:* users:(("beam.smp",pid=197060,fd=28))
LISTEN 0 1024 0.0.0.0:8084 0.0.0.0:* users:(("beam.smp",pid=197060,fd=30))
LISTEN 0 5 0.0.0.0:5369 0.0.0.0:* users:(("beam.smp",pid=197060,fd=22))
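- Open http://127.0.0.1:18083/#/ in a browser; the default username/password is admin/public
- If the broker did not start successfully in step 1.6, copy the $EMQ_HOME/data directory to /mnt/disk1/emq/data (this data directory contains items that emq must load at startup)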
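Checking the listener list by eye is error-prone, so the comparison can be scripted. This is a small sketch that reads ss -nlpt output on stdin and prints every expected port that has no LISTEN socket. The function name missing_ports is hypothetical, and the sample input is trimmed from the listing above.

```shell
# missing_ports PORT... : reads `ss -nlpt` output on stdin and prints each
# expected port for which no LISTEN socket was found.
missing_ports() {
  # Column 4 is "Local Address:Port"; keep only the part after the last ':'
  listening="$(awk '$1 == "LISTEN" { n = split($4, a, ":"); print a[n] }')"
  for port in "$@"; do
    printf '%s\n' "$listening" | grep -qx "$port" || printf '%s\n' "$port"
  done
}

# Demo with two lines taken from the listing above
sample='LISTEN 0 1024 0.0.0.0:1883 0.0.0.0:* users:(("beam.smp",pid=197060,fd=27))
LISTEN 0 1024 127.0.0.1:18083 0.0.0.0:* users:(("beam.smp",pid=197060,fd=32))'
printf '%s\n' "$sample" | missing_ports 1883 8883 18083
```

The demo prints 8883 because the trimmed sample has no listener on that port. On a live node the input would come from ss -nlpt | grep beam instead.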
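2. Docker-based cluster deployment
- Choose either this approach or section 1 (Host-based cluster deployment); only one is needed.
2.1 Manual configuration
- Note: in testing with the emqx/emqx:v4.3.10 image, starting the container directly and then checking the logs with docker logs -f emqx1 shows the error Failed to read: "data/loaded_modules", error: enoent. The container apparently does not create these files automatically, most likely because the empty host directory mounted over /opt/emqx/data hides the files shipped in the image. To skip this step, simply drop -v /mnt/disk1/emqx/data/:/opt/emqx/data/ and the error goes away.
- Reference: docs.emqx.com/zh/emqx/v4.3/tutorial/prometheus.html#prometheus-监控告警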
sudo mkdir -p /mnt/disk1/emqx/{etc/plugins,data}
sudo chmod -R 777 /mnt/disk1/emqx
# Modules loaded by default
cat <<-'EOF' >/mnt/disk1/emqx/data/loaded_modules
{emqx_mod_acl_internal, true}.
{emqx_mod_topic_metrics, true}.
{emqx_mod_presence, true}.
{emqx_mod_delayed, false}.
{emqx_mod_rewrite, false}.
{emqx_mod_subscription, false}.
EOF
# Plugins enabled by default
cat <<-'EOF' >/mnt/disk1/emqx/data/loaded_plugins
{emqx_management, true}.
{emqx_dashboard, true}.
{emqx_recon, true}.
{emqx_retainer, true}.
{emqx_telemetry, true}.
{emqx_rule_engine, true}.
{emqx_prometheus,true}.
{emqx_modules, false}.
{emqx_bridge_mqtt, false}.
EOF
# Once the emqx_prometheus plugin above is enabled, it loads this configuration and pushes metrics to the pushgateway automatically
cat <<-'EOF' >/mnt/disk1/emqx/etc/plugins/emqx_statsd.conf
statsd.push.gateway.server=http://127.0.0.1:9091
statsd.interval=15000
EOF
sudo chmod -R 777 /mnt/disk1/emqx
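The loaded_modules and loaded_plugins files are lists of Erlang terms, and a missing trailing period is easy to overlook. The sketch below is a conservative lint that accepts only the one-term-per-line layout used above. It is stricter than Erlang itself, and the function name check_loaded_file is hypothetical.

```shell
# check_loaded_file FILE
# Succeeds when every non-empty line has the form {name, true|false}.
# Offending lines are printed with their line numbers.
check_loaded_file() {
  ! grep -Evn '^[[:space:]]*(\{[a-z_0-9]+,[[:space:]]*(true|false)\}\.)?[[:space:]]*$' "$1"
}

# Demo on scratch files: one well-formed, one with a missing trailing period
good="$(mktemp)"; bad="$(mktemp)"
printf '%s\n' '{emqx_management, true}.' '{emqx_prometheus,true}.' > "$good"
printf '%s\n' '{emqx_management, true}.' '{emqx_recon, true}' > "$bad"
check_loaded_file "$good" && echo "good: ok"
check_loaded_file "$bad" || echo "bad: malformed line(s) listed above"
```

Before starting the container, the same check can be pointed at /mnt/disk1/emqx/data/loaded_modules and /mnt/disk1/emqx/data/loaded_plugins.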
2.2 Start the container
docker run -itd \
--network=host \
--restart=always \
--name=emqx1 \
-e EMQX_NAME=emqx \
-e EMQX_HOST=n1.emqx.io \
-e EMQX_CLUSTER__DISCOVERY=static \
-e EMQX_CLUSTER__STATIC__SEEDS=emqx@n1.emqx.io,emqx@n2.emqx.io \
-e TZ=Europe/Amsterdam \
-v /mnt/disk1/emqx/data/:/opt/emqx/data/ \
-v /mnt/disk1/emqx/etc/plugins/emqx_statsd.conf:/opt/emqx/etc/plugins/emqx_statsd.conf \
emqx/emqx:4.3.10
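- Note 1: the command above uses static cluster discovery (host networking is used for simplicity). n1.emqx.io is the hostname of the Docker host; it must follow FQDN naming conventions and the nodes must be able to reach each other over the network. IP addresses also work. Otherwise cluster discovery fails.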
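The command above starts node 1 only. For the other nodes the container name and EMQX_HOST change while the seed list stays the same. The sketch below prints the command for a given node so it can be reviewed before running. The helper name emqx_run_cmd and the container name emqx2 are assumptions, and the TZ variable is left out here.

```shell
SEEDS="emqx@n1.emqx.io,emqx@n2.emqx.io"

# emqx_run_cmd CONTAINER_NAME NODE_HOST
# Prints (does not execute) the docker run command for one cluster node.
emqx_run_cmd() {
  printf 'docker run -itd --network=host --restart=always --name=%s' "$1"
  printf ' -e EMQX_NAME=emqx -e EMQX_HOST=%s' "$2"
  printf ' -e EMQX_CLUSTER__DISCOVERY=static -e EMQX_CLUSTER__STATIC__SEEDS=%s' "$SEEDS"
  printf ' -v /mnt/disk1/emqx/data/:/opt/emqx/data/'
  printf ' -v /mnt/disk1/emqx/etc/plugins/emqx_statsd.conf:/opt/emqx/etc/plugins/emqx_statsd.conf'
  printf ' emqx/emqx:4.3.10\n'
}

# Command for the second node
emqx_run_cmd emqx2 n2.emqx.io
```

Once both containers are up, running emqx_ctl cluster status inside either container should list both node names, assuming the CLI is on the PATH in the image.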
2.3 Verify startup
- The plugins started by default listen on the ports below. If any port is not listening, inspect the logs with docker logs -f --tail 99 emqx1.
ss -nlpt | grep beam
LISTEN 0 1024 0.0.0.0:1883 0.0.0.0:* users:(("beam.smp",pid=197060,fd=27))
LISTEN 0 1024 127.0.0.1:18083 0.0.0.0:* users:(("beam.smp",pid=197060,fd=32))
LISTEN 0 512 127.0.0.1:11883 0.0.0.0:* users:(("beam.smp",pid=197060,fd=26))
LISTEN 0 512 127.0.0.1:8081 0.0.0.0:* users:(("beam.smp",pid=197060,fd=31))
LISTEN 0 128 0.0.0.0:4370 0.0.0.0:* users:(("beam.smp",pid=197060,fd=17))
LISTEN 0 1024 0.0.0.0:8883 0.0.0.0:* users:(("beam.smp",pid=197060,fd=29))
LISTEN 0 1024 0.0.0.0:8083 0.0.0.0:* users:(("beam.smp",pid=197060,fd=28))
LISTEN 0 1024 0.0.0.0:8084 0.0.0.0:* users:(("beam.smp",pid=197060,fd=30))
LISTEN 0 5 0.0.0.0:5369 0.0.0.0:* users:(("beam.smp",pid=197060,fd=22))
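- Open http://127.0.0.1:18083/#/ in a browser; the default username/password is admin/public
3. Enable client authentication
Clients here means sensors and IoT terminal devices.
3.1 Add accounts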
vim $EMQ_HOME/etc/plugins/emq_auth_username.conf
auth.user.1.username = admin
auth.user.1.password = public
auth.user.2.username = test
auth.user.2.password = test
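With more than a handful of devices, numbering the auth.user.N entries by hand becomes tedious. The sketch below generates them from user:password pairs. The helper name render_auth_users is hypothetical, and the output is meant to be pasted into emq_auth_username.conf.

```shell
# render_auth_users USER:PASSWORD...
# Prints numbered auth.user.N entries; everything after the first ':' is the
# password, so passwords may themselves contain ':'.
render_auth_users() {
  idx=0
  for pair in "$@"; do
    idx=$((idx + 1))
    printf 'auth.user.%d.username = %s\n' "$idx" "${pair%%:*}"
    printf 'auth.user.%d.password = %s\n' "$idx" "${pair#*:}"
  done
}

# Reproduces the four lines shown above
render_auth_users admin:public test:test
```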
3.2 Disable anonymous access (recommended for security)
sed -i 's/mqtt.allow_anonymous = true/mqtt.allow_anonymous = false/g' $EMQ_HOME/etc/emq.conf
- Load the emq_auth_username plugin from the command line; it can also be enabled under Dashboard -> Plugins. Its configuration file is $EMQ_HOME/etc/plugins/emq_auth_username.conf.
$EMQ_HOME/bin/emqx_ctl plugins load emq_auth_username
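4. Deploy Prometheus pushgateway
- The official emqx-prometheus plugin only supports pushing metrics to a pushgateway (see docs.emqx.com/zh/emqx/v4.3/tutorial/prometheus.html), so the pushgateway must be deployed first.
4.1 Host-based build and deployment
- 4.1.1 Build from source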
cd /tmp
git clone https://github.com/prometheus/pushgateway.git
cd pushgateway
git checkout v1.4.2
make
# Copy the binary to the bin directory on the target host
scp pushgateway n1.emqx.io:/bin/
- 4.1.2 Runtime configuration
sudo mkdir -p /etc/sysconfig/
sudo curl -o /etc/sysconfig/pushgateway.conf 'https://raw.githubusercontent.com/wl4g/prometheus-integration/master/prometheus/pushgateway/sysconfig/pushgateway.conf'
- 4.1.3 Configure the service
sudo curl -o /etc/systemd/system/pushgateway.service 'https://raw.githubusercontent.com/wl4g/prometheus-integration/master/prometheus/pushgateway/systemd/pushgateway.service'
systemctl daemon-reload
systemctl enable pushgateway
systemctl restart pushgateway
systemctl status pushgateway
journalctl -afu pushgateway
- 4.1.4 Test and verify
curl localhost:9091/metrics
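The /metrics endpoint answers even when EMQ X has pushed nothing, so it helps to check that broker metrics are actually present. The sketch below counts samples whose name starts with emqx_. That prefix, the function name count_emqx_samples, and the sample payload are assumptions for illustration, not captured output.

```shell
# count_emqx_samples : reads Prometheus text-format metrics on stdin and prints
# how many sample lines start with the emqx_ prefix (comment lines start with
# '#', so they are never counted).
count_emqx_samples() {
  # grep -c exits non-zero when the count is 0; normalise the exit status
  grep -c '^emqx_' || true
}

# Illustrative payload (made up for the demo)
sample='# TYPE emqx_connections_count gauge
emqx_connections_count{instance="emqx@n1.emqx.io",job="emqx"} 3
# TYPE go_goroutines gauge
go_goroutines 12'
printf '%s\n' "$sample" | count_emqx_samples
```

Against a running pushgateway the input would be curl -s localhost:9091/metrics. A count of 0 after the push interval has elapsed suggests that the plugin is not loaded or cannot reach the gateway.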
4.2 Docker-based deployment
docker run -d --name pushgateway1 --restart always -p 9091:9091 prom/pushgateway:v1.4.2
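4.3 Import the Grafana dashboard
5. FAQ
5.1 The TZ environment variable specified when starting the emqx container does not take effect; the time zone still has to be changed manually inside the container.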
docker exec -it emqx1 bash
sudo apk add --no-cache tzdata
sudo ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo "Asia/Shanghai" | sudo tee /etc/timezone