Redis Cluster Offline Deployment for Production

1. Deploy from source

1.1 Environment variables

  • /etc/profile.d/profile-redis.sh
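# Note: this URL ends in resources/redisctl, the same file step 1.4 downloads; confirm it serves the profile script before relying on it
sudo curl -L -o /etc/profile.d/profile-redis.sh https://gitee.com/wl4g/blogs/raw/master/docs/articles/database/redis-cluster-deploy/resources/redisctl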

# Apply it to the current shell
. /etc/profile

# Create the service user (-U also creates a group with the same name)
sudo useradd -U "$REDIS_USER"
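
Every later step depends on the variables that the profile script exports, so it can help to confirm they are set before going further. The following is a minimal sketch, assuming the profile defines exactly the variable names used in the commands of this article; `check_redis_env` is a hypothetical helper, not part of the downloaded scripts.

```shell
# Report which of the variables used in this guide are unset or empty.
# The list mirrors the names referenced by the commands in this article;
# the real profile script may define more (assumption).
check_redis_env() {
  missing=0
  for name in REDIS_USER REDIS_GROUP REDIS_HOME REDIS_BIN_DIR \
              REDIS_DATA_DIR REDIS_LOG_DIR REDIS_NODE_DIR; do
    # Indirect lookup that works in POSIX sh: read the variable named by $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be `check_redis_env && echo "environment looks complete"`, run right after sourcing `/etc/profile`.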

1.2 Build and install

# Redis 5.0+ is recommended: cluster creation no longer depends on Ruby, which simplifies setup
cd /tmp
# /tmp is world-writable, so no sudo is needed here; using sudo would leave a root-owned tree that a plain `make` cannot write to
wget -O redis.tar.gz https://github.com/antirez/redis/archive/5.0.7.tar.gz
tar -xf redis.tar.gz
cd redis-5.0.7 && make
# Copy the binaries into the install directory
sudo cp /tmp/redis-5.0.7/src/redis-cli \
  /tmp/redis-5.0.7/src/redis-server \
  /tmp/redis-5.0.7/src/redis-benchmark \
  /tmp/redis-5.0.7/src/redis-check-aof \
  /tmp/redis-5.0.7/src/redis-check-rdb \
  /tmp/redis-5.0.7/src/redis-sentinel \
  "${REDIS_BIN_DIR}"

# Leave the build tree before deleting it, then clean up
cd /tmp
rm -rf /tmp/redis-5.0.7
rm -rf /tmp/redis.tar.gz
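
Because the build tree is deleted at the end of this step, a missed copy is easy to overlook. As a rough sketch, the hypothetical helper below checks that each of the six binaries from step 1.2 exists and is executable in a given directory.

```shell
# Check that every Redis binary copied in step 1.2 exists and is executable.
# Usage: check_redis_binaries "$REDIS_BIN_DIR"
check_redis_binaries() {
  bin_dir=$1
  rc=0
  for bin in redis-cli redis-server redis-benchmark \
             redis-check-aof redis-check-rdb redis-sentinel; do
    if [ ! -x "$bin_dir/$bin" ]; then
      echo "not installed: $bin_dir/$bin" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Running `check_redis_binaries "$REDIS_BIN_DIR"` before removing `/tmp/redis-5.0.7` leaves room to repeat the copy if something is missing.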

1.3 Runtime configuration

  • /tmp/redis-config.tpl
# Download the config template
sudo curl -L -o /tmp/redis-config.tpl https://gitee.com/wl4g/blogs/raw/master/docs/articles/database/redis-cluster-deploy/resources/redis-config.tpl

# Generate a config file for every node
# (sudo does not apply to a shell redirect, so the file is written through sudo tee)
redisPorts=(6379 6380 6381 7379 7380 7381)
for port in "${redisPorts[@]}"; do sed "s/PORT/$port/g" /tmp/redis-config.tpl | sudo tee "$REDIS_HOME/conf/redis-$port.conf" >/dev/null; done
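
To see what the substitution loop above does without touching the real install, the sketch below renders a stand-in template in a scratch directory. The two-line template is hypothetical; the real `redis-config.tpl` contains many more directives, and only the `PORT` placeholder convention is taken from this article.

```shell
# Render one config file per port from a template containing the PORT placeholder.
# Usage: render_redis_configs TEMPLATE OUT_DIR PORT...
render_redis_configs() {
  template=$1
  out_dir=$2
  shift 2
  for port in "$@"; do
    sed "s/PORT/$port/g" "$template" > "$out_dir/redis-$port.conf"
  done
}

# Demo with a hypothetical two-line template in a scratch directory.
demo_dir=$(mktemp -d)
printf 'port PORT\ncluster-config-file nodes-PORT.conf\n' > "$demo_dir/redis-config.tpl"
render_redis_configs "$demo_dir/redis-config.tpl" "$demo_dir" 6379 6380
cat "$demo_dir/redis-6379.conf"
rm -rf "$demo_dir"
```

Every occurrence of `PORT` is replaced, so the template should not use that word for anything other than the placeholder.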

1.4 Management script

sudo curl -L -o /usr/bin/redisctl https://gitee.com/wl4g/blogs/raw/master/docs/articles/database/redis-cluster-deploy/resources/redisctl

sudo chmod +x /usr/bin/redisctl
redisctl --version

1.5 Cluster initialization

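  • Note: change the IP to match your environment. Cluster election and client slot routing both use this address, so avoid 127.0.0.1; otherwise every slot node address a client receives is 127.0.0.1 and connections from other machines time out.
# Start all nodes first
redisctl restart

# Initialize the cluster (the NIC IP is detected automatically; adjust as needed)
# Note: the IP used here becomes the address clients use to reach each slot. If the cluster is
# initialized with 127.0.0.1, clients on other machines also try 127.0.0.1 and get Connection refused
export localIp=$(ifconfig | grep -A 4 -E '^(eno|enp|ens|eth|wlp)' | grep 'inet ' | awk '{print $2}' | head -1)
redisctl cluster 123456 $localIp:6379,$localIp:6380,$localIp:6381,$localIp:7379,$localIp:7380,$localIp:7381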
  • Log output like the following indicates that initialization succeeded:
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 172.21.168.240:7380 to 172.21.168.240:6379
Adding replica 172.21.168.240:7381 to 172.21.168.240:6380
Adding replica 172.21.168.240:7379 to 172.21.168.240:6381
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
M: 3329717f8b69d4c9d773ae58d7101880082c1053 172.21.168.240:6379
   slots:[0-5460] (5461 slots) master
M: b1762f5c157d18ab617d9b64e0c9287542a1ac99 172.21.168.240:6380
   slots:[5461-10922] (5462 slots) master
M: ad92a5c35edab196d03091c7ff72b245c797b35c 172.21.168.240:6381
   slots:[10923-16383] (5461 slots) master
S: 0af4e11133c62cf242e9ddb7460a07dac4a2b9cd 172.21.168.240:7379
   replicates 3329717f8b69d4c9d773ae58d7101880082c1053
S: 3aab9b0b790a4758564f9a5c7491f738a84a893f 172.21.168.240:7380
   replicates b1762f5c157d18ab617d9b64e0c9287542a1ac99
S: 806cbb1e1dd388857196f6903c78b3b75f090644 172.21.168.240:7381
   replicates ad92a5c35edab196d03091c7ff72b245c797b35c
Can I set the above configuration? (type 'yes' to accept): >>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.......

>>> Performing Cluster Check (using node 172.21.168.240:6379)
......
M: b1762f5c157d18ab617d9b64e0c9287542a1ac99 172.21.168.240:6380
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
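
Beyond reading the creation log, the cluster state can be checked at any later time from the output of the `CLUSTER INFO` command. The sketch below is a hypothetical helper that reads that output on stdin and succeeds only when the state is `ok` and all 16384 slots are assigned.

```shell
# Succeeds when CLUSTER INFO text on stdin reports a healthy, fully assigned cluster.
cluster_is_healthy() {
  # The reply uses CRLF line endings, so strip carriage returns first.
  info=$(tr -d '\r')
  echo "$info" | grep -qx 'cluster_state:ok' &&
    echo "$info" | grep -qx 'cluster_slots_assigned:16384'
}
```

With the values used in this section it could be called as `redis-cli -h "$localIp" -p 6379 -a 123456 cluster info | cluster_is_healthy && echo healthy`.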

1.6 Kernel tuning

# Disable Transparent Huge Pages (THP)
# WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
# (sudo does not apply to a shell redirect, so the writes go through sudo tee)
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' | sudo tee -a /etc/rc.local # persist across reboots

sudo tee /etc/sysctl.d/10-redis.conf >/dev/null <<-'EOF'
# WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
vm.overcommit_memory = 1
net.core.somaxconn = 2048
EOF

# A bare `sysctl -p` only reads /etc/sysctl.conf, so name the new file explicitly
sudo sysctl -p /etc/sysctl.d/10-redis.conf
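
It can be worth confirming that the kernel actually picked up these values. The sketch below reads the same files the commands above write to; the helper names are hypothetical, and the file argument exists only so the checks can be tried against sample files.

```shell
# Succeeds when the THP "enabled" file marks "never" as the active mode.
# The kernel brackets the active mode, e.g. "always madvise [never]".
thp_is_disabled() {
  grep -q '\[never\]' "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Succeeds when the numeric kernel setting stored in FILE is at least MIN.
# Usage: setting_at_least FILE MIN
setting_at_least() {
  [ "$(cat "$1")" -ge "$2" ]
}
```

For the values in this section: `thp_is_disabled && setting_at_least /proc/sys/net/core/somaxconn 2048 && setting_at_least /proc/sys/vm/overcommit_memory 1`.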

1.7 Service configuration

  • Stop all nodes first, then start them through systemd so that the cluster comes back up automatically after an OS crash or reboot.
redisctl stop
sudo curl -L -o /etc/systemd/system/redis-cluster.service https://gitee.com/wl4g/blogs/raw/master/docs/articles/database/redis-cluster-deploy/resources/redis-cluster.service

# Set ownership and permissions for all directories in one place
sudo chown -R $REDIS_USER:$REDIS_GROUP $REDIS_HOME
sudo chmod -R 755 $REDIS_HOME
sudo chown -R $REDIS_USER:$REDIS_GROUP $REDIS_DATA_DIR
sudo chmod -R 744 $REDIS_DATA_DIR
sudo chown -R $REDIS_USER:$REDIS_GROUP $REDIS_LOG_DIR
sudo chmod -R 755 $REDIS_LOG_DIR
sudo chmod +x /usr/bin/redisctl

sudo systemctl daemon-reload
sudo systemctl enable redis-cluster
sudo systemctl restart redis-cluster
sudo systemctl status redis-cluster
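
If the service fails to start under systemd, wrong ownership or permissions on the Redis directories are a common cause. The sketch below offers two hypothetical helpers for spot-checking the result of the `chown` and `chmod` commands above; it assumes GNU `stat`, as shipped on Linux.

```shell
# Succeeds when PATH is owned by USER:GROUP.
# Relies on GNU stat (-c); BSD/macOS stat uses different flags.
# Usage: owned_by PATH USER GROUP
owned_by() {
  [ "$(stat -c '%U:%G' "$1")" = "$2:$3" ]
}

# Succeeds when PATH has exactly the given octal permission bits.
# Usage: has_mode PATH MODE
has_mode() {
  [ "$(stat -c '%a' "$1")" = "$2" ]
}
```

For example: `owned_by "$REDIS_HOME" "$REDIS_USER" "$REDIS_GROUP" && has_mode "$REDIS_HOME" 755`.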

2. FAQ

  • 2.1 The Redis startup log warns that kernel parameters need tuning

14717:M 17 May 2021 09:36:52.534 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
14717:M 17 May 2021 09:36:52.534 # Server initialized
14717:M 17 May 2021 09:36:52.534 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
14717:M 17 May 2021 09:36:52.534 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.

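  • Fix
# Apply the new backlog limit immediately, then persist both settings
echo 2048 | sudo tee /proc/sys/net/core/somaxconn
echo 'net.core.somaxconn = 2048' | sudo tee -a /etc/sysctl.conf
echo 'vm.overcommit_memory = 1' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
# Disable THP now and on every boot (sudo does not apply to a shell redirect, hence tee)
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' | sudo tee -a /etc/rc.local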
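
After restarting Redis, the same log can be scanned to confirm the warnings are gone. A minimal sketch follows, assuming the log format shown above; `redis_log_warnings` is a hypothetical helper, and the log file path depends on the config template.

```shell
# Print the distinct WARNING messages found in a Redis log file,
# dropping the "pid:role date time #" prefix so repeated warnings collapse.
# Usage: redis_log_warnings LOGFILE
redis_log_warnings() {
  grep '# WARNING' "$1" | sed 's/^.*# WARNING/WARNING/' | sort -u
}
```

Empty output means no kernel-tuning warnings remain in that log.
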
  • 2.2 Error during cluster initialization:

Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
[ERR] Node 127.0.0.1:6379 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0.

  • Fix

The cluster node files from an earlier initialization still exist. They must be deleted before the cluster can be initialized again.

# ${VAR:?} aborts if the variable is unset or empty, so this can never expand to `rm -rf /*`
rm -rf "${REDIS_NODE_DIR:?}"/*
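
Since this cleanup deletes everything under a directory named by a variable, a guarded wrapper is a reasonable precaution. The sketch below is a hypothetical helper that refuses to run when the argument is empty or is not an existing directory.

```shell
# Remove the contents of a node directory, refusing to run on an empty
# argument or a missing directory so a typo cannot expand to "/*".
# Usage: clear_node_dir "$REDIS_NODE_DIR"
clear_node_dir() {
  dir=$1
  if [ -z "$dir" ] || [ ! -d "$dir" ]; then
    echo "refusing to clean: '$dir' is not a directory" >&2
    return 1
  fi
  # ${dir:?} is a second guard: the shell aborts the expansion if dir is empty.
  rm -rf "${dir:?}"/*
}
```

Like the original command, this leaves the directory itself and any hidden files in place.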
