Quick Deployment of WireGuard, a Next-Generation High-Performance Full Connects VPN Service

1. Introduction

  • Official source code: github.com/WireGuard/wireguard-linux

  • Official quick start: wireguard.com/quickstart/

  • What is WireGuard?

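    From ordinary Linux users to Linux creator Linus Torvalds, everyone is interested in WireGuard. -- Abhishek Prakash (author)

    WireGuard is an easy-to-deploy, high-performance, and secure open-source VPN. It uses modern cryptography and, together with UDP hole punching, achieves full connectivity between peers. Its goal is to be a faster, simpler, leaner general-purpose VPN that can be deployed easily on anything from low-end devices such as a Raspberry Pi to high-end servers.
    Most other solutions, such as IPsec and OpenVPN, were developed decades ago. Jason Donenfeld, a security researcher and kernel developer, found them slow and hard to configure and manage correctly, which led him to create a new open-source VPN protocol and solution that is faster, more secure, and easier to deploy and manage.
    WireGuard was originally developed for Linux, but it is now available for Windows, macOS, BSD, iOS, and Android. It is still under active development.

  • Of course, it is not perfect. For example, its encryption and some of its high-speed packet segmentation features mostly pay off in large data centers, and on ordinary user devices it may not show a clear advantage over IPsec or OpenVPN. For a less enthusiastic view, see: icloudnative.io/posts/why-not-wireguard/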

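2. How It Works

  • WireGuard is implemented as a simple virtual network interface. The following description is quoted from the official site: wireguard.com/#simple-network-interface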

    WireGuard works by adding a network interface (or multiple), like eth0 or wlan0, called wg0 (or wg1, wg2, wg3, etc). This network interface can then be configured normally using ifconfig(8) or ip-address(8), with routes for it added and removed using route(8) or ip-route(8), and so on with all the ordinary networking utilities. The specific WireGuard aspects of the interface are configured using the wg(8) tool. This interface acts as a tunnel interface.

  • WireGuard associates tunnel IP addresses with public keys and remote endpoints. When the interface sends a packet to a peer, it does the following:

    • This packet is meant for 192.168.30.8. Which peer is that? Let me look... Okay, it's for peer ABCDEFGH. (Or if it's not for any configured peer, drop the packet.)

    • Encrypt entire IP packet using peer ABCDEFGH's public key.

    • What is the remote endpoint of peer ABCDEFGH? Let me look... Okay, the endpoint is UDP port 53133 on host 216.58.211.110.

    • Send encrypted bytes from step 2 over the Internet to 216.58.211.110:53133 using UDP.

  • When the interface receives a packet, this happens:

    • I just got a packet from UDP port 7361 on host 98.139.183.24. Let's decrypt it!

    • It decrypted and authenticated properly for peer LMNOPQRS. Okay, let's remember that peer LMNOPQRS's most recent Internet endpoint is 98.139.183.24:7361 using UDP.

    • Once decrypted, the plain-text packet is from 192.168.43.89. Is peer LMNOPQRS allowed to be sending us packets as 192.168.43.89?

    • If so, accept the packet on the interface. If not, drop it.

  • Behind the scenes there is much happening to provide proper privacy, authenticity, and perfect forward secrecy, using state-of-the-art cryptography.

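  • Summary of the mechanism: it works like a DDNS-style discovery step that first learns the public IP and mapped port that a device in a LAN behind a NAT gateway gets on its outbound packets. The remote peer then sends UDP packets directly to that public IP and mapped port, which gives Full Connects. Unlike a traditional IPsec/OpenVPN setup, traffic is not relayed through a central node: data packets do not pass through the WG server, so a large number of devices can connect with high performance.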
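
The send and receive steps quoted above boil down to one table that maps tunnel IP ranges to peers. The following is a minimal bash sketch of that lookup, not WireGuard's actual implementation. The ROUTES table, the helper names ip_to_int and lookup_peer, and the /24 range given to LMNOPQRS are assumptions made for illustration; the peer names and addresses come from the quoted example.

```shell
#!/usr/bin/env bash
# Hypothetical allowed-ips table: one "CIDR peer" pair per entry.
ROUTES=(
  "192.168.30.8/32 ABCDEFGH"
  "192.168.43.0/24 LMNOPQRS"
)

# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<<"$1"
  echo "$(( (a << 24) | (b << 16) | (c << 8) | d ))"
}

# Print the peer whose range contains the address (longest prefix wins).
# Returns 1 when no peer matches, which corresponds to "drop the packet".
lookup_peer() {
  local addr best_len=-1 best_peer="" entry cidr peer net len mask
  addr=$(ip_to_int "$1")
  for entry in "${ROUTES[@]}"; do
    read -r cidr peer <<<"$entry"
    net=$(ip_to_int "${cidr%/*}")
    len=${cidr#*/}
    mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    if (( (addr & mask) == (net & mask) && len > best_len )); then
      best_len=$len
      best_peer=$peer
    fi
  done
  [[ -n $best_peer ]] && echo "$best_peer"
}
```

On the send path, lookup_peer 192.168.30.8 picks the peer to encrypt for. On the receive path the same table answers whether a decrypted packet with source 192.168.43.89 may come from the peer that sent it.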

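3. Deploy on Docker (recommended for testing)

3.1 Install wg on all nodes first

Note: on Linux, the wireguard module has been built into the kernel since 5.6. With kernel >= 5.6 you do not need to install the wireguard module, only the WireGuard tools. (In my own testing, Ubuntu 20.04 ships a 5.4 kernel but already has the wireguard module integrated.)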
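
The check described in the note can be sketched as follows. This is only an illustration: the function names kernel_at_least_5_6 and check_wireguard are hypothetical, and the apt package hint assumes a Debian or Ubuntu system.

```shell
#!/usr/bin/env bash
# Return 0 if a kernel release string such as "5.15.0-91-generic" is at least 5.6.
kernel_at_least_5_6() {
  local major minor
  IFS=. read -r major minor _ <<<"${1%%-*}"
  (( major > 5 || (major == 5 && minor >= 6) ))
}

# Report whether the kernel module and the wg tool are available on this host.
check_wireguard() {
  if kernel_at_least_5_6 "$(uname -r)" || modinfo wireguard >/dev/null 2>&1; then
    echo "wireguard kernel module available; only the tools are needed"
  else
    echo "wireguard kernel module not found; install it for your distribution"
  fi
  # Assumption: Debian/Ubuntu package name; other distributions differ.
  command -v wg >/dev/null 2>&1 || echo "wg not found; try: sudo apt install wireguard-tools"
}
```

The modinfo fallback covers distributions that backport the module to an older kernel, as in the Ubuntu 20.04 case mentioned above. Run check_wireguard on each node before joining it to the network.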

3.2 Deploy the WG server (netmaker server)

#wget https://cdn.jsdelivr.net/gh/gravitl/netmaker@master/compose/docker-compose.yml

mkdir -p /mnt/disk1/netmaker/data
mkdir -p /mnt/disk1/log/netmaker/

cat <<EOF>docker-compose.yml
version: "3.4"
services:
  netmaker:
    container_name: netmaker
    image: gravitl/netmaker:v0.9
    volumes:
      #- /etc/netclient/config:/etc/netclient/config
      - /usr/bin/wg:/usr/bin/wg
      - dnsconfig:/root/config/dnsconfig
      - /mnt/disk1/netmaker/data/:/root/data
    cap_add:
      - NET_ADMIN
    restart: always
    network_mode: host # optional but recommended; see the note after this file about podman and bridge networking
    ports:
      - "9909:9909"
      - "50056:50056"
      - "51821-51830:51821-51830/udp"
    environment:
      SERVER_HOST: "SERVER_PUBLIC_IP"
      COREDNS_ADDR: "SERVER_INTRANET_IP"
      GRPC_SSL: "off"
      DNS_MODE: "on"
      CLIENT_MODE: "on" ## also register this netmaker server node as a peer
      API_PORT: "9909"
      GRPC_PORT: "50056"
      SERVER_GRPC_WIREGUARD: "off"
      CORS_ALLOWED_ORIGIN: "*"
      DATABASE: "sqlite"
      VERBOSITY: 3
  netmaker-ui:
    container_name: netmaker-ui
    depends_on:
      - netmaker
    image: gravitl/netmaker-ui:v0.9
    links:
      - "netmaker:api"
    ports:
      - "30800:80"
    environment:
      BACKEND_URL: "http://SERVER_PUBLIC_IP:9909"
    restart: always
    # network_mode: host
  coredns:
    depends_on:
      - netmaker
    image: coredns/coredns
    command: -conf /root/dnsconfig/Corefile
    container_name: coredns
    restart: always
    network_mode: host
    volumes:
      - dnsconfig:/root/dnsconfig
volumes:
  dnsconfig: {}
EOF

# Replace with your actual values
export publicIp='your server public ip'
export intranetIp=$(ip route get 1 | sed -n 's/^.*src \([0-9.]*\) .*$/\1/p')
#export topDomain='your server top domain'
export masterKey=$(tr -dc A-Za-z0-9 </dev/urandom | head -c 30 ; echo '')

sed -i "s/SERVER_PUBLIC_IP/$publicIp/g" docker-compose.yml
sed -i "s/SERVER_INTRANET_IP/$intranetIp/g" docker-compose.yml
#sed -i "s/NETMAKER_BASE_DOMAIN/$topDomain/g" docker-compose.yml
#sed -i "s/REPLACE_MASTER_KEY/$masterKey/g" docker-compose.yml

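## If the container engine is docker
docker-compose up -d

## If the container engine is containerd
nerdctl compose -f docker-compose.yml up -d

## If the container engine is podman
#pip3 install podman-compose
podman-compose up --force-recreate -d
  • Note: network_mode: host is optional. podman (at least as tested here) has no host network by default, so either create it manually or comment the line out and fall back to the bridge network. Be aware that with the bridge network and client_mode=on, the local node registered as a peer is actually the interface inside the container, not the host, so anything that reaches this peer over the virtual network only reaches the container. If you do not need the netmaker node itself registered as a peer, this has no effect.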

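  • Then open the netmaker console at http://<SERVER_PUBLIC_IP>:30800/ . On a cloud host, make sure the security group rules allow 30800/tcp and 9909/tcp (dashboard frontend and netmaker API port), 50056/tcp (the gRPC port, requested for example by netclient join), and 51821-51830/udp (the UDP ports that later receive the peers' keepalive heartbeats).

  • For more complete configuration, see the official docs: docs.netmaker.org/quick-start.html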
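
Before joining clients, it can help to confirm from another machine that the TCP ports listed above are reachable. The sketch below uses bash's /dev/tcp redirection; the names REQUIRED_TCP_PORTS, tcp_port_open, and check_server_ports are hypothetical. The UDP range 51821-51830 cannot be probed this way and is best verified through an actual WireGuard handshake.

```shell
#!/usr/bin/env bash
# TCP ports from the compose file: dashboard, API, gRPC.
REQUIRED_TCP_PORTS=(30800 9909 50056)

# Return 0 if a TCP connection to host $1, port $2 succeeds within 3 seconds.
tcp_port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Print one status line per required TCP port for the given host.
check_server_ports() {
  local host=$1 port
  for port in "${REQUIRED_TCP_PORTS[@]}"; do
    if tcp_port_open "$host" "$port"; then
      echo "$port/tcp open"
    else
      echo "$port/tcp CLOSED"
    fi
  done
}
```

Usage: check_server_ports "$publicIp", run from a machine outside the server's network so that the security group rules are actually exercised.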

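3.3 Deploy the WG peer client

  • netclient is used here to simplify the wg configuration
## Make sure the version matches the netmaker server, otherwise problems may occur
> curl -OL 'https://github.com/gravitl/netmaker/releases/download/v0.9.0/netclient'
> chmod +x netclient

## Then set up the network and an Access Key in the netmaker dashboard and copy the join command it shows, for example:
> sudo ./netclient join -t eyJhcGljb25uc3RyaWxxxxxxxxxxxx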
2022/06/18 12:45:24 [netclient] joining office-vpn at 118.31.64.80:50056
2022/06/18 12:45:24 [netclient] node created on remote server...updating configs
2022/06/18 12:45:24 [netclient] retrieving peers
2022/06/18 12:45:24 [netclient] starting wireguard
2022/06/18 12:45:25 [netclient] joined my-office-vpn
  • Check the connection status of the wg peers
> sudo wg show
interface: nm-my-vpn
  public key: VlccxyMYNkRV6jKFH6NFPQSK1UHfVrWdr6JX79AIgXA=
  private key: (hidden)
  listening port: 56849

peer: 18Ly39cptMQKgrBONMJ6zHh7516oLOqxxxxxxxxx
  endpoint: 118.xx.xx.xx:51821
  allowed ips: 192.168.8.1/32
  latest handshake: 14 seconds ago
  transfer: 3.79 KiB received, 1.25 KiB sent
  persistent keepalive: every 20 seconds

peer: YZpCjADSAtauFmPhxGySpkF5Gxk4/Exxxxxxx
  endpoint: 120.xx.xx.xx:40216
  allowed ips: 192.168.8.2/32
  latest handshake: 2 minutes, 13 seconds ago
  transfer: 6.50 KiB received, 3.92 KiB sent
  persistent keepalive: every 20 seconds
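
The "latest handshake" field is the quickest health signal in this output. As a small sketch, the helper below (stale_peers is a hypothetical name, and the 180-second default threshold is an arbitrary assumption) reads the tab-separated "public key, Unix timestamp" lines that wg show INTERFACE latest-handshakes prints and lists peers that have never completed a handshake or have been silent for too long.

```shell
#!/usr/bin/env bash
# Usage: wg show IFACE latest-handshakes | stale_peers NOW [MAX_AGE_SECONDS]
# Prints the public key of every peer whose last handshake is 0 (never)
# or older than MAX_AGE_SECONDS relative to NOW.
stale_peers() {
  local now=$1 max_age=${2:-180} key ts
  while IFS=$'\t' read -r key ts; do
    if (( ts == 0 || now - ts > max_age )); then
      echo "$key"
    fi
  done
}
```

Usage: sudo wg show nm-my-vpn latest-handshakes | stale_peers "$(date +%s)"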
  • View the wg peers configuration (it can be copied into clients such as TunSafe or the WireGuard Windows client)
> sudo wg showconf nm-my-vpn

[Interface]
ListenPort = 33623
PrivateKey = 0Pl6x0r3NtfGIglISiN3XFlyV7rcRaek28OyzYDeaHo=

[Peer]
PublicKey = LVvCZ4FGPPtTjAqdzhThgGhc4DUmIGSwVe7UR8VxukU=
AllowedIPs = 192.168.8.1/32
Endpoint = 118.xx.xx.xx:51821
PersistentKeepalive = 20

[Peer]
PublicKey = LVqimk87yJwZsu/7dle7rChj9rN1DHMNvluJSwkQeH8=
AllowedIPs = 192.168.8.2/32
Endpoint = 120.xx.xx.xx:40216
PersistentKeepalive = 20
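
Note that wg showconf prints only the keys that wg itself understands, so the tunnel address (192.168.8.3/24 in this example) is not part of the output. Clients that import a wg-quick style file generally expect an Address line in the [Interface] section. A minimal sketch of adding it, where with_address is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Read a wg showconf dump on stdin and print it with an
# "Address = <addr>" line inserted right after [Interface].
with_address() {
  local address=$1
  awk -v addr="$address" '
    { print }
    $0 == "[Interface]" { print "Address = " addr }
  '
}
```

Usage: sudo wg showconf nm-my-vpn | with_address 192.168.8.3/24 > client.conf . The resulting file contains the private key, so keep its permissions tight.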
  • Look at the corresponding local virtual interface (you will see one named nm-my-vpn, which is the network name created in the netmaker dashboard)
> ip a
...
32: nm-my-vpn: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1280 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none 
    inet 192.168.8.3/24 scope global nm-my-vpn  # this machine was assigned 192.168.8.3
       valid_lft forever preferred_lft forever
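  • Verify the virtual network
# On peerA, a machine inside one private network, start a test service
> python3 -m http.server

# Then request that service from peerB, a machine inside another private network (peerA and peerB are two LAN machines very far apart)
> curl -v 192.168.8.2:8000  # corresponds to the UDP hole-punched public address 120.xx.xx.xx:40216
  • In effect, netclient is equivalent to a few commands that set up a virtual interface, as the demo script from the official WireGuard site (wireguard.com/quickstart/) shows: git.zx2c4.com/wireguard-tools/plain/contrib/ncat-client-server/client.sh . That address may be blocked by the Great Firewall, so its content is pasted here for convenience: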

    #!/bin/bash
    set -ex
    [[ $UID == 0 ]] || { echo "You must be root to run this."; exit 1; }
    exec 3<>/dev/tcp/demo.wireguard.com/42912
    privatekey="$(wg genkey)"  # first generate a private key with the wg tools
    wg pubkey <<<"$privatekey" >&3 # derive the public key from the private key and send it to the demo server
    IFS=: read -r status server_pubkey server_port internal_ip <&3
    [[ $status == OK ]]
    ip link del dev wg0 2>/dev/null || true
    ip link add dev wg0 type wireguard ## add the virtual network interface wg0
    wg set wg0 private-key <(echo "$privatekey") peer "$server_pubkey" allowed-ips 0.0.0.0/0 endpoint "demo.wireguard.com:$server_port" persistent-keepalive 25  ## configure the key of wg0 and the endpoint address of the wg server
    ip addr add "$internal_ip"/24 dev wg0 ## assign the virtual address to wg0
    ip link set up dev wg0 ## bring up the virtual interface wg0
    if [ "$1" == "default-route" ]; then
    host="$(wg show wg0 endpoints | sed -n 's/.*\t\(.*\):.*/\1/p')"
    ip route add $(ip route get $host | sed '/ via [0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/{s/^\(.* via [0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\).*/\1/}' | head -n 1) 2>/dev/null || true
    ip route add 0/1 dev wg0 ## add routes (steer the traffic that should use this virtual network into wg0)
    ip route add 128/1 dev wg0
    fi
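
To make the "few commands" point concrete without the demo server, here is a dry-run sketch of the same sequence for one statically known peer. The helper names run and wg_up, the DRY_RUN switch, and the argument order are all assumptions for illustration; the ip and wg subcommands are the ones used in the script above. By default it only prints the commands. Set DRY_RUN=0 and run as root to execute them.

```shell
#!/usr/bin/env bash
# Print the command (default) or execute it when DRY_RUN=0.
run() {
  if [[ ${DRY_RUN:-1} == 1 ]]; then
    echo "$*"
  else
    "$@"
  fi
}

# wg_up IFACE ADDRESS PRIVATE_KEY_FILE PEER_PUBKEY PEER_ENDPOINT ALLOWED_IPS
wg_up() {
  local iface=$1 address=$2 keyfile=$3 peer_pubkey=$4 endpoint=$5 allowed=$6
  run ip link add dev "$iface" type wireguard          # create the interface
  run wg set "$iface" private-key "$keyfile" peer "$peer_pubkey" \
      allowed-ips "$allowed" endpoint "$endpoint" persistent-keepalive 20
  run ip addr add "$address" dev "$iface"              # assign the tunnel address
  run ip link set up dev "$iface"                      # bring the interface up
}
```

Usage: wg_up wg0 192.168.8.3/24 ./privatekey '<peer public key>' 118.xx.xx.xx:51821 192.168.8.1/32 . What netclient adds on top of this is keeping each peer's endpoint up to date as NAT mappings change.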
  • The official Windows client from download.wireguard.com/windows-client/ (one-click installer).

  • The Windows client from the TunSafe download page (one-click installer).

4. Deploy on Kubernetes (recommended for production)

5. FAQ

5.1 Starting the netmaker container with nerdctl run fails with: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: time="2022-06-09T10:54:34+08:00" level=fatal msg="failed to call cni.Setup: plugin type="bridge" failed (add): incompatible CNI versions; config is "1.0.0", plugin supports ["0.1.0" "0.2.0" "0.3.0" "0.3.1" "0.4.0"]"

  • The cause is that the CNI version declared in the network config is not supported by the installed CNI plugin. Check with cat /etc/cni/net.d/nerdctl-wireguard_default.conflist whether it specifies 1.0.0; if so, try changing it to 0.4.0.
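
The change can be scripted as below. This is a sketch under the assumption that the file declares the version in the standard cniVersion field and that GNU sed is available; downgrade_cni_version is a hypothetical helper name. It keeps a .bak copy next to the original.

```shell
#!/usr/bin/env bash
# Rewrite "cniVersion": "1.0.0" to "0.4.0" in the given conflist, keeping a backup.
downgrade_cni_version() {
  local conflist=$1
  cp -- "$conflist" "$conflist.bak"
  sed -i 's/"cniVersion"[[:space:]]*:[[:space:]]*"1\.0\.0"/"cniVersion": "0.4.0"/' "$conflist"
}
```

Usage: sudo bash -c "$(declare -f downgrade_cni_version); downgrade_cni_version /etc/cni/net.d/nerdctl-wireguard_default.conflist" , then start the container again. Upgrading the CNI plugins to a release that supports 1.0.0 is the alternative fix.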

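5.2 Errors you may hit when configuring WireGuard on Windows, and how to fix them

  • Errors you may see when running netclient.exe (note: run it in PowerShell; in a plain cmd window it may fail because wg.exe cannot be found)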
D:\download> .\netclient.exe join -t xxxxxxxxxxxx
2022/06/19 14:30:40 Gravitl Netclient on Windows started
2022/06/19 14:30:40 [netclient] joining sc-devvpn at 120.xx.xx.xx:50056
2022/06/19 14:30:41 [netclient] node created on remote server...updating configs
2022/06/19 14:30:41 [netclient] retrieving peers
2022/06/19 14:30:41 [netclient] starting wireguard
2022/06/19 14:30:41 [netclient] writing wg conf file to: C:\ProgramData\Netclient\nm-sc-devvpn.conf
2022/06/19 14:30:41 [netclient] waiting for interface...
2022/06/19 14:30:51 [netclient] error installing: could not create wg interface for nm-sc-devvpn
2022/06/19 14:30:51 [netclient] removed machine from sc-devvpn network on remote server
2022/06/19 14:30:51 error running command: wireguard.exe /uninstalltunnelservice nm-sc-devvpn
2022/06/19 14:30:51
2022/06/19 14:30:51 [netclient] unable to wipe local config
2022/06/19 14:30:51 [netclient] used backup file for network: sc-devvpn
2022/06/19 14:30:51 [netclient] error removing artifacts: open C:\ProgramData\Netclient\netconfig-sc-devvpn: The system cannot find the file specified.
2022/06/19 14:30:51 [netclient] error removing services: open C:\ProgramData\Netclient\netconfig-sc-devvpn: The system cannot find the file specified.
2022/06/19 14:30:51 open C:\ProgramData\Netclient\netconfig-sc-devvpn: The system cannot find the file specified.
PS D:\download>
D:\download> .\netclient.exe join -t xxxxxxxxxxxx
2022/06/19 14:26:19 Gravitl Netclient on Windows started
2022/06/19 14:26:19 [netclient] joining sc-devvpn at 120.xx.xx.xx:50056
2022/06/19 14:26:20 [netclient] node created on remote server...updating configs
2022/06/19 14:26:20 [netclient] retrieving peers
2022/06/19 14:26:20 [netclient] starting wireguard
2022/06/19 14:26:20 [netclient] writing wg conf file to: C:\ProgramData\Netclient\nm-sc-devvpn.conf
2022/06/19 14:26:20 [netclient] error writing wg conf file to C:\Program Files\WireGuard\Data\Configurations\nm-sc-devvpn.conf: open C:\Program Files\WireGuard\Data\Configurations\nm-sc-devvpn.conf: The system cannot find the path specified.
2022/06/19 14:26:20 [netclient] error installing: open C:\Program Files\WireGuard\Data\Configurations\nm-sc-devvpn.conf: The system cannot find the path specified.
2022/06/19 14:26:20 [netclient] removed machine from sc-devvpn network on remote server
2022/06/19 14:26:20 error running command: wireguard.exe /uninstalltunnelservice nm-sc-devvpn
2022/06/19 14:26:20
2022/06/19 14:26:20 [netclient] unable to wipe local config
2022/06/19 14:26:20 [netclient] used backup file for network: sc-devvpn
2022/06/19 14:26:20 [netclient] error removing artifacts: open C:\ProgramData\Netclient\netconfig-sc-devvpn: The system cannot find the file specified.
2022/06/19 14:26:20 [netclient] error removing services: open C:\ProgramData\Netclient\netcon
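  • On Windows, netclient join depends on C:\Program Files\WireGuard\wg.exe and C:\Program Files\WireGuard\wireguard.exe
    Source analysis: https://github1s.com/gravitl/netmaker/blob/master/netclient/ncutils/netclientutils.go#L49-L50

  • In my testing, although netclient calls wg.exe and wireguard.exe, you must not (and do not need to) start wireguard.exe explicitly yourself, otherwise the two conflict. netclient itself runs wireguard.exe /installtunnelservice "%s" to install the tunnel as a system service; see the source:
    https://github1s.com/gravitl/netmaker/blob/master/netclient/wireguard/windows.go#L13-L14

  • Summary: the usual setup is to install wireguard-amd64-0.5.3.msi and use it together with netclient.exe. In essence, netclient.exe calls wireguard.exe and related tools from the command line, so never start both yourself at the same time: they conflict, and full connectivity to the peers fails. (netclient provides dynamic discovery; with wireguard.exe alone, the NAT-mapped port of each peer cannot be discovered dynamically.)

If you use wireguard-amd64.msi with netclient.exe, there is no need to install TunSafe-Tap. As for using TunSafe as the Windows client, I have not found a way to change which TunSafe server it connects to, and that server is unreachable from behind the Great Firewall, so this option is not considered.

  • If the output reports a successful join but you still cannot connect, repeating the following steps a few times usually helps
## 1. Stop the peer connection first
./netclient.exe leave -n sc-devvpn

## 2. Uninstall netclient
./netclient.exe uninstall

## 3. Rejoin the peer group
./netclient.exe join -t xxxxxx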

5.3 What to do when UDP hole punching through NAT is disrupted by the ISP?
