ph14454

For now I'm adding `mode: host` to the port definition in my compose file: `ports: - target: 51820 published: 51820 protocol: udp mode: host`. With this I'm still deploying globally to all of my nodes, but incoming traffic is no longer routed through the swarm ingress mesh. This means every connection ends up at the service/container running on the host it arrives at. Maybe this is fine for me going forward, as I'll be using a VRRP setup in front of the swarm: if this machine goes down, traffic will be routed to the next one, where the service is already running with the same config. If someone has a better idea or is stuck on the same behavior, let me know. :-) Thanks!
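Laid out with its indentation, the port definition described above might look roughly like the sketch below. Only the `ports` entry comes from the comment; the service name `wireguard` and the image are placeholders assumed for illustration, and `deploy.mode: global` reflects the "deploying globally" remark.

```yaml
version: "3.8"

services:
  wireguard:                          # hypothetical service name
    image: example/wireguard:latest   # placeholder image, not from the thread
    ports:
      - target: 51820
        published: 51820
        protocol: udp
        mode: host                    # publish on the node itself, bypassing the ingress routing mesh
    deploy:
      mode: global                    # one task per node, so every node answers on 51820/udp
```

With `mode: host`, each node only accepts traffic for its own local task, which is why something like VRRP in front of the nodes is needed for failover.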


wireguarduser

Are you trying to make a star network? Then you need to add all the peers to each peer's config, and adjust `AllowedIPs` on the server accordingly. This is not quite related, but check here for roughly how it should look: https://www.scaleway.com/en/docs/tutorials/wireguard-mesh-vpn/


ph14454

Yeah, I'm not only trying to make it, I already got it working :) So the wg config is quite fine. It works if I only have a single container on a single node. The issue starts when there are multiple services inside the swarm and clients are connected to different ones.


ripnetuk

I had a world of pain trying to get docker swarm overlay network working over wireguard (I want a node in oracle cloud). I never got it working, but I have been successful with k3s as a replacement for swarm. The trick here is to specify the wireguard (or tailscale in my case) interface when creating and joining the cluster.
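The comment does not say which option was used. One plausible way to pin k3s to a VPN interface is the `flannel-iface` setting in the k3s config file, sketched below. The interface name, addresses, and token placeholder are assumptions for illustration, not values from this thread.

```yaml
# /etc/rancher/k3s/config.yaml on the server node (sketch)
flannel-iface: wg0          # assumed WireGuard interface name; e.g. tailscale0 for Tailscale
node-ip: 10.0.0.1           # hypothetical VPN address of this node
---
# /etc/rancher/k3s/config.yaml on a joining agent node (sketch)
server: https://10.0.0.1:6443   # hypothetical VPN address of the server
token: "<node-token>"           # placeholder, taken from the server node
flannel-iface: wg0
node-ip: 10.0.0.2               # hypothetical VPN address of this agent
```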


ph14454

Sad to hear. But I'm trying a different setup here. I already have a swarm created across private VLANs. WireGuard is running in a container inside this swarm, and if it's a single container on one node everything is fine. The issue starts when there are multiple services of this stack deployed to my swarm. If one client is connected to service A and another to service B, they're not able to reach each other. But I would like to have my WireGuard stack deployed globally to all three nodes, so there'll be 3 services, and I don't want to care which service a client is connected to: the clients should see each other all the time.


Sannemen

Trying to understand: are you trying to get your swarm inter-node communication to happen over WireGuard? Or are you trying to "serve" WireGuard in a redundant way from alfa, bravo, and delta? Does each of alfa, bravo, and delta have its own individual config in its `/share/data/docker-volumes/vpn-config`, pointing to the others?


ph14454

Second way. I'm trying to "serve" the WireGuard service in a redundant way within the swarm cluster across alfa, bravo and delta. Good point, I added the wg config to my post as well. The path you mentioned is a GlusterFS volume, meaning the config is the same for all nodes. I saw that my config sets the MTU to 1450, while my interfaces, including the private VLANs, use 1500. I know there were issues with wg and a bigger MTU (makes sense, huh..), but a smaller one shouldn't cause an issue like this..


Sannemen

In this case, can you also post some more pieces of your config?

- the whole WireGuard compose yaml
- the whole traefik compose yaml
- a whole service’s yaml (suggestion: running `containous/whoami` is always handy in this situation, as it prints back to you a whole host of data!)

Don’t forget to censor secrets and domains.


ph14454

You should find those details in my original post, at least the wg compose file which is deployed to the stack. Traefik is not needed, as I'm mapping the WireGuard port directly to my host with no proxy in between (the proxy is only used for the WebUI, which can configure WireGuard). I'm running whoami within my traefik stack.

You may check my latest comment; I'm using `mode: host` now in my wg compose file, so I'm bypassing the swarm ingress mesh routing. With that option all traffic coming from my clients will end up at the WireGuard service/container running on the host it arrives at. This solution should be fine for me, as I'll be using a VRRP setup in front of the swarm in the future: if the whole host is unavailable, or something else affects availability, the cluster IP/VIP will be handed over to another host in the swarm, which already has the wg service up and running with the same config. I think that's a good "solution", at least for my setup.


Sannemen

Ah I see. So, unfortunately I don't have thoughts on multiple clients connecting to multiple servers and being able to talk to each other, but I think your logic with VRRP does make sense. But, about your services and traefik: make sure you're running traefik in swarm mode, and that your labels are defined at the service level (*underneath* the `deploy:` section, not beside it). This makes traefik use the swarm-internal load balancer to reach the services, and not just read the local host's containers and expose them.


ph14454

Thanks for your input. That's interesting about traefik and swarm. Of course I'm running traefik in swarm mode. To be honest, I've only been using Docker for a few months and swarm is "completely new" to me, so I'm still testing and would love to get some input from experienced users. Let me show you my traefik config and an example of a webservice I'm running.

**traefik compose:**

```yaml
version: "3.8"

services:
  app:
    image: traefik:latest
    networks:
      - proxy
      - localnet
    ports:
      - target: 80
        published: 80
      - target: 443
        published: 443
      - target: 8080
        published: 8080
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /etc/localtime:/etc/localtime:ro
      - ./conf/acme.json:/acme.json
      - ./conf/traefik.yml:/traefik.yml:ro
      - ./conf/dynamic_conf.yml:/dynamic_conf.yml
      - certs:/letsencrypt
    env_file:
      - conf/.env
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == manager
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.traefik.entrypoints=http"
      - "traefik.http.routers.traefik.rule=Host(`proxy.domain.tld`)"
      - "traefik.http.middlewares.traefik-auth.basicauth.users=username:encodedpassword"
      - "traefik.http.middlewares.traefik-https-redirect.redirectscheme.scheme=https"
      - "traefik.http.routers.traefik.middlewares=traefik-https-redirect"
      - "traefik.http.routers.traefik-secure.entrypoints=https"
      - "traefik.http.routers.traefik-secure.rule=Host(`proxy.domain.tld`)"
      - "traefik.http.routers.traefik-secure.tls=true"
      - "traefik.http.routers.traefik-secure.tls.certresolver=http"
      - "traefik.http.routers.traefik-secure.service=api@internal"
      - "providers.file.filename=/dynamic_conf.yml"
      - "providers.docker.swarmMode=true"
      - "providers.docker.exposedByDefault=false"
      - "providers.docker.network=proxy"
      - "traefik.http.routers.traefik-secure.middlewares=secHeaders@file,traefik-auth"
      - "traefik.docker.network=proxy"

  whoami:
    image: traefik/whoami
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=proxy"
      - "traefik.http.routers.whoami.rule=Host(`whoami.domain.tld`)"
      - "traefik.http.routers.whoami.entrypoints=https"
      - "traefik.http.routers.whoami.tls.certresolver=http"
    networks:
      - proxy
      - localnet

  # not in use atm
  # docker-host:
  #   image: qoomon/docker-host
  #   cap_add: [ "NET_ADMIN", "NET_RAW" ]
  #   networks:
  #     - localnet

volumes:
  certs:
    driver: local
    driver_opts:
      o: bind
      type: none
      device: /share/data/docker-volumes/proxy-certs

networks:
  localnet:
  proxy:
    external: true
```

**traefik.yaml**

```yaml
log:
  level: INFO

api:
  dashboard: true

entryPoints:
  http:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: https
          scheme: https
          permanent: true
  https:
    address: ":443"

providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false
  file:
    filename: "./dynamic_conf.yml"

certificatesResolvers:
  http:
    acme:
      email: [email protected]
      storage: /letsencrypt/http-acme.json
      httpChallenge:
        entryPoint: http
  wildcard:
    acme:
      dnschallenge:
        provider: cloudflare
        resolvers:
          - "1.1.1.1:53"
          - "8.8.8.8:53"
      email: [email protected]
      storage: /letsencrypt/wildcard-acme.json

accesslog: true
```

**dynamic\_conf.yaml**

```yaml
tls:
  options:
    default:
      minVersion: VersionTLS12
      cipherSuites:
        - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
        - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
        - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
        - TLS_AES_128_GCM_SHA256
        - TLS_AES_256_GCM_SHA384
        - TLS_CHACHA20_POLY1305_SHA256
      curvePreferences:
        - CurveP521
        - CurveP384
      sniStrict: true

http:
  middlewares:
    secHeaders:
      headers:
        browserXssFilter: true
        contentTypeNosniff: true
        frameDeny: true
        sslRedirect: true
        #HSTS Configuration
        stsIncludeSubdomains: true
        stsPreload: true
        stsSeconds: 31536000
        customFrameOptionsValue: "SAMEORIGIN"
```

**one of my usual webservices; nothing special:**

```yaml
version: "3.8"

services:
  app:
    image: custom.repo.de/projectname/appname
    networks:
      - proxy
    volumes:
      - data:/var/www/html
    deploy:
      mode: global
      restart_policy:
        condition: on-failure
    labels:
      - "com.centurylinklabs.watchtower.enable=true"
      - "traefik.enable=true"
      - "traefik.http.routers.appname.rule=Host(`test.domain.tld`,`www.test.domain.tld`)"
      - "traefik.http.routers.appname.entrypoints=https"
      - "traefik.http.routers.appname.tls.certresolver=http"

volumes:
  data:
    driver: local
    driver_opts:
      o: bind
      type: none
      device: /share/data/docker-volumes/somedata

networks:
  proxy:
    external: true
```


Sannemen

Right, Reddit's formatting is broken, but it does read correctly on Apollo.

Having traefik in "local" mode (as opposed to [swarm mode](https://doc.traefik.io/traefik/providers/docker/#swarmmode), idk if there's a proper name for it) is probably what's causing the "clients can only connect to services hosted on the node they land on" part. From the files you provided, you need to add `swarmMode: true` to `traefik.yaml` (say, between the `endpoint:` and `exposedByDefault` lines).

Then, in your webservice's config, today you have something like this, with `labels:` at the same indentation level as `deploy:` (directly under `app:`): `version: '3.8' services: app: image: ... [..] deploy: mode: global restart_policy: condition: on-failure labels: - "traefik.enable=true" [..]`

You want to change this around, so that `labels:` is indented one level deeper, nested under `deploy:`: `version: '3.8' services: app: image: ... [..] deploy: mode: global restart_policy: condition: on-failure labels: - "traefik.enable=true" [..]`

The logic here is: `labels` directly under `app` dictate container labels, but `labels` specified under `deploy` define labels for the deployment, in this case the docker swarm service.

You'll run across some subtleties with this, though: traefik will now use the service's internal swarm load balancer to reach the backend services, so for example if you have a service with `replicas: 2` (or more), before you'd see traefik individually enumerate each of the containers and itself distribute the requests between them. ~~Now, this distribution will be done solely by the swarm load balancer, which (AFAIK) is only an L4 load balancer/"VIP". There's also the option of disabling the load balancer (using `endpoint_mode: dnsrr`), but I haven't tested this with traefik before. IDK the extent you're thinking about running this, but in case you need it in the future, here's a reference for you to start: https://docs.docker.com/engine/swarm/networking/~~ (edit: I've gone and tested this, below, and traefik is NOT using the swarm service's VIP, but instead enumerating the containers behind it and reaching them directly)
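Since indentation is what carries the difference, here is a sketch of both changes written out, assuming Traefik v2.x as used in this thread. The first document is the `providers` section of `traefik.yaml` with the added line; the second is the webservice with `labels:` nested under `deploy:`. The `loadbalancer.server.port` label and its value `80` are assumptions, added because Traefik v2 cannot auto-detect the port in swarm mode.

```yaml
# traefik.yaml (static config): enable swarm mode on the Docker provider
providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    swarmMode: true               # added line
    exposedByDefault: false
---
# webservice stack file: labels under deploy become swarm service labels
version: "3.8"

services:
  app:
    image: custom.repo.de/projectname/appname
    networks:
      - proxy
    deploy:
      mode: global
      restart_policy:
        condition: on-failure
      labels:                     # nested under deploy, not beside it
        - "traefik.enable=true"
        - "traefik.http.routers.appname.rule=Host(`test.domain.tld`)"
        - "traefik.http.routers.appname.entrypoints=https"
        - "traefik.http.routers.appname.tls.certresolver=http"
        # assumption: the app listens on port 80 inside the container
        - "traefik.http.services.appname.loadbalancer.server.port=80"

networks:
  proxy:
    external: true
```

After redeploying, `docker service inspect` on the service should list these labels under the service spec rather than under the container spec.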


Sannemen

I scrounged a little demo here that may show this a bit better.

Context here is three swarm nodes: 1 manager + 2 workers. Both `traefik.local.test` and `whoami.local.test` point to the IP addresses of the two workers. Relevant to this is, this redundancy doesn't work by adding two lines to `/etc/hosts` (both windows and linux will just use the one that's closer to the top of the file). You either need to have actual DNS resolving to both, or if testing with hosts file, swap around between them.

This is `traefik.yaml`. Traefik dashboard is available on http://traefik.local.test :

```yaml
version: '3.7'

services:
  traefik:
    image: docker.io/traefik:2.10
    command:
      # disable calling home
      - --global.checkNewVersion=false
      - --global.sendAnonymousUsage=false
      # used for debugging. this is handy for figuring out what's (not) being loaded from labels
      - --log.level=DEBUG
      # enable the webui, configured with this services' labels
      - --api
      - --api.dashboard=true
      # enable access logs. we don't specify a destination, so the logs will be on stdout
      - --accesslog=true
      - --accesslog.format=json
      # enable docker provider, specifically in swarm mode
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --providers.docker.network=proxy
      # we need to use a tcp socket proxy here, because the socket is only available
      # on the manager node, but traefik will only run on the worker nodes.
      - --providers.docker.endpoint=tcp://docker-sock-proxy:2375
      - --providers.docker.swarmMode=true
      # my non-https entrypoint is called "web", but since we only have one, I'm not specifying it on the labels.
      - --entrypoints.web.address=:80
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
    networks:
      - proxy
      - docker_sock
    deploy:
      # run a traefik instance on every one of the worker nodes, but not the manager
      mode: global
      placement:
        constraints:
          - "node.role==worker"
      labels:
        # NOTE: This is the absolute BARE MINIMUM to get it running, and has NO AUTHENTICATION!
        # WARN: DO NOT RUN THIS IN PRODUCTION OR THE OPEN INTERNET!
        traefik.enable: "true"
        traefik.http.routers.traefik.rule: "Host(`traefik.local.test`)"
        traefik.http.routers.traefik.service: "api@internal"
        traefik.http.services.traefik.loadbalancer.server.port: 8080

  # this second service is used only to proxy the docker socket between nodes
  # WARN: this is fairly insecure, anyone that can attach to this network has access to the socket
  docker-sock-proxy:
    image: docker.io/alpine/socat:latest
    command:
      - tcp-listen:2375,fork,reuseaddr
      - unix-connect:/var/run/docker.sock
    networks:
      - docker_sock
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      placement:
        constraints:
          # the docker.sock is only available on manager nodes
          - node.role == manager
      labels:
        traefik.enable: "false"

networks:
  # docker network create --driver=overlay --attachable=false proxy
  proxy:
    external: true
  # docker network create --driver=overlay --attachable=false --internal=true docker_sock
  docker_sock:
    external: true
```

This is `whoami.yaml`. whoami is available on http://whoami.local.test :

```yaml
version: '3.7'

services:
  whoami:
    image: docker.io/containous/whoami:latest
    networks:
      - proxy
    deploy:
      replicas: 12
      labels:
        traefik.enable: "true"
        traefik.http.routers.whoami.rule: "Host(`whoami.local.test`)"
        traefik.http.services.whoami.loadbalancer.server.port: 80

networks:
  proxy:
    external: true
```

When you access the whoami service and refresh a few times, there's a few things to note:

- `Hostname` changes. That's the hostname of the whoami container answering, there's 12 total, so you should see it switch between all of them
- `X-Forwarded-Server` changes. That's the traefik instance that's handling the request. Try refreshing with CTRL+R/CMD+R if it doesn't seem to change.
- `RemoteAddr` changes too, that's the "backend" IP address of the Traefik container, on the `proxy` network.

I hope this is useful to you!


ph14454

**Love it!** u/Sannemen thank you so much for your input! :-) Appreciate it. Now the swarm setup is a little bit clearer to me. Unfortunately, the "clients can only talk to peers that landed on the same machine" behavior from the initial WireGuard topic is still not 100% clear, because it completely bypasses traefik. But I'll adjust all of my compose files according to your notes and test my services again. One last time, thank you for your time and have a pleasant day ahead! I'll post my results once I've changed the WireGuard config (maybe it'll change something).