Conversation

@marcuslinke commented Sep 14, 2017

WIP: Combines the efforts of #476 and #520 for supporting swarm mode services.

This PR implements registration of swarm mode vip services, with one limitation: ingress-routed services (routed via the routing mesh) will be registered on swarm manager nodes only. On worker nodes this is not possible at the moment because swarm services can't be inspected via a worker node's remote API. This probably isn't a big limitation as long as the swarm has enough manager nodes.

A short summary of the features added with this PR:

  • Register swarm vip services. Ingress-routed services will be registered on manager nodes only, while DnsRR services will be registered on the nodes where their containers are running. Updates to swarm vip services, especially network-specific reconfiguration (dnsrr/ingress), will be reflected in the registry backend.
  • Optional swarm manager service (port 2376) registration was added (-swarm-manager-servicename option). This includes dynamic registration of all manager-dependent services, including ingress services. Promoting/demoting swarm nodes will be reflected dynamically.
  • Respect swarm service labels for tagging. This allows dynamic re-tagging of registered services without container recreation (via docker service update --label-rm "SERVICE_80_TAGS=foo" --label-add "SERVICE_80_TAGS=bar").
  • Refresh ALL registered services when configured to do so (-ttl-refresh).
  • Better external IP address detection when deployed on a swarm node. Usually there is no need to pass the -ip option in this case anymore.
  • Allow passing an interface name via the -ip option (for example -ip eth0). This is useful on plain docker hosts to avoid hard-wiring a specific IP address into the registrator container.
  • It's now possible to configure multiple service names using a comma-separated list (for example SERVICE_NAME=foo,bar); see the sketch after this list.
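
For illustration, a minimal sketch of the tagging and multi-name features (the image nginx:alpine and the names foo, bar and web are placeholders, not part of this PR's test setup):

# plain docker host: one container registered under two names
docker run -d -p 80:80 -e "SERVICE_NAME=foo,bar" nginx:alpine

# swarm service: tags come from service labels and can be changed in place
docker service create --name web --publish 80:80 \
  --label "SERVICE_80_TAGS=foo" nginx:alpine
docker service update --label-rm "SERVICE_80_TAGS=foo" --label-add "SERVICE_80_TAGS=bar" web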

Two new options were added:

-swarm-manager-servicename string
    	Register swarm manager service when non-empty
-swarm-replicas-aware
    	Remove registered swarm services without replicas (default true)
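
A hedged example invocation using the new options (the consul address is a placeholder and the image tag is whichever build from the link below is current):

docker run -d --net=host \
  -v /var/run/docker.sock:/tmp/docker.sock \
  marcuslinke/registrator \
  -ip eth0 \
  -ttl=120 -ttl-refresh=55 \
  -swarm-manager-servicename=swarm \
  -swarm-replicas-aware=true \
  consul://localhost:8500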

Image for testing is here https://hub.docker.com/r/marcuslinke/registrator/

@progrium Please review. Thanks!

@tuan3w commented Oct 2, 2017

Hi @marcuslinke
I tried to test your image, but I got this error:

registrator_1  | 2017/10/02 03:07:31 Syncing swarm mode vip services. Swarm control available: true
registrator_1  | 2017/10/02 03:07:31 added: swarm vip service ip2loc:8089
registrator_1  | 2017/10/02 03:07:31 registered 1 services for swarm service 0aowu9zo26qk468q5vqs56l7b 
registrator_1  | 2017/10/02 03:07:31 added: swarm vip service viz:12000
registrator_1  | 2017/10/02 03:07:31 registered 1 services for swarm service 0o0uhmehpb5glo327surgtg6h 
registrator_1  | panic: runtime error: invalid memory address or nil pointer dereference
registrator_1  | [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x55f4c73dfce2]
registrator_1  | 
registrator_1  | goroutine 1 [running]:
registrator_1  | panic(0x55f4c7c34520, 0xc420014050)
registrator_1  | 	/usr/lib/go/src/runtime/panic.go:500 +0x1a5
registrator_1  | github.com/gliderlabs/registrator/bridge.(*Bridge).registerSwarmService(0xc4201b4dd0, 0xc420325de0, 0x19, 0xd49fd6, 0xed1611146, 0x11f6d6fc, 0x55f4c7f9efe0, 0xed1611146, 0x1243d25d, 0x55f4c7f9efe0, ...)
registrator_1  | 	/go/src/github.com/gliderlabs/registrator/bridge/bridge.go:641 +0x152
registrator_1  | github.com/gliderlabs/registrator/bridge.(*Bridge).syncSwarmVipServices(0xc4201b4dd0)
registrator_1  | 	/go/src/github.com/gliderlabs/registrator/bridge/bridge.go:536 +0x2cb
registrator_1  | github.com/gliderlabs/registrator/bridge.(*Bridge).syncSwarmServices(0xc4201b4dd0)
registrator_1  | 	/go/src/github.com/gliderlabs/registrator/bridge/bridge.go:489 +0x3b
registrator_1  | github.com/gliderlabs/registrator/bridge.(*Bridge).Sync(0xc4201b4dd0, 0x0)
registrator_1  | 	/go/src/github.com/gliderlabs/registrator/bridge/bridge.go:144 +0x4fb
registrator_1  | main.main()
registrator_1  | 	/go/src/github.com/gliderlabs/registrator/registrator.go:163 +0x9ac

I'm not familiar with Go, so I don't know what the problem is. Can you fix it?

@marcuslinke (Author)

Thanks for your feedback @tuan3w. The issue seems to be related to docker remote API incompatibilities. I personally tested against docker 17.06.2-ce (remote API version v1.30) and it worked. So which version of the docker daemon do you use? As a workaround, could you try configuring -swarm-replicas-aware=false, please?

@marcuslinke (Author)

@tuan3w Please copy/paste the output of docker service inspect <SERVICE> here.

@tuan3w commented Oct 2, 2017

Hi @marcuslinke,
I deploy registrator as a docker container, not a service. My docker version is 17.06.0-ce.

@marcuslinke (Author)

Registrator being deployed as a container is fine, but it detects docker swarm vip services via the docker remote API. According to your logs it tries to inspect a vip service with ID 0aowu9zo26qk468q5vqs56l7b (ip2loc:8089):

registrator_1  | 2017/10/02 03:07:31 Syncing swarm mode vip services. Swarm control available: true
registrator_1  | 2017/10/02 03:07:31 added: swarm vip service ip2loc:8089
registrator_1  | 2017/10/02 03:07:31 registered 1 services for swarm service 0aowu9zo26qk468q5vqs56l7b 

It would be great if you could copy/paste the output of docker service inspect 0aowu9zo26qk468q5vqs56l7b. You may have to replace the service ID with the current one.

@tuan3w commented Oct 2, 2017

The output is as follows:

[
    {
        "ID": "0aowu9zo26qk468q5vqs56l7b",
        "Version": {
            "Index": 13931681
        },
        "CreatedAt": "2017-06-24T03:02:58.330790262Z",
        "UpdatedAt": "2017-09-29T15:26:59.80807029Z",
        "Spec": {
            "Name": "ip2loc",
            "Labels": {},
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "something/ip2loc:latest@sha256:177733283d4e9e205ca9e034676bcf398dd3fe1b073f25fab29e01b9c6d75d60",
                    "StopGracePeriod": 10000000000,
                    "DNSConfig": {}
                },
                "Resources": {},
                "RestartPolicy": {
                    "Condition": "any",
                    "Delay": 5000000000,
                    "MaxAttempts": 0
                },
                "Placement": {},
                "ForceUpdate": 0,
                "Runtime": "container"
            },
            "Mode": {
                "Replicated": {
                    "Replicas": 2
                }
            },
            "UpdateConfig": {
                "Parallelism": 1,
                "FailureAction": "pause",
                "Monitor": 5000000000,
                "MaxFailureRatio": 0,
                "Order": "stop-first"
            },
            "RollbackConfig": {
                "Parallelism": 1,
                "FailureAction": "pause",
                "Monitor": 5000000000,
                "MaxFailureRatio": 0,
                "Order": "stop-first"
            },
            "EndpointSpec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 8089,
                        "PublishedPort": 8089,
                        "PublishMode": "ingress"
                    }
                ]
            }
        },
        "PreviousSpec": {
            "Name": "ip2loc",
            "Labels": {},
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "something/ip2loc:latest@sha256:177733283d4e9e205ca9e034676bcf398dd3fe1b073f25fab29e01b9c6d75d60"
                },
                "Placement": {},
                "ForceUpdate": 0,
                "Runtime": "container"
            },
            "Mode": {
                "Replicated": {
                    "Replicas": 1
                }
            },
            "UpdateConfig": {
                "Parallelism": 1,
                "FailureAction": "pause",
                "MaxFailureRatio": 0,
                "Order": "stop-first"
            },
            "EndpointSpec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 8089,
                        "PublishedPort": 8089,
                        "PublishMode": "ingress"
                    }
                ]
            }
        },
        "Endpoint": {
            "Spec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 8089,
                        "PublishedPort": 8089,
                        "PublishMode": "ingress"
                    }
                ]
            },
            "Ports": [
                {
                    "Protocol": "tcp",
                    "TargetPort": 8089,
                    "PublishedPort": 8089,
                    "PublishMode": "ingress"
                }
            ],
            "VirtualIPs": [
                {
                    "NetworkID": "t9uds2hh4tbbi8n95vh64832k",
                    "Addr": "10.255.0.2/16"
                }
            ]
        }
    }
]

@marcuslinke (Author)

Thanks @tuan3w. The output looks okay to me, so right now I don't know what the problem is. Do you use the latest image pushed (7f39f8786cf7)? What happens if you configure -swarm-replicas-aware=false?

@tuan3w commented Oct 2, 2017

Yes, I'm using the latest image. If I use your configuration, the exception is not thrown. However, it doesn't recognize the service, in either global or replicated mode.

@marcuslinke (Author)

So no services are registered in the backend? Hmmm, could you post your registrator logs, please? By the way, which registry backend do you use? I've tested with consul here.

And how do you deploy these services? Do you use a .yml file together with docker stack deploy that you could post here? Or do you use docker service create? Maybe there are some subtle differences when deploying via docker service create. Hopefully I'll be able to reproduce the problem then.

@tuan3w commented Oct 2, 2017

something like this:

registrator_1  | 2017/10/02 15:37:02 Syncing swarm mode vip services. Swarm control available: true
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service ip2loc:8089
registrator_1  | 2017/10/02 15:37:02 registered 1 services for swarm service 0aowu9zo26qk468q5vqs56l7b 
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service viz:12000
registrator_1  | 2017/10/02 15:37:02 registered 1 services for swarm service 0o0uhmehpb5glo327surgtg6h 
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service notifygrafana:20000
registrator_1  | 2017/10/02 15:37:02 registered 1 services for swarm service ajl5894vydqk6e49uvkht4jld 
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service nginx3:24001
registrator_1  | 2017/10/02 15:37:02 registered 1 services for swarm service brqmnsz872zjlafvhd3627fh1 
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service monitor_grafana:28000
registrator_1  | 2017/10/02 15:37:02 registered 1 services for swarm service d3c6u6wwjo31li2qpniwzefae 
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service cross-device-server:11000
registrator_1  | 2017/10/02 15:37:02 registered 1 services for swarm service obrf6cskei8dofslrnoehgfm2 
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service abtesting:10080
registrator_1  | 2017/10/02 15:37:02 registered 1 services for swarm service ohe38k50ft9uiwrsg0746p3as 
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service monitor_influx:8086
registrator_1  | 2017/10/02 15:37:02 registered 1 services for swarm service qolccb4mjvr5obbd9955megw4 
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service consul_cluster:8500
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service consul_cluster:8600
registrator_1  | 2017/10/02 15:37:02 registered 2 services for swarm service tpk3vx2t4do90zcindu1052rm 
registrator_1  | 2017/10/02 15:37:02 added: swarm vip service abtsync:15000
registrator_1  | 2017/10/02 15:37:02 registered 1 services for swarm service z1wt81y1wwxnd3uff9nn1e2mn 
registrator_1  | 2017/10/02 15:37:08 event: container create 639f345241d8087ea2a14d186df29513fdffdeff484679198dbb11f18fa79956
registrator_1  | 2017/10/02 15:37:08 event: container attach 639f345241d8087ea2a14d186df29513fdffdeff484679198dbb11f18fa79956
registrator_1  | 2017/10/02 15:37:08 event: container exec_create: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:37:09 event: container 639f345241d8087ea2a14d186df29513fdffdeff484679198dbb11f18fa79956 started
registrator_1  | 2017/10/02 15:37:09 added: 639f345241d8 srv34183:frosty_bassi:80
registrator_1  | 2017/10/02 15:37:09 added: 639f345241d8 srv34183:frosty_bassi:80
registrator_1  | 2017/10/02 15:37:32 event: container 639f345241d8087ea2a14d186df29513fdffdeff484679198dbb11f18fa79956 killed
registrator_1  | 2017/10/02 15:37:32 removed: 639f345241d8 srv34183:frosty_bassi:80
registrator_1  | 2017/10/02 15:37:32 removed: 639f345241d8 srv34183:frosty_bassi:80
registrator_1  | 2017/10/02 15:37:32 event: container 639f345241d8087ea2a14d186df29513fdffdeff484679198dbb11f18fa79956 died
registrator_1  | 2017/10/02 15:37:32 event: container destroy 639f345241d8087ea2a14d186df29513fdffdeff484679198dbb11f18fa79956
registrator_1  | 2017/10/02 15:37:37 event: container create 0e1201cad3adef79b2739f0107217028948c30d31d23630277ed157c57a5c631
registrator_1  | 2017/10/02 15:37:37 event: container attach 0e1201cad3adef79b2739f0107217028948c30d31d23630277ed157c57a5c631
registrator_1  | 2017/10/02 15:37:38 event: container 0e1201cad3adef79b2739f0107217028948c30d31d23630277ed157c57a5c631 started
registrator_1  | 2017/10/02 15:37:38 added: 0e1201cad3ad srv34183:angry_pare:80
registrator_1  | 2017/10/02 15:37:38 added: 0e1201cad3ad srv34183:angry_pare:80
registrator_1  | 2017/10/02 15:37:38 event: container exec_create: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:37:57 event: container 0e1201cad3adef79b2739f0107217028948c30d31d23630277ed157c57a5c631 killed
registrator_1  | 2017/10/02 15:37:57 removed: 0e1201cad3ad srv34183:angry_pare:80
registrator_1  | 2017/10/02 15:37:57 removed: 0e1201cad3ad srv34183:angry_pare:80
registrator_1  | 2017/10/02 15:37:57 event: container 0e1201cad3adef79b2739f0107217028948c30d31d23630277ed157c57a5c631 died
registrator_1  | 2017/10/02 15:37:58 event: container destroy 0e1201cad3adef79b2739f0107217028948c30d31d23630277ed157c57a5c631
registrator_1  | 2017/10/02 15:38:08 event: container exec_create: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:38:08 event: container exec_start: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:38:39 event: container exec_create: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:38:39 event: container exec_start: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:39:09 event: container exec_create: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:39:39 event: container exec_create: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:40:09 event: container exec_create: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:40:09 event: container create e0ac474dca88f74c48df16ec3ddcc91689f1c32cfdd6580fb3dfc5d419161c02
registrator_1  | 2017/10/02 15:40:10 event: container e0ac474dca88f74c48df16ec3ddcc91689f1c32cfdd6580fb3dfc5d419161c02 started
registrator_1  | 2017/10/02 15:46:18 ignored: 47f52581d595 port 80 not published on host
registrator_1  | 2017/10/02 15:46:18 ignored: 47f52581d595 port 80 not published on host
registrator_1  | 2017/10/02 15:46:40 event: container exec_create: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:47:10 event: container exec_create: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
registrator_1  | 2017/10/02 15:47:40 event: container exec_create: /bin/sh -c /usr/bin/healthcheck.sh fd08c614d05ca41e4cd6cb93a9283e5d75579db301afde302184ebd032dd2c43
re

The first two lines correspond to when I deploy the service using replicated mode; the rest are with global mode.
I also include my docker-compose.yml file:

version: "2"
services:
 registrator:
  restart: always
  network_mode: host
  image: "marcuslinke/registrator:2017-09-30"
  volumes:
    - /var/run/docker.sock:/tmp/docker.sock
  command: "-ip 10.3.34.183 -swarm-replicas-aware=false consul://10.3.34.183:8500"

@marcuslinke (Author) commented Oct 2, 2017

That's strange. In your first log there were the following lines:

registrator_1  | 2017/10/02 03:07:31 Syncing swarm mode vip services. Swarm control available: true
registrator_1  | 2017/10/02 03:07:31 added: swarm vip service ip2loc:8089
registrator_1  | 2017/10/02 03:07:31 registered 1 services for swarm service 0aowu9zo26qk468q5vqs56l7b 
registrator_1  | 2017/10/02 03:07:31 added: swarm vip service viz:12000
registrator_1  | 2017/10/02 03:07:31 registered 1 services for swarm service 0o0uhmehpb5glo327surgtg6h 

You should see this also when you configure -swarm-replicas-aware=false. Besides this, I need the exact docker service create command for your global/replicated service. Thanks!

@tuan3w commented Oct 2, 2017

I updated the log output a little more. @marcuslinke, you can check it now.

@marcuslinke (Author)

The first 22 lines of the log say that a bunch of vip services were registered. And none of them appear in the consul catalog after that? That's very strange! Which version of consul do you use?

@marcuslinke (Author)

Okay. I probably found the reason for the exception you mentioned above. I will push a new docker image soon.

@tuan3w commented Oct 2, 2017

Hi @marcuslinke,
I tested your new image, but it seems it didn't work. If I create a service in global mode, it only recognizes the instances that run on manager nodes:

[
    {
        "Node": "srv34183",
        "Address": "10.3.34.183",
        "ServiceID": "swarm-vip-svc:hxsrbexcf8ho0ixo0uophpq57:srv34183:80",
        "ServiceName": "nginx4-80",
        "ServiceTags": [
            "vip-outside"
        ],
        "ServiceAddress": "10.3.34.183",
        "ServicePort": 24003
    },
    {
        "Node": "srv34183",
        "Address": "10.3.34.183",
        "ServiceID": "swarm-vip-svc:hxsrbexcf8ho0ixo0uophpq57:srv34184:80",
        "ServiceName": "nginx4-80",
        "ServiceTags": [
            "vip-outside"
        ],
        "ServiceAddress": "10.3.34.184",
        "ServicePort": 24003
    }
]

If I test with replicated mode, it doesn't recognize any instance at all. The logs of
Out of curiosity, did you use the docker API to get the addresses of all instances corresponding to a service, something like this?
My consul version is 0.9.3.

@marcuslinke (Author) commented Oct 3, 2017

@tuan3w Registration on manager nodes only is a current architectural limitation. Maybe we can fix it sometime, but for now registrator (and especially its registry backend implementations for etcd, zookeeper, consul) doesn't allow cross-node registrations. Each registrator instance can only register services that are running on its own node. Combined with the fact that on docker swarm worker nodes the relevant endpoints of the docker remote API (like /services) are not available, it's not possible at the moment to register vip services on worker nodes. See the initial post of this conversation.
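
(For reference, this "swarm control available" distinction can be checked on a node directly; a rough sketch, assuming a standard docker CLI:)

# prints "true" on a manager and "false" on a worker
docker info --format '{{ .Swarm.ControlAvailable }}'

# the /services endpoints are only served by managers, so this fails on a worker
docker service ls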

However, the current implementation should nevertheless register replicated services on all manager nodes, which in your case seems not to work. Please provide an example of how you create such a replicated service, along with the corresponding docker service inspect ... dump. Thanks.

@rosebude commented Oct 7, 2017

I am very pleased to test this version. This is an option the registrator has been lacking.

@a0s commented Oct 7, 2017

This version of the registrator works well for me. Thank you!

@marcuslinke (Author)

@a0s Nice to hear that it works for you. But please be aware that this version is not tested against different docker versions/environments. It seems @tuan3w had some problems with it. It really needs more testing. @rosebude Looking forward to your test results!

@tuan3w commented Oct 7, 2017

I implemented my own swarm service registrator. Maybe it can help others.

@rosebude commented Oct 7, 2017

With Docker version 17.09.0-ce.

I'm using the service deploy mode like this:

version: "3.3"
services:
registrator:
image: "marcuslinke/registrator:2017-09-30"
deploy:
mode: global
volumes:
- /var/run/docker.sock:/tmp/docker.sock
command: "-cleanup -swarm-replicas-aware=false consul://192.168.1.66:8500"

1- I'm obliged to specify -swarm-replicas-aware=false, as tuan3w commented 6 days ago.

2- I use registrator as a service in 'mode: global' to register all services on every node. But I don't like specifying the same IP for every registrator... Before, in container mode, I used localhost, but registrator in swarm mode doesn't work with the local consul address.

3- When I use a service update constraint to move a container... Registrator is updated but doesn't update the information in consul... No error in the docker service logs.

I'll continue next week...

Thank you.

@marcuslinke (Author)

@rosebude To be clear: I don't deploy registrator itself as a global swarm service. Registrator is deployed together with the consul agent as containers on each node, because the consul agent usually needs to be configured with the node's IP address (via the -bind option) and node name (-node option). So as the consul agent can't be deployed as a global swarm service and needs to be deployed on each node "manually", registrator shouldn't be either, as it depends on the consul agent.
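
A rough per-node sketch of that pairing (node IP, node name and the consul server address are placeholders):

# consul agent bound to this node
docker run -d --name consul-agent --net=host consul:0.9.3 agent \
  -bind=<NODE_IP> -node=<NODE_NAME> -retry-join=<CONSUL_SERVER>

# registrator talking to the local agent
docker run -d --name registrator --net=host \
  -v /var/run/docker.sock:/tmp/docker.sock \
  marcuslinke/registrator:2017-09-30 \
  -ip <NODE_IP> consul://localhost:8500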

@jonbrohauge

I deploy my consul as a service, using some Go template magic to figure out the IP address needed for -bind:

docker service create --name consul-cluster --hostname "consul-cluster-{{ .Node.ID }}" --network consul --network proxy --env "SERVICE_TAGS=consul-cluster" --env "SERVICE_NAME=consul" --constraint 'node.role==manager' --label com.df.notify=true --label com.df.distribute=true --label com.df.serviceDomain=consul.mynetwork.local --label com.df.port=8500 --replicas 3 consul agent -server -ui -retry-join=consul-seed:8301 -retry-join=consul-cluster:8300 -client=0.0.0.0 -bind='{{ GetAllInterfaces | include "network" "10.0.2.0/24" | include "type" "ipv4" | attr "address" }}' --datacenter mynetwork

Explanatory details:

--hostname - so I can actually get something readable in the hostname of the container
--network=consul - the overlay network we have set up so our consul-cluster can gossip over a network without disturbing anything else. In this case the network is 10.0.2.0/24
--network=proxy - an overlay network we have set up so we can avoid having to expose a bunch of ports
--label com.df.* - proxy-related stuff to allow for proper routing through the proxy
-retry-join=consul-seed:8301 - because leader election sometimes goes wrong for our three-node cluster, we prime it with a single consul seed server

Basically, using -internal we get tons of services registered, maybe too many, and probably not all the correct ones 😁. I haven't deployed registrator as a swarm service yet, primarily due to the --net=host constraint, which does not work when deploying as a swarm service. I haven't tried to deploy "your" fork as a swarm service yet.

@jonbrohauge

@marcuslinke Regarding the proxy, we use Docker Flow Proxy.
We are bound to use our company AD/DNS, so we have no need to use consul as a DNS service (at the moment, anyway).

@a0s commented Oct 14, 2017

BTW, about using registrator + consul + docker-flow-proxy: in this scheme (for me) consul is only a DNS resolver, the port is the default (80/443) and is configured by docker-flow-proxy. So I used a fake port without exposing any real port:

services:
  my_service:
    environment:
      SERVICE_12099_NAME: "my_service_${UNIQ_NAME}"
    ports:
      - "12099"
    networks:
      - proxy
    deploy:
      labels:
        - com.df.serviceDomain="my-service-${UNIQ_NAME}.service.consul"

After that http://my-service-${UNIQ_NAME}.service.consul is working in a browser.

Marcus Linke added 5 commits October 16, 2017 09:05
This is useful only when using registrator together with plain docker
hosts (no swarm mode).
Service information seems to be replicated asynchronously to each swarm node.
So there are situations where service events reference services that are
not yet replicated to the node and therefore can't be inspected
instantly.
@jonbrohauge commented Nov 6, 2017

Tried again with a few changes. The main changes are:

  • Network option encrypted=true
  • consul version 1.0.0
  • Getting the proper IP address using GetPrivateInterfaces in the template instead of GetAllInterfaces. This is because I kept getting hold of the loopback adapter
  • The --label com.df.* are the same
  • Using --env "SERVICE_IGNORE=true" to have registrator ignore it
  • Using your latest Docker image version 2017-10-25

Create Overlay Network:

docker network create --driver overlay --opt encrypted=true consul

Create Consul Seed server

docker service create --name consul-seed --hostname consul-seed -p 8301:8300 --network consul --env "SERVICE_IGNORE=true" --constraint 'node.role==worker' consul:1.0.0 agent -server -bootstrap-expect=3 -retry-join=consul-seed:8301 -retry-join=consul-server:8300 -bind='{{ GetPrivateInterfaces | include "network" "10.0.1.0/24" | include "type" "ipv4" | attr "address" }}' -client=0.0.0.0

Create Consul server agents on manager nodes

docker service create --name consul-server --hostname "consul-server-{{ .Node.ID }}" --network consul --network proxy --env "SERVICE_IGNORE=true" --constraint 'node.role==manager' --replicas 3 consul:1.0.0 agent -server -retry-join=consul-seed:8301 -retry-join=consul-server:8300 -bind='{{ GetPrivateInterfaces | include "network" "10.0.1.0/24" | include "type" "ipv4" | attr "address" }}'

Create Consul agents on worker nodes

docker service create --name consul-client --hostname "consul-client-{{ .Node.ID }}" --network consul --network proxy --env "SERVICE_IGNORE=true" --constraint 'node.role==worker' --label com.df.notify=true --label com.df.distribute=true --label com.df.serviceDomain=consul.cde.domain.local --label com.df.port=8500 consul:1.0.0 agent -ui -retry-join=consul-server -client='{{ GetPrivateInterfaces | include "network" "10.0.0.0/24" | include "type" "ipv4" | attr "address" }}' -bind='{{ GetPrivateInterfaces | include "network" "10.0.1.0/24" | include "type" "ipv4" | attr "address" }}'

Create registrator service on all nodes

docker service create --name=registrator --mode=global --network=proxy --mount type=bind,src=/var/run/docker.sock,dst=/tmp/docker.sock marcuslinke/registrator:2017-10-25 -cleanup -deregister=always -resync 60 -internal=true -swarm-replicas-aware=true consul://consul.cde.domain.local

Once the Consul Server agents are running with Quorum, remove Consul Seed.
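
For example, once the consul-server replicas are up and have elected a leader:

docker service rm consul-seed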

@marcuslinke (Author)

For the record: I use the following compose file to deploy these services via docker stack deploy:

version: '3.3'

services:
  consul-agent:
    image: consul:0.9.3
    networks:
      - outside
    environment:
     - SERVICE_IGNORE=true
     - CONSUL_BIND_INTERFACE=eth0
     - CONSUL_CLIENT_ADDRESS=0.0.0.0
    volumes: 
     - consul-agent:/consul/data
    deploy:
      mode: global
    command:
     - agent
     - -join=consul.service  

  registrator:
    image: marcuslinke/registrator:2017-10-25
    networks:
      - outside
    volumes: 
     - /var/run/docker.sock:/tmp/docker.sock
    deploy:
      mode: global
    command:
     - -cleanup
     - -resync=10
     - -swarm-manager-servicename=swarm
     - consul://localhost:8500 

volumes:
  consul-agent: 

networks:
  outside:
    external:
      name: "host"

@jonbrohauge

Taken from your registrator service definition:
--- SNIP ---
- -swarm-manager-servicename=swarm
--- SNIP ---
What does this do?

@marcuslinke (Author) commented Nov 24, 2017

When this option is set, registrator registers the swarm manager service (port 2376) within consul under the given service name. This is useful when using consul's DNS capabilities to resolve a swarm manager IP from a common service name.

To administrate a swarm you need to connect to one of the manager nodes. You can't "hardcode" this node name in your (deployment) scripts because a node may be down for maintenance, for example. This way our scripts can dynamically resolve a running manager node to connect to.
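
For example, with consul's DNS interface on its default port 8600, a deployment script might resolve a manager roughly like this (TLS setup for the daemon port left out for brevity; the service name matches -swarm-manager-servicename=swarm above):

# pick one healthy manager via consul DNS
MANAGER=$(dig +short @127.0.0.1 -p 8600 swarm.service.consul | head -n1)

# talk to that manager's remote API
docker -H tcp://$MANAGER:2376 node ls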

@prianna commented Dec 5, 2017

Related: #596

Allows you to use the IP assigned by the namespaced ingress network for the app being deployed to Swarm. Tested that PR w/ Consul - it has been working pretty well for me.

The setup is a multi-node Swarm cluster with Replicated/Consul on each host, deployed to workers/managers as part of the same Docker stack.

@a0s commented Feb 4, 2018

I have 3 swarm nodes, and a service started with --replicas 1 in ingress mode. Using --env SERVICE_443_IGNORE=true removes (from consul) only the node where the service is currently running, not the other two nodes.

docker service create \
    --detach \
    --name registrator \
    --network host \
    --mount type=bind,source=/var/run/docker.sock,destination=/tmp/docker.sock \
    --mode global \
    marcuslinke/registrator:2017-10-25 \
    -cleanup=true \
    -ttl=120 \
    -ttl-refresh=55 \
    -resync=30 \
    -swarm-manager-servicename=swarm \
    consul://localhost:8500

@marcuslinke (Author) commented Feb 4, 2018

@a0s Could you provide an example of how you started the service with --replicas 1 so I can reproduce it, please? As I understand it, you updated the already-started service with --env SERVICE_443_IGNORE=true, right?

@a0s commented Feb 4, 2018

docker service create \
    --detach \
    --name proxy \
    --network proxy \
    --publish published=80,target=80 \
    --publish published=443,target=443 \
    --env LISTENER_ADDRESS=swarm-listener \
    --env STATS_USER=stats \
    --env STATS_PASS=stats \
    --env SERVICE_443_IGNORE=true \
    --replicas 1 \
    vfarcic/docker-flow-proxy

@genebean

This is working great for me! That said, I do wish the vip registration would honor the SERVICE_NAME environment variable on swarm.

Currently it seems to honor SERVICE_<port>_NAME on both swarm and standalone hosts, but the variant without the port number only works on standalone hosts.
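
So as a workaround, only the per-port variable is reliable on swarm at the moment; a quick sketch with a hypothetical service:

# honored on swarm and on standalone hosts
docker service create --name web --publish 80:80 --env "SERVICE_80_NAME=myapp" nginx:alpine

# honored on standalone hosts only (per the comment above)
docker run -d -p 80:80 -e "SERVICE_NAME=myapp" nginx:alpine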

@josegonzalez (Member)

@marcuslinke mind updating the Dockerfile to use dep for dependency management? Feel free to remove my merge commit from this PR.

@jstewart612

Ping pong. Any updates, @marcuslinke ?

@marcuslinke (Author)

Sorry. I totally missed this. Currently I'm a bit busy and don't have time to maintain this PR. @josegonzalez Is there anything I can do to get this merged?

Marcus Linke added 2 commits March 27, 2019 20:06
@marcuslinke (Author)

Just pushed a new image version 'marcuslinke/registrator:2019-03-28' that solves some issues.

@brunocascio

Hey folks! @marcuslinke are you still using it?

I'm trying to integrate it with traefik:

anotherone:
    image: nginx:alpine
    ports:
      - "80"
    environment:
      - SERVICE_TAGS=traefik.enable=true,traefik.http.routers.app_anotherone.rule=PathPrefix(`/anotherone`),traefik.http.routers.app_anotherone.service=app_anotherone
      - SERVICE_NAME=app_anotherone
    deploy:
      replicas: 6
      placement:
        preferences:
          - spread: node.id
      update_config:
        parallelism: 0

It's registered in consul, but for some reason the SERVICE_TAGS are not added, so in the end they are not configured in traefik (using the consul catalog provider).

@brunocascio

anotherone:
    image: nginx:alpine
    ports:
      - "80"
    environment:
      - SERVICE_TAGS=traefik.enable=true,traefik.http.routers.app_anotherone.rule=PathPrefix(`/anotherone`),traefik.http.routers.app_anotherone.service=app_anotherone
      - SERVICE_NAME=app_anotherone
    deploy:
      replicas: 6
      placement:
        preferences:
          - spread: node.id
      update_config:
        parallelism: 0

I fixed it in this way:

anotherone:
    image: nginx:alpine
    ports:
      - "80"
    deploy:
      replicas: 6
      placement:
        preferences:
          - spread: node.id
      update_config:
        parallelism: 0
      labels:
       - SERVICE_80_TAGS=traefik.enable=true,traefik.http.routers.app_anotherone.rule=PathPrefix(`/anotherone`),traefik.http.routers.app_anotherone.service=app_anotherone
       - SERVICE_NAME=app_anotherone
