Imagine that we have the following cluster:
root@ip-172-16-14-154:~# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
9e0rtzevshatwop54579tjnwx new-master Ready Active Leader 19.03.8
ddlmhk2i6eyi1s9d5dt32i2da * old-master-01 Ready Active Reachable 19.03.8
lxj15zgc75os1bjpcwd8who6m old-master-02 Ready Active Reachable 19.03.8
and we want to split it into two parts: the old cluster with the two manager nodes ddlmhk2i and lxj15zgc, and a new one on the node 9e0rtzev. After the split, both clusters should have the same list of services, and we will manually shut down the unused ones.
Stop the Docker daemon on the node that should be cut off from the cluster (new-master in our case).
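A minimal sketch of this step, assuming Docker is managed by systemd (as the systemctl start docker call later in this post suggests). The helper name stop_docker_and_backup and the backup copy of the swarm directory are additions for illustration, not part of the original procedure:

```shell
# Hypothetical helper: stop the daemon, then keep a copy of the swarm state
# directory so the manual edits that follow can be rolled back.
# $1 is the swarm state directory (defaults to /var/lib/docker/swarm).
stop_docker_and_backup() {
  systemctl stop docker || return 1
  cp -a "${1:-/var/lib/docker/swarm}" "${1:-/var/lib/docker/swarm}.bak"
}

# Usage, as root on new-master (172.16.14.14):
#   stop_docker_and_backup
```

The copy is taken only after the daemon has stopped, so the files are not changing while they are being copied.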
At this point we can remove this node from our old cluster.
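One way to do the removal, sketched under the assumption that it is run on one of the remaining managers (old-master-01 or old-master-02). A manager has to be demoted to a worker before docker node rm will accept it; the helper name remove_manager is hypothetical:

```shell
# Hypothetical helper: demote a stopped manager to a worker, then remove it.
# $1 is the node's hostname or ID as shown by `docker node ls`.
remove_manager() {
  docker node demote "$1" && docker node rm "$1"
}

# Usage, on old-master-01:
#   remove_manager new-master
```

The two remaining managers still form a majority of the original three, so the old cluster keeps its quorum while the node is being removed.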
Now it is time for some hacking. Let's edit two files on the new-master node. The first one is /var/lib/docker/swarm/state.json:
[
{"node_id":"9e0rtzevshatwop54579tjnwx","addr":"172.16.14.14:2377"},
{"node_id":"ddlmhk2i6eyi1s9d5dt32i2da","addr":"172.16.14.154:2377"},
{"node_id":"lxj15zgc75os1bjpcwd8who6m","addr":"172.16.14.247:2377"}
]
We should remove the two old nodes from this file, so that only the entry for new-master remains.
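With the two old entries gone, the file holds a single-element list with new-master's own ID and address. A minimal sketch that writes such a file; the helper name write_single_node_state is hypothetical, and the single-line layout is an assumption (the listing above spreads the entries over several lines):

```shell
# Hypothetical helper: overwrite a state.json-style file so that it lists
# exactly one manager.
# $1 = file path, $2 = node ID, $3 = address in host:port form.
write_single_node_state() {
  printf '[{"node_id":"%s","addr":"%s"}]\n' "$2" "$3" > "$1"
}

# Usage, on new-master while Docker is stopped:
#   write_single_node_state /var/lib/docker/swarm/state.json \
#     9e0rtzevshatwop54579tjnwx 172.16.14.14:2377
```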
The other file is /var/lib/docker/swarm/docker-state.json:
{
"LocalAddr":"",
"RemoteAddr":"172.16.14.154:2377",
"ListenAddr":"0.0.0.0:2377",
"AdvertiseAddr":"",
"DataPathAddr":"",
"DefaultAddressPool":null,
"SubnetSize":0,
"DataPathPort":0,
"JoinInProgress":false
}
The contents may differ on your node, but our task is simply to clear the RemoteAddr value and write this node's IP into LocalAddr, something like this:
{
"LocalAddr":"172.16.14.14",
"RemoteAddr":"",
"ListenAddr":"0.0.0.0:2377",
"AdvertiseAddr":"",
"DataPathAddr":"",
"DefaultAddressPool":null,
"SubnetSize":0,
"DataPathPort":0,
"JoinInProgress":false
}
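The same edit can be scripted. The sketch below assumes the compact "key":"value" layout shown in the listings (no spaces around the colon); the helper name set_local_addr and the .orig backup file are hypothetical additions:

```shell
# Hypothetical helper: clear RemoteAddr and set LocalAddr in a
# docker-state.json-style file, keeping the original next to it as FILE.orig.
# $1 = file path, $2 = this node's IP address.
set_local_addr() {
  cp -p "$1" "$1.orig" &&
  sed -e 's/"RemoteAddr":"[^"]*"/"RemoteAddr":""/' \
      -e "s/\"LocalAddr\":\"[^\"]*\"/\"LocalAddr\":\"$2\"/" \
      "$1.orig" > "$1"
}

# Usage, on new-master while Docker is stopped:
#   set_local_addr /var/lib/docker/swarm/docker-state.json 172.16.14.14
```

Writing the result back into the existing file, rather than moving a new file over it, keeps the file's owner and permissions unchanged.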
Now we can start the Docker daemon:
root@ip-172-16-14-14:~# systemctl start docker
root@ip-172-16-14-14:~# docker info
...
Swarm: pending
Error: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online.
Is Manager: true
Node Address: 172.16.14.14
Manager Addresses:
172.16.14.14:2377
We can see that this node no longer knows anything about the old nodes, but it still expects other managers in order to operate properly. We can solve this by forcing the node to start a new single-manager cluster from its current state.
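Docker's docker swarm init command has a --force-new-cluster flag for this situation: it re-creates a one-manager cluster from the node's existing state. A minimal sketch, assuming the node should advertise the same IP that was written into LocalAddr; the helper name force_new_cluster is hypothetical:

```shell
# Hypothetical helper: re-initialize the swarm on this node as a new
# single-manager cluster, keeping its current services and state.
# $1 = the address this node should advertise to future members.
force_new_cluster() {
  docker swarm init --force-new-cluster --advertise-addr "$1"
}

# Usage, on new-master with the daemon running:
#   force_new_cluster 172.16.14.14
```

Afterwards, docker node ls on new-master is a quick way to check that the node now reports itself as Leader.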
Now we have two clusters with the same state, and each of them can be operated independently. Just don't forget to remove the old nodes and unused services from both of them.