Migrating data between ES clusters might seem like a simple thing -- after all, there are dedicated tools for that. Or one could use logstash with a simple config.
Things get a bit hairy, however, when the source and destination cluster versions differ wildly. Say, like 0.90.3 and 1.7.3. And when you don't happen to have any admin access to the source cluster (only via HTTP and transport interfaces). And when the source cluster is misconfigured in just a slight but annoying manner...
What did not work
elasticdump
ElasticDump was the first and obvious thing to try, but apparently it only supports migrations between clusters running ElasticSearch 1.2 and higher. So, that's a no-go.
logstash
Apparently one can use logstash to migrate data between clusters. Unfortunately this solution did not work, either.
What did work
Please keep in mind that this procedure worked for us, but it doesn't have to work for you. Specifically, if the source is a cluster of more than one node, you might need to do some fancy shard allocation to make sure that all shards and all indices are copied over to the new node.
1. Create a new node
Why not cluster with that source ES server (running a single-node cluster) by creating a new node that we do control, and thus get the data? Getting the docker container to run ES version 0.90.3 was just a bit of manual fiddling with the Dockerfile. Changing the versions everywhere worked well, but here's a hint: up until 1.0 or so, elasticsearch ran in the background by default. Thus, the docker container stopped immediately after starting elasticsearch, for no apparent reason...
So if you're dealing with a pre-1.0 ES, just add a -f (for "foreground") to the command in the Dockerfile to save yourself a bit of frustration.
2. Cluster with the source server
Once we have this running, it's time to cluster with the source node. What could be easier? Disable multicast discovery, enable unicast with a specified host and we're good, right?
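For reference, that naive setup amounts to a couple of lines in the new node's elasticsearch.yml (the host below is a placeholder; the cluster name has to match the source cluster's, otherwise the nodes won't join up):
cluster.name: <same_name_as_the_source_cluster>
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["<source_node_public_IP>:9300"]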
Wrong. Remember the "misconfigured in just a slight but annoying manner" thing? The source server turned out to be behind a NAT, and the IP we could connect to differed from the IP the server published. Hence, our new node discovered the source node as master, but then -- based on the info it got from it -- tried connecting to it via its internal (10.x.y.z) IP address. Which obviously did not work.
3. iptables to the rescue
As we had no way of changing the configuration of the source node, the only thing we could do was mangle the IP packets, so that packets going to 10.x.y.z would have their destination address rewritten to the public IP of the source node (and those coming back from the public IP of the source node would get their source address rewritten to 10.x.y.z -- but iptables handled that for us automagically):
iptables -t nat -I PREROUTING -d 10.x.y.z -j DNAT --to-destination <external_IP_of_the_source_node>
We love one-liners -- don't you?
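If you want to double-check that the rule actually landed in the NAT table, listing the chain will show it:
iptables -t nat -L PREROUTING -n --line-numbers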
4. Get the data
Once we had this working and confirmed that the cluster now had two nodes (the source node and our new node), we just sat back and watched the ~9GiB of data flow in. Once that was done, it was time to down the new node, disable discovery altogether, and up it again to verify that we now had the data in there, on a node that we actually control.
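The standard cluster health endpoint is enough for that kind of watching (assuming the new node listens on localhost:9200): number_of_nodes should read 2, and the shard counts show how far the recovery has got.
curl -s 'http://localhost:9200/_cluster/health?pretty'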
5. Upgrade
Full cluster restart upgrade is what we had to do next, but for our single-node cluster that's just a fancy way of saying:
- down the node;
- upgrade ES to whatever version needed (1.7.3 in our case);
- up the node;
- verify everything is AOK (see the check below).
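A quick curl to the node's root endpoint covers the last point -- version.number in the response should now say 1.7.3:
curl -s 'http://localhost:9200/?pretty'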
6. Migrate the data to the production node
Since we were doing all this on a new node, created only to get the data off of the source ES 0.90.3 server, we needed to shunt the data off of it and into our production server (changing the index name in the process for good measure). This is where we turned back to elasticdump, and using a simple script we were able to successfully migrate the data off of the new node and onto our production ES 1.7.3 server.
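The script boiled down to a couple of elasticdump invocations per index, roughly along these lines (hostnames and index names are placeholders; mappings go first, then the data):
elasticdump --input=http://<new_node>:9200/<old_index> --output=http://<production_node>:9200/<new_index> --type=mapping
elasticdump --input=http://<new_node>:9200/<old_index> --output=http://<production_node>:9200/<new_index> --type=data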
Of course, things could not go smooth and easy here, either. The dump kept being interrupted by an EADDRNOTAVAIL error; a quick work-around was to use the --skip command-line argument of elasticdump to skip rows that had already been migrated.
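So the retries ended up looking something like this, with the skip count bumped to however many rows had already made it across:
elasticdump --input=http://<new_node>:9200/<old_index> --output=http://<production_node>:9200/<new_index> --type=data --skip=<rows_already_migrated>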