[TriLUG] Replicating a filesystem across datacenters

Igor Partola igor at igorpartola.com
Tue Nov 4 19:54:49 EST 2014


I have not worked with file systems or distributed block devices, but I have worked with WAN distributed databases. Even that narrow use case has enough depth to it that building a solution took quite a bit of research and custom code.

The basic problem is the speed of light. Going from DC to Dallas (that was my use case) meant a 30 ms RTT. Any write that has to be acknowledged by the other side waits at least that long, so effectively your disks are suddenly very slow. Add the fact that partitions do happen, and that the network performance is not stable, and I think you can rule out pretty much any block device setup. The code written to run over block devices assumes low, roughly constant latency and predictable failure modes. This is similar to how stuff randomly hangs when you have wonky networking and NFS.
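
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the worst case, where every write must wait one full round trip before the next one can start; the function names are made up for illustration, and only the 30 ms figure comes from my setup.

```python
def max_serial_ops_per_second(rtt_ms: float) -> float:
    """Upper bound on strictly sequential round trips per second."""
    return 1000.0 / rtt_ms


def total_wait_seconds(num_ops: int, rtt_ms: float) -> float:
    """Time spent just waiting on the network for num_ops sequential ops."""
    return num_ops * rtt_ms / 1000.0


RTT_MS = 30.0  # DC <-> Dallas round trip mentioned above

# About 33 dependent writes per second, no matter how fast the disks are.
print(f"{max_serial_ops_per_second(RTT_MS):.1f} serial ops/s")

# 10,000 small dependent writes spend 300 seconds on the wire alone.
print(f"{total_wait_seconds(10_000, RTT_MS):.0f} s for 10,000 ops")
```

Batching and pipelining help with throughput, but anything that needs a synchronous acknowledgement from the far side still pays the full round trip.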

Also, distributed block devices cannot be shared: the original use case here is that there are writers in both data centers. If anything, some magical file system might do this.

More realistically, I would look at what your applications are doing, see whether you can modify them, and if so add multi-master support there. This is a horribly un-UNIX-y solution, but it is the current best approach for a reason: magic bullets don't exist, so you have to create your own.
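
As one illustration of what application-level multi-master can look like, here is a minimal last-writer-wins sketch. This is not what I built, just one simple conflict rule among many. It assumes clocks at both sites are roughly in sync, and every name in it (Versioned, merge, merge_stores, the site labels) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: float  # writer's wall-clock time; assumes roughly synced clocks
    site: str         # hypothetical site label, used only to break ties


def merge(a: Versioned, b: Versioned) -> Versioned:
    """Pick a winner deterministically so both sites converge on it."""
    return max(a, b, key=lambda v: (v.timestamp, v.site))


def merge_stores(local: dict, remote: dict) -> dict:
    """Combine two key -> Versioned maps, resolving conflicts per key."""
    merged = dict(local)
    for key, theirs in remote.items():
        mine = merged.get(key)
        merged[key] = theirs if mine is None else merge(mine, theirs)
    return merged


dc = {"motd": Versioned("hello from DC", 100.0, "dc")}
dallas = {"motd": Versioned("hello from Dallas", 105.0, "dallas")}

# Whichever direction the exchange runs, both sides end up identical.
assert merge_stores(dc, dallas) == merge_stores(dallas, dc)
```

Note that the losing write is silently discarded. Whether that is acceptable depends entirely on the application, which is why this logic has to live there rather than in the storage layer.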

The only real alternative is hot standby. This is more reliable for the general case but much less exciting.

Igor

