[TriLUG] copying files

Joseph Mack NA3T jmack at wm7d.net
Tue Jun 19 22:15:42 EDT 2012


On Tue, 19 Jun 2012, Jeff Schornick wrote:

> On Tue, Jun 19, 2012 at 9:39 PM, Joseph Mack NA3T <jmack at wm7d.net> wrote:
>> I haven't used rsync. So after the initial phase, both ends know the files
>> at each end and when I add a new file at one end, rsync will notice and just
>> handle it?
>
> Not quite.
>
> On each synchronization run, rsync creates a local list from the
> source directory, while simultaneously creating the analogous list on
> the remote end.  This means if you have 1000 files, you may be looking
> at 1000 fstats on each end.  However, these checks are both done
> locally on the corresponding machines.  As long as the target system's
> local file I/O isn't significantly slower than the source machine's,
> you shouldn't be introducing any additional delay.
>
> After both lists have been generated, rsync uses a minimal amount of
> network traffic to compare the lists and generate a final list of
> which files need to be updated.  As expected, only those files are
> sent over the network.
>
> After the synchronization is complete, the generated lists get tossed
> out as dirty laundry.  There is no long running daemon which attempts
> to keep them up-to-date in realtime.  However, I imagine someone has
> created a slick piece of code using inotify to do just that.

OK, so I'd have to invoke rsync every 5 minutes. The list of
files at each end has to be assembled anyhow (e.g. with find),
and presumably 1000 fstats take the same time whether find
or rsync then processes the list. The problem then is
comparing the lists at each end.
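
To make the list comparison concrete, here is a rough sketch of the idea, not of what rsync actually does internally. It assumes GNU find (for -printf) and that file names contain no tabs or newlines. The function names list_files and files_to_send are made up for illustration, and comparing only on path and size is a simplifying assumption.

```shell
# list_files DIR
# Print "relative-path<TAB>size" for every regular file under DIR, sorted.
# (Assumes GNU find for -printf; hypothetical helper, not from the thread.)
list_files() {
    ( cd "$1" && find . -type f -printf '%P\t%s\n' | LC_ALL=C sort )
}

# files_to_send SRC_LIST DST_LIST
# Print the paths whose "path<TAB>size" line appears in SRC_LIST but not in
# DST_LIST, i.e. files that are missing on the destination or differ in size.
files_to_send() {
    LC_ALL=C comm -23 "$1" "$2" | cut -f1
}
```

Each end would run list_files locally, so only the two lists (or their difference) need to cross the network before any file data is sent.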

cp -auv is really slow

rsync you say is fast (and I believe you).

but I already have my list from `find`, so there's no extra 
cost if I use find.

The copy of the files takes the same time no matter which 
way I assembled the list of files to be copied.

So `find` followed by `cp --parents` or `cpio` seems to be 
it.
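
Something along these lines is what I have in mind. It's only a sketch, assuming GNU find, xargs and cp (for -print0, -r, --parents and -t). The function name copy_new and the stamp-file approach are my own illustration. DST and STAMP must be absolute paths because the function changes into SRC before running find, and STAMP must already exist.

```shell
# copy_new SRC DST STAMP
# Copy regular files under SRC that were modified more recently than the
# file STAMP into DST, recreating their relative directory paths, and then
# advance STAMP to the time this run started.
# (Hypothetical helper; DST and STAMP must be absolute paths.)
copy_new() {
    src=$1 dst=$2 stamp=$3

    # Record the start time first, so files that arrive while the copy is
    # running are picked up by the next run.
    next=$(mktemp) || return 1

    ( cd "$src" &&
      find . -type f -newer "$stamp" -print0 |
      xargs -0 -r cp --parents -p -t "$dst" ) || { rm -f "$next"; return 1; }

    # Give STAMP the recorded start time, then clean up.
    touch -r "$next" "$stamp"
    rm -f "$next"
}
```

Called every few minutes as, say, copy_new /data/out /mnt/backup /var/tmp/last_copy (paths invented here), it does no comparison with the far end at all; it trusts the timestamp, which is why it is cheap and also why it can miss a file.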

Alan points out the resilience of rsync. This is a good 
feature, but as it turns out (and I didn't say this), I 
don't mind losing an occasional file, but throughput is a 
high priority. The backup machine is writing files from many 
sources and it only has a few seconds to service a source 
machine, or it will fall over with the load.

Joe

-- 
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!


