hard links - Re: [TriLUG] mkfs vs mke2fs

David Rasch rasch at raschnet.com
Fri May 14 13:49:45 EDT 2004


On Fri, May 14, 2004 at 01:38:12PM -0400, Tanner Lovelace <lovelace at wayfarer.org> wrote:
> David Rasch said the following on 5/14/04 1:13 PM:
> >
> >With an inode -> filename hash, couldn't this be done linearly O(N) ?
> >Check each file against the hash looking for a file already using this
> >inode?
> 
> Theoretically, but the data structures could get tricky.  You'd need to
> sure the hashing function was large enough so that there weren't collisions.
> But, since the inode is a number, the perfect (as far as no collisions go)
> hashing function is the identity function.  But, then how do you store it.
> If you allocate an array for all possible values, then yes, you can do
> it O(N) by just checking that bucket but at the cost of a huge amount
> of memory.  With each bit you lower the hash function you decrease memory
> requirements but also increase the number of things you must check to
> avoid collisions.  It would be interesting to see the curve that
> results from this tradeoff.  You have to pay somewhere, be it memory or
> operations.

Well, collisions in hash functions aren't the kiss of death, and a
non-identity hashing functions could yield very good results.  Clearly
rsync knows something about hashes as it uses them for determining what
parts of the file need transfered. As for memory, having spent some time
modding rsync, I can tell you that one of rsync's main flaws is keeping
the entire file transfer list in memory, and storing the inode for each
file wouldn't be a big change.  Since this is all a theoretical
discussion, I was just pointing out that the algorithm is not
necessarily O(N^2) and could also be computed _before_ the file transfer
and the memory for the above hash could be then discarded.

David
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://www.trilug.org/pipermail/trilug/attachments/20040514/ae3be3e6/attachment.pgp>


More information about the TriLUG mailing list