[TriLUG] Disk space calculations in Linux

Owen oberry at trilug.org
Fri Jan 25 10:00:14 EST 2008


On Fri, Jan 25, 2008 at 12:42:32PM +1100, Jeremy Portzer wrote:
> William Sutton wrote:
> > While we're discussing... how much space gets wasted in overhead of files 
> > that allocate a particular block size but don't use all of the blocks?
> 
> I think you mean, files that don't use all the bytes in a block.
> 
> That is an important difference - du - disk usage - will list the actual 
> disk usage.  The output of du will always be in increments of the file 
> system block size (I'm not quite sure exactly how this is determined, 
> but in most of my ext3 filesystems, this unit seems to be 4096 bytes, 
> determined by running "dumpe2fs" - there may be a simpler way to show this).
> 
> For example, the following five files have these sizes shown by ls:
> 
> $ ls -l dump*
> -rw-r--r--    1 root     root            0 Jan 24 20:40 dump0.txt
> -rw-r--r--    1 root     root            1 Jan 24 20:40 dump1.txt
> -rw-r--r--    1 root     root       120176 Jan 24 20:34 dump-hda6.txt
> -rw-r--r--    1 root     root        77492 Jan 24 20:34 dump-hdc1.txt
> -rw-r--r--    1 root     root        40910 Jan 24 20:33 dump.txt
> 
> But du shows this:
> 
> $ du -b dump*
> 0       dump0.txt
> 4096    dump1.txt
> 126976  dump-hda6.txt
> 81920   dump-hdc1.txt
> 40960   dump.txt
> 
> Notice that a zero-byte file takes zero space on disk, but a 1-byte file 
> takes 4096 bytes on disk, and every other file uses a multiple of 4096.
> 
> For this reason, when you care about the actual space on disk, you 
> should use "du" and not "ls".  This difference normally doesn't amount 
> to much, but it can if you have lots of very small files.
> 
> Not sure if this answers a question anyone asked.  :-)
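
The rounding Jeremy describes can be sketched with shell arithmetic.
This is only an illustration: round_up is a made-up helper name, and
4096 is the block size he reported for his ext3 filesystems, so adjust
it for yours.

```shell
#!/bin/sh
# round_up is a hypothetical helper, not a standard tool.
# It rounds a byte count ($1) up to the next multiple of the
# block size ($2), using integer arithmetic only.
round_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# The five sizes that ls reported above, assuming 4096-byte blocks.
for size in 0 1 120176 77492 40910; do
    echo "$size -> $(round_up "$size" 4096)"
done
```

For the five sizes above this gives 0, 4096, 122880, 77824 and 40960.
The two larger files come out exactly one block lower than the du
output, which is consistent with ext3 charging one extra indirect
block once a file needs more than 12 data blocks, so plain rounding
is a lower bound rather than an exact prediction.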

Thanks Jeremy. Something else that is relevant to this subject is sparse
files.  As per the example from Wikipedia:

$ dd if=/dev/zero of=sparse-file bs=1 count=1 seek=1M
$ ls -lh sparse-file
-rw------- 1 oberry oberry 1.1M 2008-01-25 09:55 sparse-file
$ du -sh sparse-file
5.0K    sparse-file

The file is about 1M in size but only occupies 5k of disk space. The most
dramatic real-life example of this I've seen is with my rtorrent client
... the initial file size may be a few hundred MB, but disk usage is only
a few MB, which gradually increases as the file downloads.
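
To check a single file without eyeballing ls and du side by side, the
two numbers can be pulled out of stat. A rough sketch, assuming GNU
coreutils (the -c format option is GNU-specific) and a filesystem that
supports sparse files; the temporary file is only there for the demo.

```shell
#!/bin/sh
# Create a scratch file with one byte written at offset 1 MiB,
# the same shape as the dd example above.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1 count=1 seek=1048576 2>/dev/null

apparent=$(stat -c %s "$f")      # apparent size in bytes (what ls -l shows)
blocks=$(stat -c %b "$f")        # number of blocks allocated to the file
unit=$(stat -c %B "$f")          # size in bytes of each block counted by %b
allocated=$(( blocks * unit ))   # bytes actually occupied (what du counts)

echo "apparent:  $apparent bytes"
echo "allocated: $allocated bytes"
if [ "$allocated" -lt "$apparent" ]; then
    echo "looks sparse"
fi

rm -f "$f"
```

The apparent size should be 1048577 bytes (the 1 MiB seek plus the one
byte written); the allocated figure depends on the filesystem.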

Reading:
* http://en.wikipedia.org/wiki/Sparse_file
* http://en.wikipedia.org/wiki/Comparison_of_file_systems

Owen



