[TriLUG] Disk space calculations in Linux

Scott G. Hall ScottGHall at BellSouth.Net
Fri Jan 25 15:25:48 EST 2008


<div class="moz-text-flowed" style="font-family: -moz-fixed">

Don't forget the '-s' option to ls -- it lists each file's allocated size
in blocks, similar to the way du does (use "ls -ls", or "ls -lsk" for
1 KiB blocks; with current GNU ls, '-k' on its own only selects the unit).
However, I noticed that for directories, 'ls -lk' lists them as 0 size,
but du correctly recognizes that directories are just special files
containing lists of other files, and take up space too -- you'll notice
that large directories need several blocks to hold a long list of files.
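
To see the difference between the size ls reports and the space actually
allocated, here is a minimal Python sketch. It assumes Linux, where
st_blocks is counted in 512-byte units; the helper name
apparent_and_allocated is made up for illustration, and the numbers printed
will vary by filesystem.

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent_bytes, allocated_bytes) for a file or directory."""
    st = os.lstat(path)
    # Assumption: Linux, where st_blocks is in 512-byte units
    # regardless of the filesystem's own block size.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        one_byte = os.path.join(d, "one-byte.txt")
        with open(one_byte, "wb") as fh:
            fh.write(b"x")
        # Compare a tiny regular file with the directory that holds it.
        for p in (one_byte, d):
            apparent, allocated = apparent_and_allocated(p)
            print(f"{p}: apparent={apparent} allocated={allocated}")
```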

But since space is allocated in disk allocation units (i.e., clusters for
some filesystem types), you can tweak the arguments to both ls and du by
adding the "-B nnnn" argument, where nnnn is the size in bytes of a disk
allocation unit for that filesystem type.
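
One way to find a value for nnnn is to ask the filesystem itself. This is
only a sketch: it assumes a POSIX system where os.statvfs is available, and
fs_block_size is a hypothetical helper name.

```python
import os


def fs_block_size(path="."):
    """Return the block size, in bytes, of the filesystem holding path."""
    vfs = os.statvfs(path)
    # f_frsize is the unit in which the filesystem's block counts are
    # expressed; fall back to f_bsize if it is reported as 0.
    return vfs.f_frsize or vfs.f_bsize


if __name__ == "__main__":
    print(fs_block_size("."))
```

The number printed could then be passed along, for example as "du -B 4096"
if it turns out to be 4096.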

Note that even with the du command and -B option, you still won't match
the output of the df command.  Remember that the overhead on disks
includes swap space partitions, space set aside for alternate sectors
that replace failing sectors, and the space required by the partition
table and boot loader.  Also remember that some filesystem types
support "sparse" files, which contain no data in big chunks of the file;
space is conserved by not allocating blocks for those parts of the
file that have no data.
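
The sparse-file effect is easy to reproduce. The sketch below assumes Linux
(st_blocks in 512-byte units) and a filesystem that supports holes; on one
that does not, the allocated figure will simply be close to the apparent
size. The names make_sparse and report are made up for this example.

```python
import os
import tempfile


def make_sparse(path, size):
    """Create a file with `size` apparent bytes without writing any data."""
    with open(path, "wb") as fh:
        # Extending with truncate() leaves a hole on filesystems
        # that support sparse files.
        fh.truncate(size)


def report(path):
    """Return (apparent_bytes, allocated_bytes), assuming 512-byte st_blocks."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "sparse.bin")
        make_sparse(p, 10 * 1024 * 1024)
        apparent, allocated = report(p)
        print(f"apparent={apparent} allocated={allocated}")
```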

Now mix that with persistent file-sharing data and transaction logging
of filesystem writes, and disk usage starts to become a little convoluted,
to say the least.

Jeremy Portzer wrote:
> William Sutton wrote:
>> While we're discussing... how much space gets wasted in overhead of 
>> files that allocate a particular block size but don't use all of the 
>> blocks?
>
> I think you mean, files that don't use all the bytes in a block.
>
> That is an important difference - du - disk usage - will list the 
> actual disk usage.  The output of du will always be in increments of 
> the file system block size (I'm not quite sure exactly how this is 
> determined, but in most of my ext3 filesystems, this unit seems to be 
> 4096 bytes, determined by running "dumpe2fs" - there may be a simpler way 
> to show this).
>
> For example, the following files have these sizes shown by ls:
>
> $ ls -l dump*
> -rw-r--r--    1 root     root            0 Jan 24 20:40 dump0.txt
> -rw-r--r--    1 root     root            1 Jan 24 20:40 dump1.txt
> -rw-r--r--    1 root     root       120176 Jan 24 20:34 dump-hda6.txt
> -rw-r--r--    1 root     root        77492 Jan 24 20:34 dump-hdc1.txt
> -rw-r--r--    1 root     root        40910 Jan 24 20:33 dump.txt
>
> But du shows this:
>
> $ du -b dump*
> 0       dump0.txt
> 4096    dump1.txt
> 126976  dump-hda6.txt
> 81920   dump-hdc1.txt
> 40960   dump.txt
>
> Notice that a zero-byte file takes zero space on disk, but a 1-byte 
> file takes 4096 bytes on disk, and all other files always use 
> increments of 4096.
>
> For this reason, when you care about the actual space on disk, you 
> should use "du" and not "ls".  This difference normally doesn't amount 
> to much, but it can if you have lots of very small files.
>
> Not sure if this answers a question anyone asked.  :-)
>
> --Jeremy
>
>
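
The du figures quoted above can be reproduced with a little arithmetic. Note
that the two larger files come out one block bigger than plain rounding up
to 4096 would give. A plausible explanation, and it is an assumption on my
part, is the classic ext2/ext3 block mapping: an inode holds 12 direct block
pointers, and a file needing more than 12 data blocks also needs one
indirect block. The sketch below models only that case; ext3_allocated is a
hypothetical name, and files past the single-indirect range are not
handled. The last column is the slack William asked about.

```python
BLOCK = 4096                 # block size reported in the quoted message
DIRECT_BLOCKS = 12           # direct block pointers in an ext2/ext3 inode
PTRS_PER_BLOCK = BLOCK // 4  # 4-byte block pointers per indirect block


def ext3_allocated(apparent):
    """Estimate allocated bytes for a file of `apparent` bytes.

    Assumption: classic ext3 block mapping, single-indirect range only.
    """
    data_blocks = -(-apparent // BLOCK)  # round up to whole blocks
    if data_blocks > DIRECT_BLOCKS + PTRS_PER_BLOCK:
        raise ValueError("beyond the single-indirect range; not modelled")
    indirect_blocks = 1 if data_blocks > DIRECT_BLOCKS else 0
    return (data_blocks + indirect_blocks) * BLOCK


if __name__ == "__main__":
    # File sizes taken from the ls listing in the quoted message.
    files = [
        ("dump0.txt", 0),
        ("dump1.txt", 1),
        ("dump-hda6.txt", 120176),
        ("dump-hdc1.txt", 77492),
        ("dump.txt", 40910),
    ]
    for name, size in files:
        allocated = ext3_allocated(size)
        print(f"{allocated:7d} {name:14s} slack={allocated - size}")
```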

</div>

-- 
Scott G. Hall
Raleigh, NC, USA
ScottGHall at BellSouth.Net



