[TriLUG] bash help

Jeremy Portzer jeremyp at pobox.com
Thu Oct 28 11:40:53 EDT 2004


Some good suggestions here.  I agree that writing this in Perl or Python
etc. would probably give you the most flexibility, but regardless, you
should look into the "find" program to locate the old files that you wish
to delete.  Find has a myriad of options for selecting files by age,
size, name, etc., that should do what you need.  Tip:  use the -print0
flag to output the filenames separated by NUL characters, and then pipe
that to "xargs -0" to do the actual deletion.  Using xargs is much faster
than the "-exec ... \;" stuff within find itself, because it runs one
command for many files instead of one command per file, and the NUL
separator eliminates problems with filenames that contain spaces.
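
A minimal sketch of that find/xargs pipeline, for illustration only.  It
assumes GNU find, xargs and touch; the function name purge_older_than, the
7-day cutoff and the throwaway demo directory are made up for the example
and are not from the original question.

```shell
#!/bin/sh
# Hypothetical helper: delete regular files under directory $1
# that were last modified more than $2 days ago.
purge_older_than() {
    # -print0 and -0 pass names NUL-separated, so spaces and newlines
    # in filenames are safe.  -r (GNU xargs) skips rm when nothing matched.
    find "$1" -type f -mtime +"$2" -print0 | xargs -0 -r rm -f --
}

# Demonstration in a scratch directory, not a real repository.
demo=$(mktemp -d)
touch -d '10 days ago' "$demo/old file.txt"   # GNU touch date syntax
touch "$demo/new file.txt"

purge_older_than "$demo" 7
ls "$demo"
```

Try it on a scratch directory first, and swap "rm -f --" for "echo" to
preview what would be removed.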

Also keep in mind that "du" takes quite a while to run, as it walks
through the whole directory structure.  If the files are on their own
partition, you can simply use "df" instead, which returns almost
instantly because it only asks the filesystem for its usage totals.
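
A rough sketch of a df-based check follows.  The function name free_kb and
the /tmp stand-in path are assumptions for the example; the 7GB threshold
is borrowed from Steve's suggestion quoted below.  It also assumes the
filesystem's device name contains no spaces, since awk splits on
whitespace.

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in 1024-byte blocks,
# on the filesystem that holds path $1.
free_kb() {
    # -P gives the POSIX one-line-per-filesystem format, -k uses KB units.
    # Line 2, column 4 is the "Available" figure.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

repo=/tmp                              # stand-in for the repository path
low_water_kb=$((7 * 1024 * 1024))      # 7GB free, expressed in KB

avail=$(free_kb "$repo")
if [ "$avail" -lt "$low_water_kb" ]; then
    echo "low on space: ${avail} KB free, time to delete old files"
else
    echo "ok: ${avail} KB free"
fi
```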

Jeremy

On Thu, 2004-10-28 at 11:31, rwshep2000.2725323 at bloglines.com wrote:
> Steve,
> 
> That was awesome advice.  I agree with you: there are better ways than
> BASH for this.  I had to look up "hysteresis" in the dictionary; thanks
> also for the contribution to my vocabulary!
> 
> Bob Shepherd
> 
> --- Triangle Linux Users Group discussion list <trilug at trilug.org> wrote:
> On Thursday 28 October 2004 10:11 am, rwshep2000.2725323 at bloglines.com wrote:
> > > Hi,
> > >
> > > I have a server with a shared repository for files.  I plan to devote
> > > 70GB of an 80GB HD (a single data partition) to the files.  The files
> > > are uploaded and placed in the repository via a web application.  Here
> > > is what I'd like to accomplish:
> > >
> > > When directory size exceeds 70GB, delete files, First-In-First-Out,
> > > until the repository is pared back to 70GB.
> > >
> > > The best case scenario would be to pare back the files each time a new
> > > file is added. However, I am hoping to do this without adding web
> > > application logic, which could cause additional latency for the user.
> > > Although it risks possibly exceeding the size limit, I am thinking of
> > > using a bash script scheduled with cron. To ensure against exceeding
> > > the limit, I'm leaving 10GB of the 80GB as buffer. I know this is
> > > imperfect but my humble intellect can't think of another approach.
> > >
> > > So I'm looking for comments on two things:
> > >
> > > 1.  How to make a bash script look at total directory size,
> >
> > df
> >
> > > then proceed to delete files FIFO until a target size is reached;
> >
> > Loop: find the oldest, delete it, check disk space
> >
> > Personally, I'd write this thing in Perl
> >
> > > 2.  Whether there is a better alternative than putting this script on
> > > cron.
> > 
> > Put the disk space check in your webapp logic. If disk space is OK, your
> > user has endured one if statement's worth of latency. If disk space is
> > moderately low (still have 10 GB), have it run the deletion part in the
> > background, and your user has endured the latency of one if statement
> > and one background spawn. If disk space is critically low, throw up a
> > page telling him to wait for disk maintenance, and run the deletion in
> > the foreground. User endures big latency, but at least his work doesn't
> > get garbled. Presumably this won't happen often, because it will be
> > caught before it gets critical.
> > 
> > I think I'd build in hysteresis like a thermostat. Run the deletion
> > program at 7GB free, and have it shut off at 14GB free. That way it
> > won't run too often, and won't oscillate as new files get written during
> > its lifetime. Also, as the deletion program starts, have it set a flag
> > so that no other deletion program starts during its lifetime. Upon the
> > deletion program's termination (all termination points), have it unset
> > the flag so other deletion programs can be run.
> > 
> > Once again, if it were me, I'd write the program in Perl or Python,
> > whether the program is spawned as a cron job or from the web app.
> > 
> > HTH
> > 
> > SteveT
> > 
> > Steve Litt
> > Founder and acting president: GoLUG
> > http://www.golug.org
-- 
/---------------------------------------------------------------------\
| Jeremy Portzer        jeremyp at pobox.com      trilug.org/~jeremy     |
| GPG Fingerprint: 712D 77C7 AB2D 2130 989F  E135 6F9F F7BC CC1A 7B92 |
\---------------------------------------------------------------------/