[TriLUG] Scheduling file transfers

Mark Freeze mfreeze at gmail.com
Thu Apr 7 15:18:44 EDT 2005


The way my script needs to work won't allow for picking up part of
a file and then picking up the rest later, unless by later you
guys mean a couple of ms later.  My script would need to do
the following things:

1. Check for the existence of a file.
2. Download the file.
3. Run the file through a parsing program.
4. Import the parsed file into a database.
5. Query the database and email the results to a recipient list.
(Quantity and dollar totals of the downloaded file.)
6. Export the file into a Samba directory so my Windows box can pick
it up and process it through some canned software then place the
result file back into the Samba directory.
7. Use php to convert the result file into separate pdf images.
8. Place the pdf images into a directory and index the pdf file list
into a web-enabled database so users can log into my website and view
customers bills (the pdfs) online.
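For what it's worth, the steps above might hang together as a single
cron-driven shell script.  Here is a rough sketch with every real
command stubbed out so the control flow is visible; all of the names,
paths, and tools mentioned in the comments are placeholders:

```shell
#!/bin/sh
# Sketch of the pipeline above with the real commands stubbed out.
# Replace each stub with the real tool (ncftpget, your parser,
# mysql, mail, cp, ...).
set -e                                   # abort the whole run if any step fails

fetch()    { echo "would run: ncftpget ... $1"; }        # steps 1-2: check + download
parse()    { echo "would parse: $1"; }                   # step 3
import()   { echo "would import: $1"; }                  # step 4
report()   { echo "would mail quantity/dollar totals"; } # step 5
to_samba() { echo "would copy $1 to the samba share"; }  # step 6

FILE=billing.dat                         # placeholder filename
fetch "$FILE"
parse "$FILE"
import "$FILE"
report
to_samba "$FILE"
# steps 7-8 (php -> pdf, web indexing) kick off after the Windows
# box drops its result file back into the samba share
```

Because of `set -e`, a failure at any step stops the run, so step 4
can never fire on a half-downloaded file from step 2.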

On some small files this process would be almost instantaneous.  But on
larger files the download might take a while, so I didn't want my script
getting part of the file and starting the next step before the
download was complete.  In one past instance I processed a file of
80,000 records after I had only downloaded about 60,000.  Would rsync
or ncftpget still work in this situation?  I am deleting the files
after I download them.  Downloading a second file behind the first sounds
like a good idea; I'd just have to have the script check that
condition before continuing.  How would I go about checking the file
size on the remote machine before download to detect things like
failed transfers, etc... ?
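One common trick for that last question (a sketch, not a definitive
recipe) is to poll the remote file's size twice and only download once
it stops changing.  The `remote_size` helper below is stubbed with a
local file so the logic is runnable; the `ncftpls` command in the
comment is one guess at the real query, and its listing format and
field positions vary by server:

```shell
#!/bin/sh
# Detect an in-progress upload: sample the remote file's size twice
# and only download once it stops growing.  remote_size() is stubbed
# here; in real use it would be an FTP directory listing, e.g.
#   ncftpls -l ftp://host/incoming/ | awk '$NF == "billing.dat" { print $5 }'
# (check the field number against your server's listing format).

remote_size() { cat /tmp/fake_size; }   # stub standing in for the FTP query

echo 1000 > /tmp/fake_size              # pretend the remote file is 1000 bytes
size1=$(remote_size)
sleep 2                                 # give an active upload time to grow
size2=$(remote_size)

if [ -n "$size1" ] && [ "$size1" = "$size2" ] && [ "$size1" -gt 0 ]; then
    echo "size stable at $size1 bytes; safe to download"
else
    echo "still changing (was $size1, now $size2); wait for the next cron run"
fi
rm -f /tmp/fake_size
```

A zero or changing size means the upload (or a failed transfer) is
still in flight, so the script just waits for the next cron run.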

Thanks,
Mark.


On Apr 7, 2005 2:23 PM, Daniel Zhang <zhang at clinicaltools.com> wrote:
> rsync is really a good choice for your task.  It works perfectly on
> Linux machines, and you can even install rsync and ssh on your Windows
> machine.  It copies only the differences in a file between your server
> and client if the same file already exists.  Furthermore, you can write
> a cron script to run rsync between server and client without any userid
> and password prompts, if the proper set-up has been completed.
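> 
> A minimal crontab fragment along those lines (the hostname and paths
> are placeholders, and it assumes an ssh key pair has already been set
> up as the first link below describes):
> 
> ```shell
> # run every 10 minutes; --partial keeps interrupted transfers resumable
> */10 * * * * rsync -az --partial -e ssh user@ftp.example.com:incoming/ /var/spool/billing/
> ```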
> 
> Some useful links (more links can be found at the second link below):
> 
> http://www.jdmz.net/ssh/
> http://optics.ph.unimelb.edu.au/help/rsync/
> 
> Good luck!
> 
> Daniel Z
> 
> 
> Mark Freeze wrote:
> 
> >A year or so ago I had a problem downloading a file via ftp onto a
> >Windows box with WS_FTP.  The file was about 100MB and I started
> >downloading the file while my customer was still uploading, so I only
> >got about half of the file.  WS_FTP allowed me to do this with no
> >error. (Which I thought was kinda crazy.)
> >
> >Now I have an offsite ftp spot that my customers use to send me files
> >at random times during the day. I want to automatically download and
> >process these files onto my box as soon as they appear on the site so
> >I was thinking that I would schedule a cron job to look for these
> >files every 10 min. When I do this am I going to have the problem of
> >seeing the file and trying to get it as they are uploading?  Some of
> >these files are over 100MB and might take my customer a while to
> >upload.  Someone told me to make sure that I have exclusive access to
> >the file before I download it, but since I have no control over the
> >ftp server I'm not sure on how to accomplish that task.
> >
> >Any help will be greatly appreciated.
> >
> >Regards,
> >Mark.
> >
> >
> 
> --
> TriLUG mailing list        : http://www.trilug.org/mailman/listinfo/trilug
> TriLUG Organizational FAQ  : http://trilug.org/faq/
> TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
> TriLUG PGP Keyring         : http://trilug.org/~chrish/trilug.asc
>


