[TriLUG] Elim. unwanted characters
mfreeze at gmail.com
Tue Mar 14 11:22:11 EST 2006
I'd like to solicit ideas for the best way to handle the following problem:
I receive 10 to 20 text files per day from various sources. Several of
the files come from mainframe shops (who can't change their processes
for anything). When I receive the files, certain sections are loaded
with null characters instead of spaces. Since I
currently transfer these files to my PC for manual processing, I have
been opening the files in UltraEdit, viewing the hex, and replacing
every null byte (hex 00) with a space (hex 20).
Does anyone know of a utility that I could use to automate this
process? Maybe use cron to look for files in a certain directory and
then run a program to do a 'search and replace' for these characters?
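To make the "look for files in a certain directory" step concrete, here is a minimal sketch, assuming a C++17 compiler with std::filesystem. The names contains_null and files_needing_cleanup are made up for illustration; the idea is only that a small program started from cron could list the regular files in an incoming directory that actually contain null bytes.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: true if the file contains at least one NUL (0x00) byte.
// A file that cannot be opened is reported as "no NULs found".
bool contains_null(const fs::path& file) {
    std::ifstream in(file, std::ios::binary);
    std::vector<char> buf(64 * 1024);  // scan in 64 KiB chunks
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        // Only the first gcount() bytes of the buffer are valid data.
        const auto end = buf.begin() + in.gcount();
        if (std::find(buf.begin(), end, '\0') != end) return true;
    }
    return false;
}

// Hypothetical helper: regular files directly inside `dir` that contain
// NUL bytes, sorted by path. Subdirectories are skipped, not descended into.
// Throws fs::filesystem_error if `dir` cannot be read.
std::vector<fs::path> files_needing_cleanup(const fs::path& dir) {
    std::vector<fs::path> files;
    for (const fs::directory_entry& entry : fs::directory_iterator(dir)) {
        if (entry.is_regular_file() && contains_null(entry.path())) {
            files.push_back(entry.path());
        }
    }
    std::sort(files.begin(), files.end());
    return files;
}
```

A program built around this could be run from cron every few minutes and hand each returned path to whatever does the actual replacement. It does not guard against files that are still being transferred when the scan runs; that would need a separate convention, such as having senders upload under a temporary name.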
If I were going to write my own utility in C++, what would be the
quickest way to read in chunks of data and then do the search and
replace? Going character by character is slow, as some of these files
are 80 to 100 MB.
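One possible shape for the chunked approach, as a rough sketch rather than a tuned implementation: read a large block with istream::read, replace the nulls in memory with std::replace, and write the block back out. The function names and the 1 MiB default chunk size are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <istream>
#include <ostream>
#include <string>
#include <vector>

// Replace every NUL byte (0x00) in [data, data + n) with a space (0x20).
void replace_nulls(char* data, std::size_t n) {
    std::replace(data, data + n, '\0', ' ');
}

// Copy `in` to `out` one chunk at a time, replacing NULs along the way.
// Returns the number of bytes copied. The chunk size is an arbitrary guess.
std::size_t copy_replacing_nulls(std::istream& in, std::ostream& out,
                                 std::size_t chunk_size = 1 << 20) {
    std::vector<char> buf(chunk_size);
    std::size_t total = 0;
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        // The last read is usually short; gcount() says how much arrived.
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        if (got == 0) break;
        replace_nulls(buf.data(), got);
        out.write(buf.data(), static_cast<std::streamsize>(got));
        total += got;
    }
    return total;
}

// Convenience wrapper: clean `src` into a separate file `dst`.
// Binary mode keeps the byte count identical on every platform.
bool clean_file(const std::string& src, const std::string& dst) {
    std::ifstream in(src, std::ios::binary);
    std::ofstream out(dst, std::ios::binary);
    if (!in || !out) return false;
    copy_replacing_nulls(in, out);
    out.flush();
    return static_cast<bool>(out);
}
```

Because a null and a space are both one byte, the output is the same length as the input, so chunk boundaries never split anything. Writing to a separate destination file and renaming it afterwards leaves the original intact if the run is interrupted.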
I'm just getting started on this process so any suggestions are welcome.