[TriLUG] Need some help parsing a file
Peter Neilson
neilson at windstream.net
Mon Dec 30 06:16:00 EST 2013
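Your regex fails if any filename contains one or more digits followed by
spaces, most obviously at the start of the name. The greedy .+ backtracks
only as far as the LAST such run, so the digits and the ensuing spaces
get gobbled along with everything before them.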
If instead we aim at seeing at least three spaces between the time-of-day
and the size of the file, and a single space following the size of the
file, then this might work:
cat filename.txt |perl -pe "s/^.+\s\s\s[\d,]+\s//"
Testing with a filename of " 4 his is filename 2.sh" and a stripped
leading zero on the time-of-day I get this:
$ cat filename.txt
11/09/2013  11:49 AM         7,887,098 this is filename 1.txt
11/05/2013  8:09 PM        11,652,690  4 his is filename 2.sh
$ cat filename.txt |perl -pe "s/^.+\s\s\s[\d,]+\s//"
this is filename 1.txt
 4 his is filename 2.sh
$
My example allows for the possibility that a filename might stupidly
start with a space, something that IS legal in Unix-like systems. I
restricted the regex to look for a single space after the size of the
file. There are still some failing cases where the filename starts with
TWO (or more) spaces followed by digits. *SIGH*
On Sun, 29 Dec 2013 22:48:18 -0500, William Sutton <william at trilug.org>
wrote:
> still relies on file listings starting in column 40. You really want
> awk+sed for this sort of thing, but I never learned awk. So...
>
> cat filename.txt |perl -pe "s/^.+[\d,]+\s+//"
>
> :-)
>
> William Sutton
>
> On Sun, 29 Dec 2013, Peter Neilson wrote:
>
>> On Sun, 29 Dec 2013 21:13:17 -0500, Dewey Hylton <plug at hyltown.com>
>> wrote:
>>
>>> try this:
>>> cut -c40- < filename.txt
>>
>> Doesn't use sed, but it's better than mine.