[TriLUG] Need some help parsing a file

William Sutton william at trilug.org
Mon Dec 30 06:44:30 EST 2013


Well, at least let me shorten your revision up :-)

cat filename.txt |perl -pe "s/^.+\s{3}[\d,]+\s//"

Of course, the original solution using ls to generate a bare list of 
filenames would be preferable to all this post-listing regex magic.
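
For anyone who wants to poke at the edge cases without Perl handy, here is 
a rough Python sketch of the same substitution. It is not the one-liner 
above, just the same pattern fed to Python's re module, which should behave 
the same way on these ASCII inputs. The function name and the third sample 
line (a name starting with two spaces and a number) are made up for 
illustration.

```python
import re

# Same pattern as the Perl one-liner: greedy prefix, three whitespace
# characters, the size column (digits and commas), one whitespace character.
SIZE_THEN_NAME = re.compile(r"^.+\s{3}[\d,]+\s")


def strip_listing_prefix(line: str) -> str:
    """Drop everything up to and including the size column (hypothetical helper)."""
    return SIZE_THEN_NAME.sub("", line, count=1)


samples = [
    "11/09/2013  11:49 AM         7,887,098 this is filename 1.txt",
    "11/05/2013   8:09 PM        11,652,690  4    his is filename 2.sh",
    # Assumed failing case: the filename is "  7 days.txt" (two leading spaces).
    "11/05/2013   8:09 PM        11,652,690   7 days.txt",
]

for line in samples:
    print(repr(strip_listing_prefix(line)))
# 'this is filename 1.txt'
# ' 4    his is filename 2.sh'
# 'days.txt'
```

The last line shows the remaining hole: when the name itself starts with 
two spaces and a number, the greedy prefix swallows part of the name too.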

William Sutton

On Mon, 30 Dec 2013, Peter Neilson wrote:

> Your regex fails if any filename starts with one or more digits followed by 
> spaces. The digits and the ensuing spaces will be gobbled greedily.
>
> If instead we aim at seeing at least three spaces between the time-of-day and 
> the size of the file, and a single space following the size of the file, then 
> this might work:
>
> cat filename.txt |perl -pe "s/^.+\s\s\s[\d,]+\s//"
>
> Testing with a filename of " 4    his is filename 2.sh" and a stripped 
> leading zero on the time-of-day I get this:
>
> $ cat filename.txt
> 11/09/2013  11:49 AM         7,887,098 this is filename 1.txt
> 11/05/2013   8:09 PM        11,652,690  4    his is filename 2.sh
> $ cat filename.txt |perl -pe "s/^.+\s\s\s[\d,]+\s//"
> this is filename 1.txt
>  4    his is filename 2.sh
> $
>
> My example allows the notion that a filename might stupidly start with a 
> space, something that IS legal in Unix-like systems. I restricted the regex 
> to look for a single space after the size of the file. There are still some 
> failing cases where the filename starts with TWO (or more) spaces. *SIGH*
>
> On Sun, 29 Dec 2013 22:48:18 -0500, William Sutton <william at trilug.org> 
> wrote:
>
>> That still relies on the filenames starting in column 40.  You really want 
>> awk+sed for this sort of thing, but I never learned awk.  So...
>> 
>> cat filename.txt |perl -pe "s/^.+[\d,]+\s+//"
>> 
>> :-)
>> 
>> William Sutton
>> 
>> On Sun, 29 Dec 2013, Peter Neilson wrote:
>> 
>>> On Sun, 29 Dec 2013 21:13:17 -0500, Dewey Hylton <plug at hyltown.com> wrote:
>>> 
>>>> try this:
>>>> cut -c40- < filename.txt
>>> 
>>> Doesn't use sed, but it's better than mine.