[TriLUG] perl: keeping blank when split'ing string on a blank

Paul Bennett paul.w.bennett at gmail.com
Sun Aug 16 14:44:47 EDT 2009


On Sun, 16 Aug 2009 14:25:20 -0400, Joseph Mack NA3T <jmack at wm7d.net>  
wrote:

> I'm writing a webform to query a database. Let's say I want all books  
> with the string "Tom Sawyer" and the string "adventures" in the title.  
> The user will enter
>
> "Tom Sawyer" adventures
>
> in the text box and the cgi will get this string as a parameter. To  
> query the database I need to separate "adventures" and "Tom Sawyer" but  
> not split "Tom Sawyer".
>
> Running split (/ /, $param_title) gives me 3 words rather than the 2  
> words (strings) I want.
>
> Any ideas?

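If you know for sure the user's going to be entering balanced quotes, you
can use something like s/("\S*?)(\s+)(\S*?")/$1~~~$3/g before you split,
and s/~~~/ /g on each resulting element afterwards. Note that this pattern
only protects two-word quoted phrases, and it assumes ~~~ never shows up
in the input.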
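
A minimal sketch of that protect, split, restore idea, for illustration
only. It assumes balanced double quotes, that ~~~ never occurs in the
input, and Perl 5.14 or later for the /r modifier. Unlike the one-liner
above, it protects every run of whitespace inside a quoted phrase rather
than just one. The helper name split_keeping_quoted is made up for this
example.

```perl
use strict;
use warnings;

# Hypothetical helper: split on whitespace but keep "quoted phrases" whole.
# Assumes balanced double quotes and that '~~~' never appears in the input.
sub split_keeping_quoted {
    my ($search) = @_;

    # Protect whitespace inside each "..." by swapping every run of it
    # for a placeholder. The inner s///r (Perl 5.14+) returns a modified
    # copy of the phrase instead of changing $1.
    $search =~ s/("[^"]*")/$1 =~ s{\s+}{~~~}gr/ge;

    # Split on the remaining (unquoted) whitespace, then restore the blanks.
    my @words = split ' ', $search;
    s/~~~/ /g for @words;
    return @words;
}

my @result = split_keeping_quoted('"Tom Sawyer" adventures');
print "$_\n" for @result;
```

The quotes stay attached to the phrase, so the two elements come back as
"Tom Sawyer" (quotes included) and adventures. Strip the quotes afterwards
if the query doesn't want them.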

Alternatively, you could walk the array that split returns, manually
joining the words of each quoted string, something like this:

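use strict;
use warnings;

my $search = '"Tom Sawyer" adventures';

my @result;

my $quoted = 0;
for my $word (split /\s+/, $search) {

     if ($quoted) {
         # Inside a quoted phrase: glue this word onto the phrase so far.
         $result[-1] .= ' ' . $word;
     }
     else {
         # Outside quotes, or at an opening quote: start a new element.
         push @result, $word;
         $quoted = 1 if $word =~ /^"/;
     }

     # A trailing quote closes the phrase (also covers a one-word "phrase").
     if ($word =~ /"$/) {
         $quoted = 0;
     }
}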

Again, that relies on your user always entering balanced quotes, though.  
Thankfully, counting '"' symbols is a lot easier than parsing out balanced  
paren-like symbols.
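
For what it's worth, here is a rough sketch of that quote-counting check.
It assumes "balanced" simply means an even number of '"' characters, with
no escaped quotes to worry about. The helper name quotes_balanced is
invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the string holds an even number of '"'.
sub quotes_balanced {
    my ($search) = @_;

    # tr/// with an empty replacement list and no /d modifier leaves the
    # string alone and just returns how many characters matched.
    my $count = ($search =~ tr/"//);
    return $count % 2 == 0;
}

for my $input ('"Tom Sawyer" adventures', '"Tom Sawyer adventures') {
    print "$input => ", (quotes_balanced($input) ? 'balanced' : 'unbalanced'), "\n";
}
```

Running a check like this before the split lets the cgi reject or repair
the input instead of silently gluing words together.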



--
Paul



