
Splitting a String into Tokens and Storing the Delimiters in Perl

Developer https://www.devze.com 2022-12-13 19:24 Source: web

I have a string like this:

a  b   c       d

I process my string like this:

    chomp $line;
    my @tokens = split /\s+/, $line;
    my @new_tokens;
    foreach my $token (@tokens) {
        push @new_tokens, some_complex_function( $token );
    }
    my $new_str = join ' ', @new_tokens;

I'd like to re-join the string with the original whitespace. Is there some way that I can store the whitespace from split and re-use it later? Or is this going to be a huge pain? It's mostly cosmetic, but I'd like to preserve the original spaces from the input string.


If you split with a regex with capturing parentheses, the split pattern will be included in the resulting list (see perldoc -f split):

use Data::Dumper;

my @list = split /(\s+)/, 'a  b   c       d';
print Dumper(\@list);

$VAR1 = [
          'a',
          '  ',
          'b',
          '   ',
          'c',
          '       ',
          'd'
        ];
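
To get back to the question's goal, the captured separators can be passed through untouched while only the non-whitespace pieces are transformed, and the list is then joined with the empty string. The following is a minimal sketch; some_complex_function is a hypothetical stand-in (here simply uc) for whatever the real function does:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the question's some_complex_function().
sub some_complex_function {
    my ($token) = @_;
    return uc $token;
}

my $line = "a  b   c       d";

# The capturing group makes split return the separators as well.
my @pieces = split /(\s+)/, $line;

# Transform only pieces that contain non-whitespace; keep separators as-is.
my @new_pieces = map { /\S/ ? some_complex_function($_) : $_ } @pieces;

# Join with the empty string: the original whitespace is already in the list.
my $new_str = join '', @new_pieces;

print "$new_str\n";    # prints "A  B   C       D"
```

Joining with '' rather than ' ' matters here, since the whitespace runs are already elements of the list.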


Just split on word boundaries:

split /\b/, $line;

For your example, this will give:

('a','  ','b','   ','c','       ','d')

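EDIT: As brian d foy pointed out, \b uses the wrong character classes: it matches between word (\w) and non-word (\W) characters, not between whitespace and non-whitespace. Following my original idea, I came up with look-around assertions instead. This looks far more complicated than Ether's answer, though: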

split /(?:(?<=\S)(?=\s)|(?<=\s)(?=\S))/, $line;
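
The difference between the two patterns only shows up once a token contains a non-word character. A small sketch, using a made-up input string with a hyphen in it:

```perl
use strict;
use warnings;

# Made-up input: one token contains a hyphen, which is a non-word character.
my $line = "foo-bar  baz";

# \b matches between \w and \W characters, so the hyphen splits the token.
my @by_word_boundary = split /\b/, $line;
# ('foo', '-', 'bar', '  ', 'baz')

# The look-arounds match only where whitespace meets non-whitespace.
my @by_space_boundary = split /(?:(?<=\S)(?=\s)|(?<=\s)(?=\S))/, $line;
# ('foo-bar', '  ', 'baz')

print join('|', @by_word_boundary),  "\n";    # prints "foo|-|bar|  |baz"
print join('|', @by_space_boundary), "\n";    # prints "foo-bar|  |baz"
```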


Why don't you simply do: my $new_str = uc( $line ); ?

UPDATE - original uc() is just a shorthand for "more complex function".

Well, generally you can also:

$line =~ s/(\S+)/more_complex_function($1)/ge;
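
As a runnable sketch of the substitution approach: more_complex_function below is a hypothetical placeholder that just reverses each token, and the result is written to a copy so the original line is left alone.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real per-token function.
sub more_complex_function {
    my ($token) = @_;
    return scalar reverse $token;
}

my $line = "ab  cd   ef";

# /e evaluates the replacement as Perl code; /g applies it to every token.
# Whitespace is never matched, so it stays exactly as it was.
(my $new_str = $line) =~ s/(\S+)/more_complex_function($1)/ge;

print "$new_str\n";    # prints "ba  dc   fe"
```

Because only the \S+ runs are touched, no splitting or re-joining is needed at all.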
