
PHP Regex help - parsing sets of data based on format

Developer https://www.devze.com 2023-02-15 03:45 Source: web

I'm developing a forum-based rugby score prediction game and am looking for help writing a regex parser to extract the sets of games from each post.

Each post could use any of the formats below (the differences are that some people use a comma to separate games, some hyphenate the score, and some do both):

TEAMA 25-31 TEAMB TEAMC 28-35 TEAMD TEAME 38-10 TEAMF TEAMG 21-15 TEAMH

.

TEAMA 25 31 TEAMB TEAMC 28 35 TEAMD TEAME 38 10 TEAMF TEAMG 21 15 TEAMH

.

TEAMA 25-31 TEAMB, TEAMC 28-35 TEAMD, TEAME 38-10 TEAMF, TEAMG 21-15 TEAMH

.

TEAMA 25 31 TEAMB, TEAMC 28 35 TEAMD, TEAME 38 10 TEAMF, TEAMG 21 15 TEAMH

Basically, team names are always expected to be 5 characters long, with the score sitting between the two teams. The number of games per post is not fixed, though: one post could contain one game or 20. There could also be extra text before or after the games, but I still need to be able to pluck the games out. I just need each game split out, i.e. [TEAMA] [SCORE] [SCORE] [TEAMB] would be considered one game.

I started with explode but didn't have much luck, and unfortunately I don't have much regex experience, so I'm looking for a flexible way to accommodate the formats above.

Any help appreciated.


It's easier to match each result than to split them, e.g.:

preg_match_all('/(?P<teamA>\w{5})\s+(?P<scoreA>\d+)[\s-](?P<scoreB>\d+)\s+(?P<teamB>\w{5})/', $str, $m, PREG_SET_ORDER);
print_r($m);

This gives you, for each game, something like:

[0] => Array
    (
        [0] => TEAMA 25 31 TEAMB
        [teamA] => TEAMA
        [1] => TEAMA
        [scoreA] => 25
        [2] => 25
        [scoreB] => 31
        [3] => 31
        [teamB] => TEAMB
        [4] => TEAMB
    )
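The question also mentions extra text before or after the games. A minimal sketch of how the pattern above behaves on such a post follows. The sample post is made up for illustration, and the `\b` word-boundary anchors are an addition of this sketch, not part of the answer: they stop a five-character slice of a longer word from being picked up as a team code.

```php
<?php
// Hypothetical sample post: free text around two games, mixed separators.
$post = "My picks this week: TEAMA 25-31 TEAMB, TEAMC 28 35 TEAMD. Good luck everyone!";

// Same named groups as the answer above, plus \b at both ends (an assumption
// of this sketch) so team codes must be whole five-character words.
$pattern = '/\b(?P<teamA>\w{5})\s+(?P<scoreA>\d+)[\s-](?P<scoreB>\d+)\s+(?P<teamB>\w{5})\b/';

preg_match_all($pattern, $post, $games, PREG_SET_ORDER);

// Build one normalised line per game, whatever separator the poster used.
$lines = [];
foreach ($games as $game) {
    $lines[] = sprintf('%s %d - %d %s', $game['teamA'], $game['scoreA'], $game['scoreB'], $game['teamB']);
}
echo implode("\n", $lines), "\n";
```

The surrounding sentence and the comma are skipped because no five-character word in them is followed by two numbers and another five-character word.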


You could try a regular expression like this (it assumes team names are alphanumeric):

([a-zA-Z0-9]{5})\s+(\d+)[\s-](\d+)\s+([a-zA-Z0-9]{5})

http://rubular.com/r/v4HGNzo3UY
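For completeness, here is a rough sketch of how that pattern could be used from PHP. The delimiters, the variable names, and the integer casts are choices made for this example; the input is the comma-separated sample from the question.

```php
<?php
// Sample input from the question: comma-separated games, space-separated scores.
$post = "TEAMA 25 31 TEAMB, TEAMC 28 35 TEAMD, TEAME 38 10 TEAMF, TEAMG 21 15 TEAMH";

// The pattern above, wrapped in PHP regex delimiters.
$pattern = '/([a-zA-Z0-9]{5})\s+(\d+)[\s-](\d+)\s+([a-zA-Z0-9]{5})/';

// preg_match_all returns the number of full matches, i.e. games found.
$count = preg_match_all($pattern, $post, $matches, PREG_SET_ORDER);

// Each $matches entry is [whole match, group 1, group 2, group 3, group 4].
$games = [];
foreach ($matches as [$whole, $teamA, $scoreA, $scoreB, $teamB]) {
    $games[] = [$teamA, (int) $scoreA, (int) $scoreB, $teamB];
}
print_r($games);
```

The array destructuring in the `foreach` needs PHP 7.1 or later.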


An alternative:

$raw_str = "TEAMA 25-31 TEAMB TEAMC 28-35 TEAMD TEAME 38-10 TEAMF TEAMG 21-15 TEAMH";

// [\s-] accepts either a hyphen or whitespace between the two scores,
// so both "25-31" and "25 31" match.
preg_match_all('/(?<first_team_name>[A-Z]+)\s+(?<first_team_score>[0-9]+)[\s-](?<second_team_score>[0-9]+)\s+(?<second_team_name>[A-Z]+)/i', $raw_str, $matches);

// Regroup the per-group result arrays into one array per game.
$scores = array();
foreach ($matches[0] as $index => $match)
{
    $scores[] = array(
        'first_team_name'   => $matches['first_team_name'][$index],
        'first_team_score'  => $matches['first_team_score'][$index],
        'second_team_name'  => $matches['second_team_name'][$index],
        'second_team_score' => $matches['second_team_score'][$index]
    );
}

print_r($scores);

Output:

Array
(
[0] => Array
    (
        [first_team_name] => TEAMA
        [first_team_score] => 25
        [second_team_name] => TEAMB
        [second_team_score] => 31
    )

[1] => Array
    (
        [first_team_name] => TEAMC
        [first_team_score] => 28
        [second_team_name] => TEAMD
        [second_team_score] => 35
    )

[2] => Array
    (
        [first_team_name] => TEAME
        [first_team_score] => 38
        [second_team_name] => TEAMF
        [second_team_score] => 10
    )

[3] => Array
    (
        [first_team_name] => TEAMG
        [first_team_score] => 21
        [second_team_name] => TEAMH
        [second_team_score] => 15
    )

)
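As a possible variation, the regrouping loop can be avoided by asking preg_match_all for PREG_SET_ORDER and then dropping the numeric keys. This is only a sketch: the shortened input string is made up, and the separator class is `[\s-]` so that space-separated scores match too. Note that the keys come out in pattern order, so second_team_score precedes second_team_name.

```php
<?php
// Shortened, hypothetical input mixing hyphenated and space-separated scores.
$raw_str = "TEAMA 25-31 TEAMB TEAMC 28 35 TEAMD";

$pattern = '/(?<first_team_name>[A-Z]+)\s+(?<first_team_score>[0-9]+)[\s-]'
         . '(?<second_team_score>[0-9]+)\s+(?<second_team_name>[A-Z]+)/i';

// PREG_SET_ORDER yields one array per game instead of one array per group.
preg_match_all($pattern, $raw_str, $matches, PREG_SET_ORDER);

// Keep only the named groups; drop the numeric duplicates of each capture.
$scores = array_map(
    fn(array $m) => array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY),
    $matches
);
print_r($scores);
```

The arrow function requires PHP 7.4 or later; an ordinary closure works the same way on older versions.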


The regex is: [A-Z]{5}\s\d+[\s\-]\d+\s[A-Z]{5}

Here is a working example of its usage in Perl:

my $content = "TEAMA 25-31 TEAMB TEAMC 28 35 TEAMD TEAME 38-10 TEAMF TEAMG 21-15 TEAMH";
my (@scores) = $content =~ m![A-Z]{5}\s\d+[\s\-]\d+\s[A-Z]{5}!g;

foreach my $score (@scores) {
  print "$score\n";
}

The output is:

TEAMA 25-31 TEAMB
TEAMC 28 35 TEAMD
TEAME 38-10 TEAMF
TEAMG 21-15 TEAMH

If team names can be in any letter case, use [A-Za-z] instead of [A-Z].
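Since the question is about PHP, a rough PHP equivalent of the Perl snippet might look like the following. It is a sketch rather than a tested port; here the `i` modifier is used for case-insensitivity instead of widening the character class.

```php
<?php
$content = "TEAMA 25-31 TEAMB TEAMC 28 35 TEAMD TEAME 38-10 TEAMF TEAMG 21-15 TEAMH";

// Same pattern as the Perl example; the i modifier accepts any letter case.
preg_match_all('/[A-Z]{5}\s\d+[\s\-]\d+\s[A-Z]{5}/i', $content, $m);

// With the default PREG_PATTERN_ORDER, $m[0] lists the full matches.
$scores = $m[0];
echo implode("\n", $scores), "\n";
```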
