White Space in Key of Associative Array PHP

Developer https://www.devze.com 2023-01-04 20:36 Source: web

I'm parsing out an HTML table and building an array based on the row values. My problem is the associative keys that are returned have a bit of white space at the end of them giving me results like this:

Array ( [Count  ] => 6   [Class  ] => 30c   [Description] => Conformation Model (Combined 30,57) )

So a line like this:

echo $myArray['Count'];

or

echo $myArray['Count '];

Gives me a blank result.

For now I've got a pretty hacky workaround going:

foreach($myArray as $row){

    $count = 0;
    foreach($row as $info){
        if($count == 0){
            echo 'Count:' . $info;
            echo '<br>';
        }
        if($count == 1){
            echo ' Class:' . $info;
            echo '<br>';
        }
        if($count == 2){
            echo ' Description:' . $info;
            echo '<br>';
        }
        $count++;
    }

}

The function I'm using to parse the table I found here:

function parseTable($html)
{
  // Find the table
  preg_match("/<table.*?>.*?<\/[\s]*table>/s", $html, $table_html);

  // Get title for each row
  preg_match_all("/<th.*?>(.*?)<\/[\s]*th>/", $table_html[0], $matches);
  $row_headers = $matches[1];

  // Iterate each row
  preg_match_all("/<tr.*?>(.*?)<\/[\s]*tr>/s", $table_html[0], $matches);

  $table = array();

  foreach($matches[1] as $row_html)
  {
    preg_match_all("/<td.*?>(.*?)<\/[\s]*td>/", $row_html, $td_matches);
    $row = array();
    for($i=0; $i<count($td_matches[1]); $i++)
    {
      $td = strip_tags(html_entity_decode($td_matches[1][$i]));
      $row[$row_headers[$i]] = $td;
    }

    if(count($row) > 0)
      $table[] = $row;
  }
  return $table;
}

I'm assuming I can eliminate the white space by using the correct regular expression, but of course I avoid regex like the plague. Any ideas? Thanks in advance. -J


You can use trim to remove leading and trailing whitespace characters:

$row[trim($row_headers[$i])] = $td;
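As a minimal sketch of what that change does, here is the same assignment run on hand-written sample data. The `$row_headers` and `$cells` values are assumptions modelled on the output shown in the question, not data from the real table, and trimming the cell values as well is an optional extra.

```php
<?php
// Hypothetical header and cell text, with the trailing spaces seen in the question.
$row_headers = array('Count  ', 'Class  ', 'Description');
$cells       = array('6  ', '30c  ', 'Conformation Model (Combined 30,57)');

$row = array();
foreach ($cells as $i => $td) {
    // trim() strips leading and trailing whitespace, so the key becomes 'Count'
    // rather than 'Count  '. Trimming the value too is optional.
    $row[trim($row_headers[$i])] = trim($td);
}

echo $row['Count'];
```

Note that parseTable() returns a list of rows, so with the trimmed keys the lookup would be per row, for example `$myArray[0]['Count']`. If the header cells contain `&nbsp;`, html_entity_decode() turns it into a non-breaking space, which trim() does not strip by default; whether that applies here is not clear from the question.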

But don't use regular expressions to parse an HTML document; use a proper HTML parser instead, such as the Simple HTML DOM Parser or PHP's built-in DOMDocument.
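A rough sketch of the DOMDocument route might look like the following. The function name `parseTableDom` and the sample `$html` string are made up for illustration, and the sketch assumes a single, non-nested table whose header cells are `th` elements.

```php
<?php
// Sketch: read the first table with DOMDocument instead of regular expressions.
function parseTableDom($html)
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally, since real-world HTML is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $tableNode = $doc->getElementsByTagName('table')->item(0);
    if ($tableNode === null) {
        return array();
    }

    // Use the trimmed header text as the array keys.
    $headers = array();
    foreach ($tableNode->getElementsByTagName('th') as $th) {
        $headers[] = trim($th->textContent);
    }

    $table = array();
    foreach ($tableNode->getElementsByTagName('tr') as $tr) {
        $row = array();
        $i = 0;
        foreach ($tr->getElementsByTagName('td') as $td) {
            // Fall back to the column position if there is no matching header.
            $key = isset($headers[$i]) ? $headers[$i] : $i;
            $row[$key] = trim($td->textContent);
            $i++;
        }
        // The header row has no td cells, so it is skipped here.
        if (count($row) > 0) {
            $table[] = $row;
        }
    }
    return $table;
}

// Hypothetical input resembling the table from the question.
$html = '<table><tr><th>Count  </th><th>Class  </th><th>Description</th></tr>'
      . '<tr><td>6  </td><td>30c  </td><td>Conformation Model (Combined 30,57)</td></tr></table>';
$rows = parseTableDom($html);
echo $rows[0]['Count'];
```

Like the original function, this returns one associative array per table row, so values are read as `$rows[0]['Count']`, `$rows[1]['Count']`, and so on.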


An easy solution would be to change

$row[$row_headers[$i]] = $td;

to:

$row[trim($row_headers[$i])] = $td;