
Regular Expression in PHP: How to create a pattern for tables in HTML

Developer https://www.devze.com 2022-12-14 03:52 Source: Internet
I am using the latest version of PHP. I want to parse an HTML page to extract data.

HTML:

<table class="margin15" style="margin-left: 0pt; margin-right: 0pt;" width="100%" align="left" border="0" cellpadding="0" cellspacing="0">
TRs, TDs, Data
</table>

<table class="margin15" style="margin-left: 0pt; margin-right: 0pt;" width="100%" align="left" border="0" cellpadding="0" cellspacing="0">
TRs, TDs, Data
</table>

<table class="margin15" style="margin-left: 0pt; margin-right: 0pt;" width="100%" align="left" border="0" cellpadding="0" cellspacing="0">
TRs, TDs, Data
</table>

<table class="margin15" style="margin-left: 0pt; margin-right: 0pt;" width="100%" align="left" border="0" cellpadding="0" cellspacing="0">
TRs, TDs, Data
</table>

PHP Code:

<?php

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.test.com/mypage.html');  
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);


$pattern = '/<table class="margin15" style="margin-left: 0pt; margin-right: 0pt;" width="100%" align="left" border="1" cellpadding="0" cellspacing="0">[^~]</table>/';
preg_match_all($pattern, $result, $matches);
print_r($matches);

?>

I am not able to get all the tables. When I use a simple $pattern='/table/';, it gives me the exact result. How do I create a pattern that captures each whole table in a single array element?


Parsing HTML with regex is a pain at best, because HTML is not a regular language. I suggest you use Simple HTML DOM instead.


You can't parse [X]HTML with regex, but you can try:

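$pattern = '#<table(?:.*?)>(.*?)</table>#s';

The s modifier makes . match newlines as well, which is needed because each table spans several lines. This won't work if there are nested tables.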
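
As a minimal sketch of the lazy-match approach, the snippet below runs such a pattern against a small inline sample. The sample markup and the variable names are made up for illustration. The pattern is a slight variation that uses [^>]* for the opening tag and adds the s and i modifiers, on the assumption that tables span several lines and tag case may vary. It still breaks on nested tables.

```php
<?php
// Hypothetical sample standing in for the page fetched with cURL.
$html = <<<HTML
<table class="margin15" width="100%" border="0">
<tr><td>First</td></tr>
</table>
<p>Unrelated text</p>
<table class="margin15" width="100%" border="0">
<tr><td>Second</td></tr>
</table>
HTML;

// s: "." also matches newlines; i: tag names match case-insensitively.
// (.*?) is lazy, so each match stops at the nearest closing </table>.
$pattern = '#<table\b[^>]*>(.*?)</table>#si';

$count = preg_match_all($pattern, $html, $matches);

// $matches[0] holds each whole table, $matches[1] only the inner markup.
foreach ($matches[0] as $i => $table) {
    echo "Table $i:\n$table\n\n";
}
```

Each whole table ends up in its own element of $matches[0], which is the "one array location" layout asked for in the question.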


Please have a look at this answer. It describes the usage of an HTML parser in PHP, which is what you want to do.


Or just use the DOM classes PHP offers. I think they can do the same as Simple HTML DOM but much faster (don't get me wrong, I really like Simple HTML DOM, but it's slow for files with a few dozen lines).
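
A rough sketch of what that could look like with PHP's built-in DOMDocument and DOMXPath classes follows. The sample markup and variable names are invented for illustration, and the XPath expression assumes the class attribute is exactly "margin15", as in the question's HTML.

```php
<?php
// Hypothetical sample standing in for the page fetched with cURL.
$html = <<<HTML
<html><body>
<table class="margin15" width="100%" border="0">
<tr><td>First</td></tr>
</table>
<table class="other"><tr><td>Skip me</td></tr></table>
<table class="margin15" width="100%" border="0">
<tr><td>Second</td></tr>
</table>
</body></html>
HTML;

// Collect libxml warnings internally instead of printing them;
// real-world HTML is rarely valid enough to parse silently.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Select every <table> whose class attribute is exactly "margin15".
$nodes = $xpath->query('//table[@class="margin15"]');

$tables = [];
foreach ($nodes as $node) {
    // saveHTML($node) serializes the node together with its own tags.
    $tables[] = $doc->saveHTML($node);
}

print_r($tables);
```

Because the parser builds a real tree, this variant also copes with nested tables, which the regex approach does not.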
