
PHP Word Crawler

Developer https://www.devze.com 2023-01-20 05:53 Source: the web
How can I get all the unique words from a webpage into an array (without attributes, JavaScript, etc.)?

Could anybody help me with this?


Have a look at http://simplehtmldom.sourceforge.net/

Then do something like:

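<?php

// PHP Simple HTML DOM Parser, from http://simplehtmldom.sourceforge.net/
include_once('simplehtmldom/simple_html_dom.php');

// file_get_html() returns false when the page cannot be loaded.
$html = file_get_html('http://www.google.com');
if ($html === false) {
    exit('Could not load the page.');
}

// ->plaintext is the text of the page without the markup.
$string = $html->plaintext;

// Split on runs of whitespace, commas and periods; -1 means "no limit".
$words = preg_split('/[\s,.]+/', $string, -1, PREG_SPLIT_NO_EMPTY);

var_dump(array_unique($words));

?>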
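
If pulling in an extra library is not an option, roughly the same thing can probably be done with PHP's bundled DOM extension. The following is only a sketch under a few assumptions: the DOM extension is enabled, the page has already been fetched into a string, and `unique_words` is a made-up helper name, not part of any library. It drops `<script>` and `<style>` elements, takes the remaining text, and lower-cases it (ASCII letters only) so that "Hello" and "hello" count once.

```php
<?php

// Hypothetical helper: unique words in the visible text of an HTML string.
function unique_words(string $html): array
{
    // loadHTML() rejects an empty string, so handle that case up front.
    if (trim($html) === '') {
        return [];
    }

    // Collect parser warnings internally; real-world markup is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Remove <script> and <style> so their contents are not counted as words.
    $xpath = new DOMXPath($doc);
    foreach (iterator_to_array($xpath->query('//script | //style')) as $node) {
        $node->parentNode->removeChild($node);
    }

    // textContent contains text nodes only, so tags and attributes are gone.
    $root = $doc->documentElement;
    $text = $root ? $root->textContent : '';

    // Split on anything that is not a letter, a digit or an apostrophe.
    $words = preg_split("/[^\p{L}\p{N}']+/u", $text, -1, PREG_SPLIT_NO_EMPTY);

    // Fold ASCII case, drop duplicates, and reindex from 0.
    return array_values(array_unique(array_map('strtolower', $words)));
}

$html = '<html><head><script>var x = 1;</script></head>'
      . '<body><p class="intro">Hello world, hello PHP.</p></body></html>';

print_r(unique_words($html));
```

For the sample string this gives hello, world and php. For a live page you would pass in something like the result of file_get_contents($url). Note that loadHTML() may guess the wrong encoding for pages with non-ASCII text and no charset declaration, so treat this as a starting point.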


Try gettext; this article should help you: http://mel.melaxis.com/devblog/2005/08/06/localizing-php-web-sites-using-gettext/
