Parsing HTML tags with PHP Simple HTML DOM Parser

Editor: About PHP Programming

I tried out PHP Simple HTML DOM Parser for parsing HTML pages and found it quite good. It builds a DOM tree that makes it easy to work with the contents of an HTML document, so it is well suited to scraping. An example article follows; you can also download the archive from SourceForge and look at the examples inside it.

Scraping data with PHP Simple HTML DOM Parser

PHP Simple HTML DOM Parser, written in PHP5+, allows you to manipulate HTML in a very easy way. Because it tolerates invalid HTML, this parser is better than other PHP scripts that use complicated regexes to extract information from web pages.

Before extracting anything, a DOM has to be created from either a URL or a file. The following script extracts links and images from a website:

    // Create DOM from URL or file
    $html = file_get_html('http://www.microsoft.com/');

    // Extract links
    foreach ($html->find('a') as $element)
        echo $element->href . '<br>';

    // Extract images
    foreach ($html->find('img') as $element)
        echo $element->src . '<br>';

The parser can also be used to modify HTML elements:

    // Create DOM from string
    $html = str_get_html('<div id="simple">Simple</div><div id="parser">Parser</div>');

    // Add a class to the second div (index 1)
    $html->find('div', 1)->class = 'bar';

    // Replace the inner text of the div whose id is "simple"
    $html->find('div[id=simple]', 0)->innertext = 'Foo';

    // Output: <div id="simple">Foo</div><div id="parser" class="bar">Parser</div>
    echo $html;

Do you wish to retrieve content without any tags?

    echo file_get_html('http://www.yahoo.com/')->plaintext;

In the package files of this parser (http://simplehtmldom.sourceforge.net/) you can find some scraping examples for digg, imdb and slashdot. Let's create one that extracts the first 10 results (titles only) for the keyword "php" from Google:

    $url = 'http://www.google.com/search?hl=en&q=php&btnG=Search';

    // Create DOM from URL
    $html = file_get_html($url);

    // Match all 'A' tags whose class attribute is equal to 'l'
    foreach ($html->find('a[class=l]') as $key => $info)
    {
        echo ($key + 1) . '. ' . $info->plaintext . "<br />\n";
    }

NOTE: Make sure to include the parser before using any of its functions:

    include 'simple_html_dom.php';

For more information on using the parser, consider checking the 'PHP Simple HTML DOM Parser' manual. To download the package files, use the following URL: [url]
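The snippets above call find() directly on whatever file_get_html() returns. To my understanding, file_get_html() and str_get_html() return false when the document cannot be loaded, and reading an attribute that an element does not have yields false rather than a string, so a small guard can help when scraping pages you do not control. The following is only a sketch under those assumptions: collect_attribute() is a hypothetical helper, not part of the library, and it only relies on the object it receives exposing the parser's find() method.

```php
<?php
/**
 * Hypothetical helper (not part of PHP Simple HTML DOM Parser).
 * Collects the values of one attribute from every element matching a selector.
 *
 * @param mixed  $dom       Result of file_get_html()/str_get_html(); assumed to be
 *                          false when loading failed.
 * @param string $selector  Selector passed straight to find(), e.g. 'a' or 'img'.
 * @param string $attribute Attribute to read from each match, e.g. 'href' or 'src'.
 * @return string[]         Non-empty attribute values, in the order find() returns them.
 */
function collect_attribute($dom, string $selector, string $attribute): array
{
    // Nothing to search if the document could not be loaded.
    if (!is_object($dom)) {
        return [];
    }

    $values = [];
    foreach ($dom->find($selector) as $element) {
        $value = $element->$attribute;
        // Skip elements where the attribute is absent or empty.
        if (is_string($value) && $value !== '') {
            $values[] = $value;
        }
    }
    return $values;
}
```

With simple_html_dom.php included, it could be called as collect_attribute(file_get_html('http://www.microsoft.com/'), 'a', 'href'); a failed download then produces an empty array instead of a fatal error on find().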
