php - Scrape Images, Links and Texts serially using Goutte


I have the code below, which tries to take the HTML elements one by one, serially, including the tag itself but without styles or classes. Also, I'm failing to get the images.

    use Goutte\Client;
    use Symfony\Component\DomCrawler\Crawler;

    $client = new Client();
    $crawler = $client->request('GET', 'http://www.tutorialspoint.com/laravel/laravel_ajax.htm');
    $crawler->filter('h1, h2, h3, h4, h5, h6, p, pre, p > img, div > img, p > a')->each(function (Crawler $node, $i) {
        if ($node->filter('p')) {
            echo $node->text()."<br/>";
        } else if ($node->filter('pre')) {
            echo '<code>'.$node->html().'</code><br/>';
        }
    });

But whatever I do, I get either only the text when I use $node->text(), or the HTML in the page when I use $node->html().

What I'm trying to get is, for example: p - <p>text here</p>, img - <img src="default.jp"/>.

The line $node->filter('p') always evaluates to true, since the value returned by filter() is a Crawler object, so the second else if is never reached.
If you want to check whether a crawler has any nodes in it, you can use the count() function.
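
The difference between testing the Crawler object itself and testing its count() can be sketched as follows. This is a minimal sketch, assuming symfony/dom-crawler and symfony/css-selector are installed through Composer; the inline HTML string and the autoload path are made up for illustration.

```php
<?php
// Assumption: Composer autoloader is available in the same directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Made-up document containing a <p> but no <pre>.
$crawler = new Crawler('<div><p>text here</p></div>');

$pre = $crawler->filter('pre');

// filter() returns a Crawler object even when nothing matched,
// and an object is truthy in an if () condition.
var_dump((bool) $pre);                     // bool(true)

// count() reports how many nodes actually matched.
var_dump($pre->count());                   // int(0)
var_dump($crawler->filter('p')->count());  // int(1)
```

So a condition such as `if ($node->filter('pre')->count() > 0)` distinguishes an empty result from a non-empty one.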

As for your code, I'm not sure what you are trying to do. The code checks whether the current element has a <p> child element (is that what you are trying to do?), and if it does, it prints the text content of the parent node.

To get the underlying DOMElement from the crawler ($node), you can use

    $node->getNode(0)

Using that node you can check its nodeName (the tag name), its textContent (the content of the tag), etc.
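
To see what those properties hold, here is a small sketch that uses PHP's built-in DOMDocument as a stand-in for the element that getNode(0) returns. The HTML string is made up for illustration and no Goutte request is involved.

```php
<?php
// Stand-in for the DOMElement that $node->getNode(0) would return;
// the HTML string below is an assumption, not taken from the scraped page.
$doc = new DOMDocument();
$doc->loadHTML('<p class="lead">text here</p><img class="pic" src="default.jp">');

$p   = $doc->getElementsByTagName('p')->item(0);
$img = $doc->getElementsByTagName('img')->item(0);

echo $p->nodeName, "\n";               // p
echo $p->textContent, "\n";            // text here
echo $img->nodeName, "\n";             // img
echo $img->getAttribute('src'), "\n";  // default.jp

// getAttribute() returns an empty string when the attribute is missing.
echo $img->getAttribute('alt') === '' ? "no alt\n" : "has alt\n"; // no alt
```

Note that property names such as nodeName and textContent are case-sensitive in PHP, unlike method names.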

Here is an example you can use:

    $crawler = $client->request('GET', 'http://www.tutorialspoint.com/laravel/laravel_ajax.htm');

    $crawler->filter('h1, h2, h3, h4, h5, h6, p, pre, p > img, div > img, p > a')->each(function (Crawler $node, $i) {
        // getNode(0) returns the underlying DOMElement of the current match
        if (in_array($node->getNode(0)->nodeName, ['h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p', 'a'])) {
            // Text-like elements: print the tag name and its text content
            echo "{$node->getNode(0)->nodeName} => {$node->getNode(0)->textContent}.<br/>\n";
        } elseif ($node->getNode(0)->nodeName == 'pre') {
            // Preformatted blocks: keep the inner HTML
            echo "pre => <code>".$node->html()."</code><br/>\n";
        } elseif ($node->getNode(0)->nodeName == 'img') {
            // Images: print only the src attribute
            echo 'img => src="'.$node->getNode(0)->getAttribute('src')."\" <br/>\n";
        }
    });
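
If the goal is output in the form asked for in the question (<p>text here</p>, <img src="default.jp"/>), one possible approach is to rebuild each tag from its name and text, keeping only the attributes you list. The sketch below is an illustration only: bareTag is a hypothetical helper, not part of Goutte or DomCrawler, and the HTML string is made up. Inside the each() callback you could pass it $node->getNode(0).

```php
<?php
// Hypothetical helper (not part of Goutte or DomCrawler): rebuilds an element
// as a bare tag, keeping only the attributes named in $keep.
function bareTag(DOMElement $el, array $keep = []): string
{
    $attrs = '';
    foreach ($keep as $name) {
        if ($el->hasAttribute($name)) {
            $attrs .= sprintf(' %s="%s"', $name, htmlspecialchars($el->getAttribute($name)));
        }
    }

    // img has no content, so emit a self-closing tag
    if ($el->nodeName === 'img') {
        return "<img{$attrs}/>";
    }

    // Other elements: tag name, kept attributes, escaped text content
    return sprintf('<%1$s%2$s>%3$s</%1$s>', $el->nodeName, $attrs, htmlspecialchars($el->textContent));
}

// Made-up markup with classes and styles that should be dropped.
$doc = new DOMDocument();
$doc->loadHTML('<p class="lead" style="color:red">text here</p><img class="pic" src="default.jp">');

echo bareTag($doc->getElementsByTagName('p')->item(0)), "\n";            // <p>text here</p>
echo bareTag($doc->getElementsByTagName('img')->item(0), ['src']), "\n"; // <img src="default.jp"/>
```

This flattens nested markup to plain text, so it suits headings and simple paragraphs better than elements whose inner structure matters.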
