
application goutte

A simple PHP Web Scraper

socloz/goutte

  • Tuesday, June 11, 2013
  • by noisebynorthwest

The README.md

Goutte, a simple PHP Web Scraper

Goutte is a screen scraping and web crawling library for PHP.

Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses.

Requirements

Goutte works with PHP 5.3.3 or later.

Installation

Installing Goutte is as easy as it can get. Download the Goutte.phar file and you're done!

Usage

Require the Goutte phar file to use Goutte in a script:

require_once '/path/to/goutte.phar';

Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\Client):

use Goutte\Client;

$client = new Client();

Make requests with the request() method:

$crawler = $client->request('GET', 'http://www.symfony-project.org/');

The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler).

Click on links:

$link = $crawler->selectLink('Plugins')->link();
$crawler = $client->click($link);

Submit forms:

$form = $crawler->selectButton('sign in')->form();
$crawler = $client->submit($form, array('signin[username]' => 'fabien', 'signin[password]' => 'xxxxxx'));

Extract data:

$nodes = $crawler->filter('.error_list');
if ($nodes->count())
{
  die(sprintf("Authentication error: %s\n", $nodes->text()));
}

printf("Nb tasks: %d\n", $crawler->filter('#nb_tasks')->text());
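
To experiment with the extraction step offline, without the phar or a network request, the same idea can be sketched with PHP's built-in DOM extension. This is not Goutte's API: extractTexts() is a hypothetical helper, the HTML string is made-up sample data, and the XPath expressions are hand-written counterparts of the .error_list and #nb_tasks CSS selectors used above.

```php
<?php
// Offline sketch using PHP's built-in DOM extension (not Goutte's API).

/**
 * Return the trimmed text of every node matched by an XPath query.
 * extractTexts() is a hypothetical helper, not part of Goutte.
 */
function extractTexts($html, $xpathQuery)
{
    $document = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($document);
    $texts = array();
    foreach ($xpath->query($xpathQuery) as $node) {
        $texts[] = trim($node->textContent);
    }

    return $texts;
}

// Made-up sample page containing both elements queried above.
$html = '<html><body>'
    . '<ul class="error_list"><li>Bad credentials</li></ul>'
    . '<span id="nb_tasks">12</span>'
    . '</body></html>';

// XPath counterpart of the CSS selector ".error_list".
$errors = extractTexts(
    $html,
    '//*[contains(concat(" ", normalize-space(@class), " "), " error_list ")]'
);
// XPath counterpart of the CSS selector "#nb_tasks".
$tasks = extractTexts($html, '//*[@id="nb_tasks"]');

if (count($errors) > 0) {
    printf("Authentication error: %s\n", $errors[0]);
}
if (count($tasks) > 0) {
    printf("Nb tasks: %d\n", $tasks[0]);
}
```

With Goutte itself, the Crawler's filter() method accepts the CSS selectors directly, so the hand-written XPath is only needed in this dependency-free sketch.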

More Information

Read the documentation of the BrowserKit and DomCrawler Symfony Components for more information about what you can do with Goutte.

Technical Information

Goutte is a thin wrapper around the following fine PHP libraries:

  • Symfony Components: BrowserKit, ClassLoader, CssSelector, DomCrawler, Finder, and Process

  • Guzzle

License

Goutte is licensed under the MIT license.

The Versions