
library crawler

Library to extract meaningful information out of a webpage

innmind/crawler

  • Friday, February 16, 2018
  • by Baptouuuu
  • Repository
  • 1 Watchers
  • 1 Stars
  • 270 Installations
  • HTML
  • 2 Dependents
  • 0 Suggesters
  • 0 Forks
  • 0 Open issues
  • 21 Versions
  • 16 % Grown

The README.md

Crawler

This tool lets you extract a lot of useful information from a web page (be it HTML, an image, or anything else).

Installation

composer require innmind/crawler

Usage

use function Innmind\Crawler\bootstrap;
use Innmind\OperatingSystem\Factory;
use Innmind\UrlResolver\UrlResolver;
use Innmind\Url\Url;
use Innmind\Http\{
    Message\Request\Request,
    Message\Method\Method,
    ProtocolVersion,
};
use function Innmind\Html\bootstrap as reader;

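// Build the operating system abstraction that provides the HTTP client and the clock
$os = Factory::build();

// Assemble the crawler from the HTTP transport, the clock, an HTML reader and a URL resolver
$crawl = bootstrap(
    $os->remote()->http(),
    $os->clock(),
    reader(),
    new UrlResolver,
);

// Crawl the page by sending a GET request over HTTP/2
$resource = $crawl(
    new Request(
        Url::of('https://en.wikipedia.org/wiki/H2g2'),
        new Method('GET'),
        new ProtocolVersion(2, 0),
    ),
);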

Here $resource is an instance of HttpResource.

The Versions