SocialCrawler is a PHP library for retrieving image and video posts that contain specific hashtags from the most popular social networks. It currently supports the following social networks: Facebook, Instagram, Twitter, YouTube and Google+.
The recommended way to install SocialCrawler is through Composer.
    # Install Composer
    curl -sS https://getcomposer.org/installer | php
Then add SocialCrawler as a dependency in your composer.json file.

    "require": {
        "heavenconseil/socialcrawler": "dev-master"
    }
After installing, you need to require Composer's autoloader:

    require __DIR__ . '/vendor/autoload.php';

    <?php

    require __DIR__ . '/vendor/autoload.php';

    use SocialCrawler\Crawler;
    use SocialCrawler\Channel\Channel;

    $crawler = new Crawler(array(
        'channels' => array(
            'FacebookChannel' => array(
                'id' => 'YOUR_FACEBOOK_APP_ID',
                'secret' => 'YOUR_FACEBOOK_APP_SECRET',
                'token' => 'YOUR_FACEBOOK_ACCESS_TOKEN',
                'media' => Channel::MEDIA_IMAGES_VIDEOS,
                'since' => '1391425325'
            ),
            'InstagramChannel' => array(
                'id' => 'YOUR_INSTAGRAM_CLIENT_ID',
                'media' => Channel::MEDIA_VIDEOS
            ),
            'TwitterChannel' => array(
                'id' => 'YOUR_TWITTER_CONSUMER_KEY',
                'secret' => 'YOUR_TWITTER_CONSUMER_SECRET',
                'media' => Channel::MEDIA_IMAGES
            ),
            'YoutubeChannel' => array(
                'media' => Channel::MEDIA_IMAGES
            ),
            'GooglePlusChannel' => array(
                'id' => 'YOUR_GOOGLEPLUS_APP_ID',
                'media' => Channel::MEDIA_IMAGES
            )
        ),
        'log' => array(
            'path' => __DIR__,
            'level' => Crawler::LOG_VERBOSE
        )
    ));

    // Fetch a specific hashtag
    $data = $crawler->fetch('#hashtag');

    // Fetch multiple hashtags
    $data = $crawler->fetch(array('#hashtag1', '#hashtag2'));

    // Fetch content from a specific user
    // (Username for Twitter, User ID for Instagram, Google+ and Facebook)
    $data = $crawler->fetch('from:{user}');

    // Fetch user information with raw data
    $data = $crawler->fetch('user:{user}', true);
SocialCrawler can be initialized with an array of channels, each item containing at least an id property and identified by the name of the class that will handle the operations. The log options are optional. By default a socialcrawler.log file will be created in the SocialCrawler directory.
The output generated by SocialCrawler will look like this:

    {
        'FacebookChannel': {
            'new_since': '...',
            'data': [
                {
                    'id': '...',
                    'created_at': '...',
                    'description': '...',
                    'link': '...',
                    'author': {
                        'id': '...',
                        'avatar': '...',
                        'fullname': '...',
                        'username': '...' // Will return fullname if username is not set
                    },
                    'thumb': '...',
                    'source': '...',
                    'type': 'image|video|text',
                    'raw': { } // The raw data returned by the social network API
                }
            ]
        },
        'InstagramChannel': { },
        'TwitterChannel': { },
        'YoutubeChannel': { },
        'GooglePlusChannel': { }
    }
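As a rough illustration of consuming this structure, the sketch below groups item links by their type across all channels. It assumes the result is available as a nested PHP array with the keys shown above (if you receive a JSON string instead, decode it with json_decode($json, true) first). The helper name groupLinksByType and the sample data are hypothetical and not part of SocialCrawler.

```php
<?php

/**
 * Group the 'link' of every item by its 'type', across all channels.
 * Hypothetical helper: not part of SocialCrawler.
 */
function groupLinksByType(array $output)
{
    $grouped = array();

    foreach ($output as $channel) {
        // A channel that returned nothing may have no 'data' key at all.
        if (empty($channel['data'])) {
            continue;
        }

        foreach ($channel['data'] as $item) {
            $type = isset($item['type']) ? $item['type'] : 'unknown';
            $grouped[$type][] = $item['link'];
        }
    }

    return $grouped;
}

// Made-up sample in the documented shape, standing in for a real fetch() result.
$sample = array(
    'FacebookChannel' => array(
        'new_since' => '1391425325',
        'data' => array(
            array('id' => '1', 'link' => 'https://example.com/a', 'type' => 'image'),
            array('id' => '2', 'link' => 'https://example.com/b', 'type' => 'video'),
        ),
    ),
    'InstagramChannel' => array(),
    'TwitterChannel' => array(
        'data' => array(
            array('id' => '3', 'link' => 'https://example.com/c', 'type' => 'image'),
        ),
    ),
);

$links = groupLinksByType($sample);
```

Skipping channels without a data key keeps the loop safe for the empty channel entries shown in the output above.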
You can easily support more services (Tumblr, for example) by adding a new TumblrChannel that extends the abstract class Channel. It should have at least these public methods with the following signatures:
    __construct($applicationId, $applicationSecret = null, $applicationToken = null)
    fetch($query, $type, $since = null, $pIncludeRaw = false)
Your Channel can use the Guzzle HTTP framework, as it's already used by SocialCrawler.
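A minimal skeleton of such a channel might look like the sketch below. It only uses the two method signatures listed above; everything else is an assumption made for illustration. In particular, the abstract Channel class here is a local stand-in so the example runs on its own (a real channel would extend SocialCrawler's own abstract class instead), the return shape of fetch() is guessed from the output format documented earlier, and the string 'images' is a placeholder for a media constant such as Channel::MEDIA_IMAGES.

```php
<?php

// Stand-in for SocialCrawler's abstract Channel so this sketch is self-contained.
// Assumption: in a real project, remove this class and extend the library's Channel.
abstract class Channel
{
    abstract public function fetch($query, $type, $since = null, $pIncludeRaw = false);
}

class TumblrChannel extends Channel
{
    private $applicationId;
    private $applicationSecret;
    private $applicationToken;

    public function __construct($applicationId, $applicationSecret = null, $applicationToken = null)
    {
        $this->applicationId = $applicationId;
        $this->applicationSecret = $applicationSecret;
        $this->applicationToken = $applicationToken;
    }

    public function fetch($query, $type, $since = null, $pIncludeRaw = false)
    {
        // A real implementation would query the Tumblr API here (for example
        // with Guzzle) and map each post to the documented item fields:
        // id, created_at, description, link, author, thumb, source, type, raw.
        $items = array();

        // Assumed return shape, mirroring the per-channel output shown earlier.
        return array(
            'new_since' => $since,
            'data' => $items,
        );
    }
}

// Usage: 'images' is a placeholder for the library's media constant.
$channel = new TumblrChannel('YOUR_TUMBLR_APP_ID');
$result = $channel->fetch('#hashtag', 'images', '1391425325');
```

The stub returns an empty data list on purpose, so the class can be wired in and exercised before any API calls are written.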