Asynchronous cURL executor simply based on resource and Generator.
| PHP | :question: | Feature Restriction |
| --- | --- | --- |
| 7.0~ | :smile: | Full Support |
| 5.5~5.6 | :anguished: | Generator is not so cool |
| ~5.4 | :boom: | Incompatible |
function curl_init_with(string $url, array $options = [])
{
    $ch = curl_init();
    $options = array_replace([
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true,
    ], $options);
    curl_setopt_array($ch, $options);
    return $ch;
}

function get_xpath_async(string $url) : \Generator
{
    $dom = new \DOMDocument;
    @$dom->loadHTML(yield curl_init_with($url));
    return new \DOMXPath($dom);
}

var_dump(Co::wait([
    'Delay 5 secs' => function () {
        echo "[Delay] I start to have a pseudo-sleep in this coroutine for about 5 secs\n";
        for ($i = 0; $i < 5; ++$i) {
            yield Co::DELAY => 1;
            if ($i < 4) {
                printf("[Delay] %s\n", str_repeat('.', $i + 1));
            }
        }
        echo "[Delay] Done!\n";
    },
    "google.com HTML" => curl_init_with("https://google.com"),
    "Content-Length of github.com" => function () {
        echo "[GitHub] I start to request for github.com to calculate Content-Length\n";
        $content = yield curl_init_with("https://github.com");
        echo "[GitHub] Done! Now I calculate length of contents\n";
        return strlen($content);
    },
    "Save mpyw's Gravatar Image URL to local" => function () {
        echo "[Gravatar] I start to request for github.com to get Gravatar URL\n";
        $src = (yield get_xpath_async('https://github.com/mpyw'))
            ->evaluate('string(//img[contains(@class,"avatar")]/@src)');
        echo "[Gravatar] Done! Now I download its data\n";
        yield curl_init_with($src, [CURLOPT_FILE => fopen('/tmp/mpyw.png', 'wb')]);
        echo "[Gravatar] Done! Saved as /tmp/mpyw.png\n";
    }
]));
The requests are executed in parallel as far as possible :smile:
Note that there is only 1 process and 1 thread.
[Delay] I start to have a pseudo-sleep in this coroutine for about 5 secs
[GitHub] I start to request for github.com to calculate Content-Length
[Gravatar] I start to request for github.com to get Gravatar URL
[Delay] .
[Delay] ..
[GitHub] Done! Now I calculate length of contents
[Gravatar] Done! Now I download its data
[Delay] ...
[Gravatar] Done! Saved as /tmp/mpyw.png
[Delay] ....
[Delay] Done!
array(4) {
["Delay 5 secs"]=>
NULL
["google.com HTML"]=>
string(262) "<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="https://www.google.co.jp/?gfe_rd=cr&ei=XXXXXX">here</A>.
</BODY></HTML>
"
["Content-Length of github.com"]=>
int(25534)
["Save mpyw's Gravatar Image URL to local"]=>
NULL
}
Installing
Install via Composer.
composer require mpyw/co:^1.5
Then require the Composer autoloader in your scripts.
require __DIR__ . '/vendor/autoload.php';
use mpyw\Co\Co;
use mpyw\Co\CURLException;
API
Co::wait()
Wait for all the cURL requests to complete.
The options will override static defaults.
static Co::wait(mixed $value, array $options = []) : mixed
Arguments
- (mixed) __$value__
  Any values to be resolved in parallel.
- (array<string, mixed>) __$options__
  Associative array of options.
| Key | Default | Description |
| --- | --- | --- |
| throw | true | Whether to throw or capture CURLException or RuntimeException on top-level. |
| pipeline | false | Whether to use HTTP/1.1 pipelining. At most 5 requests for the same destination are bundled into a single TCP connection. |
| multiplex | true | Whether to use HTTP/2 multiplexing. All requests for the same destination are bundled into a single TCP connection. |
| autoschedule | false | Whether to use automatic scheduling by CURLMOPT_MAX_TOTAL_CONNECTIONS. |
| interval | 0.002 | curl_multi_select() timeout in seconds. 0 means real-time observation. |
| concurrency | 6 | Limit of concurrent TCP connections. 0 means unlimited. The value should be at most 10. |
- Throwables that do not extend RuntimeException, such as Error, Exception, and LogicException, are not captured. If you need to capture them, you have to write your own try-catch blocks in your functions.
- HTTP/1.1 pipelining can be used only if the TCP connection is already established and verified to use a keep-alive session. This means that the first bundle of HTTP/1.1 requests CANNOT be pipelined. You can use it from the second yield in a Co::wait() call.
- To use HTTP/2 multiplexing, you have to build PHP with libcurl 7.43.0+ and --with-nghttp2.
- To use autoschedule, PHP 7.0.7 or later is required.
When autoschedule is disabled:
- The curl_multi_add_handle() call can be delayed.
- Concurrency control with pipeline / multiplex CANNOT be correctly driven. You should set a higher concurrency in those cases.
When autoschedule is enabled:
- curl_multi_add_handle() is always called immediately.
- CURLINFO_TOTAL_TIME CANNOT be correctly calculated. "Total Time" includes the time spent waiting for other requests to finish.
The details of the CURLINFO_*_TIME timing charts are described at the bottom of this page.
Return Value
(mixed)
Resolved values; in exception-safe context, they may contain:
- CURLException which has been raised internally.
- RuntimeException which has been raised by the user.
Exception
- Throws CURLException or RuntimeException in exception-unsafe context.
Co::async()
Execute cURL requests along with a Co::wait() call, without waiting for resolved values.
The options are inherited from Co::wait().
This method is mainly expected to be used:
- When you are not interested in responses.
- In CURLOPT_WRITEFUNCTION or CURLOPT_HEADERFUNCTION callbacks.
static Co::async(mixed $value, mixed $throw = null) : null
Arguments
- (mixed) __$value__
  Any values to be resolved in parallel.
- (mixed) __$throw__
  Overrides throw in Co::wait() options when true or false is passed.
Return Value
(null)
Exception
- CURLException or RuntimeException can be thrown in exception-unsafe context.
  Note that you CANNOT capture top-level exceptions unless you catch them outside of the Co::wait() call.
Co::isRunning()
Returns whether Co::wait() is running.
With this check, you can safely call Co::wait() or Co::async().
static Co::isRunning() : bool
Co::any()
Co::race()
Co::all()
Return a Generator that resolves with a specific value.
static Co::any(array $value) : \Generator<mixed>
static Co::race(array $value) : \Generator<mixed>
static Co::all(array $value) : \Generator<mixed>
| Family | Return Value | Exception |
| --- | --- | --- |
| Co::any() | First Success | AllFailedException |
| Co::race() | First Success | First Failure |
- Jobs CANNOT be canceled.
  Incomplete jobs remain even if Co::any() or Co::race() is resolved.
- Co::all(...) is just a wrapper of (function () { return yield ...; })().
  It should only be used with Co::race() or Co::any().
Co::wait(function () {
    $group1 = Co::all([$ch1, $ch2, $ch3]);
    $group2 = Co::all([$ch4, $ch5, $ch6]);
    $group1or2 = Co::any([$group1, $group2]);
    var_dump(yield $group1or2);
});
Co::setDefaultOptions()
Co::getDefaultOptions()
Overrides/gets static default settings.
static Co::setDefaultOptions(array $options) : null
static Co::getDefaultOptions() : array
Rules
Conversion on Resolving
All yielded/returned values are resolved by the following rules.
Yielded values are also sent back to the Generator after being resolved.
The rules are applied recursively.
| Before | After |
| --- | --- |
| cURL resource | curl_multi_getcontent() result or CURLException |
| Array | Array (with resolved children) or RuntimeException |
| Generator Closure, Generator | Return value (after all yields done) or RuntimeException |
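The rules can be modelled with a small synchronous sketch. `resolve_sketch()` is a hypothetical helper, not part of Co: plain strings stand in for the response bodies that real cURL handles would resolve to, and exception handling and concurrency are left out. It needs PHP 7.0+ because it reads the Generator return value.

```php
<?php
// Hypothetical, synchronous model of the resolving rules; not Co's real code.
function resolve_sketch($value)
{
    if ($value instanceof \Closure) {
        // A Closure that contains yield returns a Generator when called.
        $value = $value();
    }
    if ($value instanceof \Generator) {
        while ($value->valid()) {
            // Resolve what was yielded, then send the result back in.
            $value->send(resolve_sketch($value->current()));
        }
        // The return value is resolved by the same rules.
        return resolve_sketch($value->getReturn());
    }
    if (is_array($value)) {
        // Children are resolved recursively; keys are preserved.
        return array_map('resolve_sketch', $value);
    }
    // Stand-in for a cURL handle being replaced by its response body.
    return $value;
}

$result = resolve_sketch([
    'plain' => 'body-A',
    'coroutine' => function () {
        $first = yield 'body-B';
        $pair = yield ['body-C', 'body-D'];
        return strlen($first) + count($pair);
    },
]);
var_dump($result);
```

Here `$result['coroutine']` is 8: the string sent back for the first yield has 6 characters and the array sent back for the second yield has 2 elements.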
"Generator Closure" means Closure that contains yield keywords.
Exception-safe or Exception-unsafe Priority
Context in Generator
Exception-unsafe context by default.
The following yield statement specifies exception-safe context.
$results = yield Co::SAFE => [$ch1, $ch2];
This is equivalent to:
$results = yield [
    function () use ($ch1) {
        try {
            return yield $ch1;
        } catch (\RuntimeException $e) {
            return $e;
        }
    },
    function () use ($ch2) {
        try {
            return yield $ch2;
        } catch (\RuntimeException $e) {
            return $e;
        }
    },
];
Context on Co::wait()
Exception-unsafe context by default.
The following setting specifies exception-safe context.
$results = Co::wait([$ch1, $ch2], ['throw' => false]);
This is equivalent to:
$results = Co::wait([
    function () use ($ch1) {
        try {
            return yield $ch1;
        } catch (\RuntimeException $e) {
            return $e;
        }
    },
    function () use ($ch2) {
        try {
            return yield $ch2;
        } catch (\RuntimeException $e) {
            return $e;
        }
    },
]);
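In exception-safe context, failures arrive as ordinary values, so the caller has to tell them apart from successful results. A minimal sketch of one way to do that follows. `partition_results_sketch()` is a hypothetical helper, and the `$results` array is a hand-written stand-in for what `Co::wait()` might return with `'throw' => false`. It assumes, as the equivalent code above suggests, that every captured failure is a `\RuntimeException`, and it only handles a flat array.

```php
<?php
// Hypothetical helper: separate resolved values from captured exceptions.
function partition_results_sketch(array $results): array
{
    $succeeded = [];
    $failed = [];
    foreach ($results as $key => $result) {
        if ($result instanceof \RuntimeException) {
            $failed[$key] = $result->getMessage();
        } else {
            $succeeded[$key] = $result;
        }
    }
    return [$succeeded, $failed];
}

// Stand-in for: $results = Co::wait([...], ['throw' => false]);
$results = [
    'ok' => '<html>...</html>',
    'ng' => new \RuntimeException('Could not resolve host'),
];

list($succeeded, $failed) = partition_results_sketch($results);
var_dump($succeeded, $failed);
```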
Context on Co::async()
Contexts are inherited from Co::wait().
The following setting overrides the parent context as exception-safe.
Co::async($value, false);
The following setting overrides the parent context as exception-unsafe.
Co::async($value, true);
Pseudo-sleep for Each Coroutine
The following yield statements delay the coroutine processing:
yield Co::DELAY => $seconds
yield Co::SLEEP => $seconds # Alias
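To illustrate why such a delay pauses only the coroutine that asked for it, here is a rough model using a virtual clock. `run_with_delays_sketch()` and `$ticker` are hypothetical names, a bare `yield $seconds` stands in for `yield Co::DELAY => $seconds`, and no real time passes. This is a sketch of the scheduling idea only, not of how Co implements it.

```php
<?php
// Hypothetical virtual-clock scheduler. It only models the idea of a
// per-coroutine pseudo-sleep; it is not Co's actual event loop.
function run_with_delays_sketch(array $coroutines): array
{
    $log = [];
    $wakeAt = [];
    foreach ($coroutines as $name => $gen) {
        if ($gen->valid()) {
            $wakeAt[$name] = (float) $gen->current(); // First requested delay.
        }
    }
    while ($wakeAt) {
        asort($wakeAt);   // The earliest wake-up time comes first.
        reset($wakeAt);
        $name = key($wakeAt);
        $now = $wakeAt[$name];
        $gen = $coroutines[$name];
        $gen->next();     // Resume only the coroutine that is due.
        if ($gen->valid()) {
            $log[] = "$name resumed at $now";
            $wakeAt[$name] = $now + (float) $gen->current();
        } else {
            $log[] = "$name finished at $now";
            unset($wakeAt[$name]);
        }
    }
    return $log;
}

$ticker = function (array $delays) {
    foreach ($delays as $seconds) {
        yield $seconds; // Stand-in for: yield Co::DELAY => $seconds
    }
};

$log = run_with_delays_sketch([
    'A' => $ticker([1, 1]),
    'B' => $ticker([1.5]),
]);
print_r($log);
```

Coroutine B finishes at 1.5 virtual seconds, between the two wake-ups of A, so neither delay blocks the other.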
Comparison with Generators of PHP7.0+ or PHP5.5~5.6
return Statements
PHP 7.0+:
yield $foo;
yield $bar;
return $baz;
PHP 5.5~5.6:
yield $foo;
yield $bar;
yield Co::RETURN_WITH => $baz;
Although the experimental aliases Co::RETURN_, Co::RET, and Co::RTN are provided,
Co::RETURN_WITH is recommended in terms of readability.
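The idea behind a keyed yield acting as a return value can be sketched with a tiny driver. `RETURN_KEY_SKETCH` and `run_to_result_sketch()` are hypothetical stand-ins, not Co's constants or internals, and the sketch itself runs on PHP 7.0+ so that both styles can be compared side by side.

```php
<?php
// Hypothetical stand-in for Co::RETURN_WITH.
const RETURN_KEY_SKETCH = '__return_with__';

function run_to_result_sketch(\Generator $gen)
{
    while ($gen->valid()) {
        if ($gen->key() === RETURN_KEY_SKETCH) {
            // Emulated return: take the value and stop driving the generator.
            return $gen->current();
        }
        // Echo the yielded value back, then continue to the next yield.
        $gen->send($gen->current());
    }
    // Native return value (PHP 7.0+).
    return $gen->getReturn();
}

$native = (function () {
    yield 'foo';
    yield 'bar';
    return 'baz';
})();

$emulated = (function () {
    yield 'foo';
    yield 'bar';
    yield RETURN_KEY_SKETCH => 'baz';
})();

$fromNative = run_to_result_sketch($native);
$fromEmulated = run_to_result_sketch($emulated);
var_dump($fromNative, $fromEmulated); // Both are string(3) "baz"
```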
yield Statements with Assignment
PHP 7.0+:
$a = yield $foo;
echo yield $bar;
PHP 5.5~5.6:
$a = (yield $foo);
echo (yield $bar);
finally Statements
Be careful that return triggers finally while yield Co::RETURN_WITH => does not.
try {
    return '...';
} finally {
    // Reachable
}

try {
    yield Co::RETURN_WITH => '...';
} finally {
    // Unreachable
}
Appendix
Timing Charts
Note that S is equal to Q when autoschedule is disabled.
Basic
| ID | When |
| --- | --- |
| Q | curl_multi_exec() called immediately after curl_multi_add_handle() |
| S | Processing actually started |
| DNS | DNS resolution completed |
| TCP | TCP connection established |
| TLS | TLS/SSL session established |
| HS | All HTTP request headers sent |
| BS | Whole HTTP request body sent |
| HR | All HTTP response headers received |
| BR | Whole HTTP response body received |
| Constant | Time |
| --- | --- |
| CURLINFO_NAMELOOKUP_TIME | DNS - S |
| CURLINFO_CONNECT_TIME | TCP - S |
| CURLINFO_APPCONNECT_TIME | TLS - S |
| CURLINFO_PRETRANSFER_TIME | HS - S |
| CURLINFO_STARTTRANSFER_TIME | HR - S |
| CURLINFO_TOTAL_TIME | BR - Q |
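Because every value in the table above is cumulative, the time spent in each individual phase is the difference between neighbouring entries. A minimal sketch follows, assuming an HTTPS request without redirects and S equal to Q (autoschedule disabled). `phase_durations_sketch()`, its short array keys, and the sample numbers are all made up for illustration; in practice the values would come from `curl_getinfo()` with the constants listed above.

```php
<?php
// Hypothetical helper: turn cumulative cURL timings into per-phase durations.
// Assumes an HTTPS request without redirects, and S equal to Q.
function phase_durations_sketch(array $t): array
{
    return [
        'dns'      => $t['namelookup'],                        // DNS - S
        'tcp'      => $t['connect'] - $t['namelookup'],        // TCP - DNS
        'tls'      => $t['appconnect'] - $t['connect'],        // TLS - TCP
        'request'  => $t['pretransfer'] - $t['appconnect'],    // HS - TLS
        'waiting'  => $t['starttransfer'] - $t['pretransfer'], // HR - HS
        'download' => $t['total'] - $t['starttransfer'],       // BR - HR
    ];
}

// Made-up sample values in seconds, keyed by hypothetical short names.
$timings = [
    'namelookup'    => 0.010,
    'connect'       => 0.030,
    'appconnect'    => 0.080,
    'pretransfer'   => 0.081,
    'starttransfer' => 0.150,
    'total'         => 0.200,
];

$phases = phase_durations_sketch($timings);
print_r($phases);
```

The phase durations add up to the total time only under the stated assumption; with autoschedule enabled, the total is measured from Q rather than S and also includes queueing time.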
With Redirections by CURLOPT_FOLLOWLOCATION
| ID | When |
| --- | --- |
| Q | curl_multi_exec() called immediately after curl_multi_add_handle() |
| S | Processing actually started |
| DNS(1) | DNS resolution completed |
| TCP(1) | TCP connection established |
| TLS(1) | TLS/SSL session established |
| HS(1) | All HTTP request headers sent |
| BS(1) | Whole HTTP request body sent |
| HR(1) | All HTTP response headers received |
| DNS(2) | DNS resolution completed |
| TCP(2) | TCP connection established |
| TLS(2) | TLS/SSL session established |
| HS(2) | All HTTP request headers sent |
| BS(2) | Whole HTTP request body sent |
| HR(2) | All HTTP response headers received |
| BR(2) | Whole HTTP response body received |
| Constant | Time |
| --- | --- |
| CURLINFO_REDIRECT_TIME | HR(1) - Q |
| CURLINFO_NAMELOOKUP_TIME | DNS(2) - HR(1) |
| CURLINFO_CONNECT_TIME | TCP(2) - HR(1) |
| CURLINFO_APPCONNECT_TIME | TLS(2) - HR(1) |
| CURLINFO_PRETRANSFER_TIME | HS(2) - HR(1) |
| CURLINFO_STARTTRANSFER_TIME | HR(2) - HR(1) |
| CURLINFO_TOTAL_TIME | BR(2) - Q |