Web services often implement pagination in their responses which requires the end-user to issue a series of consecutive requests in order to fetch all of the data they asked for. Users of your web service client should not be responsible for implementing the logic involved in iterating through pages of results. Guzzle provides a simple resource iterator foundation to make it easier on web service client developers to offer a useful abstraction layer.
ResourceIteratorInterface GuzzleServiceClient::getIterator($command [, array $commandOptions, array $iteratorOptions ])
The getIterator
method of a Guzzle\Service\ClientInterface
object provides a convenient interface for
instantiating a resource iterator for a specific command. This method implicitly uses a
Guzzle\Service\Resource\ResourceIteratorFactoryInterface
object to create resource iterators. Pass an
instantiated command object or the name of a command in the first argument. When passing the name of a command, the
command factory of the client will create the command by name using the $commandOptions
array. The third argument
may be used to pass an array of options to the constructor of the instantiated ResourceIteratorInterface
object.
$iterator = $client->getIterator('get_users');
foreach ($iterator as $user) {
echo $user['name'] . ' age ' . $user['age'] . PHP_EOL;
}
The above code sample might execute a single request or a thousand requests. As a consumer of a web service, I don't care. I just want to iterate over all of the users.
The two universal options that iterators should support are limit
and page_size
. Using the limit
option
tells the resource iterator to attempt to limit the total number of iterated resources to a specific amount. Keep in
mind that this is not always possible due to limitations that may be inherent to a web service. The page_size
option is used to tell a resource iterator how many resources to request per page of results. Much like the limit
option, you can not rely on getting back exactly the number of resources your specify in the page_size
option.
Note
The limit
and page_size
options can also be specified on an iterator using the setLimit($limit)
and
setPageSize($pageSize)
methods.
The default resource iterator factory of a client object expects that your iterators are stored under the Model
folder of your client and that an iterator is names after the CamelCase name of a command followed by the word
"Iterator". For example, if you wanted to create an iterator for the get_users
command, then your iterator class
would be Model\GetUsersIterator
and would be stored in Model/GetUsersIterator.php
.
While not required, resource iterators in Guzzle typically iterate using a Guzzle\Service\Command\CommandInterface
object. Guzzle\Service\Resource\ResourceIterator
, the default iterator implementation that you should extend,
accepts a command object and array of iterator options in its constructor. The command object passed to the resource
iterator is expected to be ready to execute and not previously executed. The resource iterator keeps a reference of
this command and clones the original command each time a subsequent request needs to be made to fetch more data.
The most important thing (and usually the only thing) you need to do when creating a resource iterator is to implement
the sendRequest()
method of the resource iterator. The sendRequest()
method is called when you begin
iterating or if there are no resources left to iterate and it you expect to retrieve more resources by making a
subsequent request. The $this->command
property of the resource iterator is updated with a cloned copy of the
original command object passed into the constructor of the iterator. Use this command object to issue your subsequent
requests.
The sendRequest()
method must return an array of the resources you retrieved from making the subsequent call.
Returning an empty array will stop the iteration. If you suspect that your web service client will occasionally return
an empty result set but still requires further iteration, then you must implement a sort of loop in your
sendRequest()
method that will continue to issue subsequent requests until your reach the end of the paginated
result set or until additional resources are retrieved from the web service.
Beyond fetching more results, the sendRequest()
method is responsible for updating the $this->nextToken
property of the iterator. Setting this property to anything other than null tells the iterator that issuing a
subsequent request using the nextToken value will probably return more results. You must continually update this
value in your sendRequest()
method as each response is received from the web service.
Let's say you want to implement a resource iterator for the get_users
command of your web service. The
get_users
command receives a response that contains a list of users, and if there are more pages of results to
retrieve, returns a value called next_user
. This return value is known as the next token and should be used to
issue subsequent requests.
Assume the response to a get_users
command returns JSON data that looks like this:
{
"users": [
{ "name": "Craig Johnson", "age": 10 },
{ "name": "Tom Barker", "age": 20 },
{ "name": "Bob Mitchell", "age": 74 }
],
"next_user": "Michael Dowling"
}
Assume that because there is a next_user
value, there will be more users if a subsequent request is issued. If the
next_user
value is missing or null, then we know there are no more results to fetch. Let's implement a resource
iterator for this command.
namespace MyService\Model;
use Guzzle\Service\Resource\ResourceIterator;
/**
* Iterate over a get_users command
*/
class GetUsersIterator extends ResourceIterator
{
protected function sendRequest()
{
// If a next token is set, then add it to the command
if ($this->nextToken) {
$this->command->set('next_user', $this->nextToken);
}
// Execute the command and parse the result
$result = $this->command->execute();
// Parse the next token
$this->nextToken = isset($result['next_user']) ? $result['next_user'] : false;
return $result['users'];
}
}
As you can see, it's pretty simple to implement an iterator. There are a few things that you should notice from this example:
sendRequest()
method. A new command object is cloned from the
original command passed into the constructor of the iterator before the sendRequest()
method is called.
Remember that the resource iterator expects a command that has not been executed.sendRequest()
method is first called, you will not have a $this->nextToken
value, so always check
before setting it on a command. Notice that the next token is being updated each time a request is sent.