Friday, March 8, 2013

Benchmarking Composer autoloading, Symfony and eZ Publish 5

Context

One fine day, I decided to benchmark how fast can eZ Publish 5, and the underlying stack, serve requests without using a true reverse proxy, in which case I would benchmark the reverse proxy and not the PHP implementation that lies behind the product.
What I did in the first place, is installing a fresh version of eZ Publish 5 directly from GitHub (commit ID: 3af1f1ad78264f9b36eb250a14ce4301d556e7ab), setting up a site based on ezdemo and take some base results:
$ sudo ab -k -n 1000 http://ezpublish-community/eng/
[...]
Requests per second:    31.38 [#/sec] (mean)
Time per request:       31.870 [ms] (mean)
Time per request:       31.870 [ms] (mean, across all concurrent requests)
Transfer rate:          1181.94 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:    30   32   2.7     32     109
Waiting:       28   30   2.6     30     106
Total:         30   32   2.7     32     109
About 32 reqs/second is what I can expect from a stock eZ Publish 5 installation of ezdemo.
The first thing to speed this up is to activate the content view caching using TTL, this is done by adding:
eng:
    content:
        view_cache: true
        ttl_cache: true
        default_ttl: 30
under the configuration of the siteaccess (here: "eng") in the ezpublish/config/ezpublish.yml file.
After I cleaned the cache manually ($ rm -rf ezpublish/cache/prod/*), I ran the benchmark again:
$ sudo ab -k -n 1000 http://ezpublish-community/eng/
[...]
Requests per second:    213.50 [#/sec] (mean)
Time per request:       4.684 [ms] (mean)
Time per request:       4.684 [ms] (mean, across all concurrent requests)
Transfer rate:          8030.29 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     4    5   0.2      5       7
Waiting:        4    4   0.2      4       6
Total:          4    5   0.2      5       7
This is of course a very impressive win since it allows caching the content of the whole page without having to connect to the database. Again, a real proxy server would outperform this by far, but this isn't the goal of this article which focus more on the PHP parts.
213 reqs/second is certainly not bad, but let's analyze where the time is spent to see if this can be improved further. To diagnose where the possible bottlenecks are, I am going to use xhprof, that gives me the following numbers:
Total Incl. Wall Time (µs): 6,510
Number of Function Calls: 2,700
Function Name Calls Calls% Incl. Wall Time (µs) IWall% Excl. Wall Time (µs) EWall%
load::prod/classes.php 1 0.0% 518 8.0% 518 8.0%
Composer\Autoload\ClassLoader::findFile 17 0.6% 389 6.0% 308 4.7%
Symfony\Component\HttpFoundation\ResponseHeaderBag::set 41 1.5% 630 9.7% 263 4.0%
Symfony\Component\HttpFoundation\HeaderBag::set 59 2.2% 303 4.7% 237 3.6%
Symfony\Component\HttpFoundation\ServerBag::getHeaders 4 0.1% 231 3.5% 230 3.5%
Symfony\Component\HttpFoundation\HeaderBag::get 48 1.8% 194 3.0% 194 3.0%
run_init::composer/autoload_classmap.php 1 0.0% 157 2.4% 157 2.4%
run_init::prod/classes.php 1 0.0% 136 2.1% 130 2.0%
Symfony\Component\HttpFoundation\Request::normalizeQueryString 12 0.4% 127 2.0% 120 1.8%
Symfony\Component\HttpFoundation\HeaderBag::has 45 1.7% 119 1.8% 119 1.8%
Symfony\Component\HttpFoundation\ParameterBag::get 92 3.4% 114 1.8% 114 1.8%
main() 1 0.0% 6,51 100.0% 112 1.7%
Symfony\Component\HttpFoundation\Request::prepareBaseUrl 4 0.1% 462 7.1% 111 1.7%
file_get_contents 6 0.2% 89 1.4% 89 1.4%
Symfony\Component\HttpFoundation\Response::prepare 4 0.1% 154 2.4% 87 1.3%
DateTime::__construct 12 0.4% 84 1.3% 84 1.3%
Symfony\Component\HttpFoundation\HeaderBag::__construct 8 0.3% 635 9.8% 81 1.2%
ComposerAutoloaderInit5b5f57c871b8c5b6e57138c3683a3315::getLoader 1 0.0% 399 6.1% 81 1.2%
Symfony\Component\HttpFoundation\Response::sendContent 1 0.0% 80 1.2% 80 1.2%
Symfony\Component\HttpFoundation\HeaderBag::parseCacheControl 12 0.4% 127 2.0% 70 1.1%
file_exists 17 0.6% 70 1.1% 70 1.1%
Symfony\Component\HttpFoundation\Request::create 3 0.1% 420 6.5% 69 1.1%
Symfony\Component\HttpKernel\HttpCache\Store::lookup 4 0.1% 1,198 18.4% 66 1.0%
DateTime::createFromFormat 8 0.3% 61 0.9% 61 0.9%
Symfony\Component\HttpFoundation\Response::getAge 11 0.4% 343 5.3% 61 0.9%
preg_match_all 12 0.4% 57 0.9% 57 0.9%
Symfony\Component\HttpKernel\HttpCache\HttpCache::handle@1 2 0.1% 2,351 36.1% 56 0.9%
Symfony\Component\HttpFoundation\Request::isSecure 16 0.6% 87 1.3% 55 0.8%
Symfony\Component\HttpFoundation\Request::prepareRequestUri 4 0.1% 250 3.8% 53 0.8%
Symfony\Component\ClassLoader\ClassCollectionLoader::load 1 0.0% 827 12.7% 52 0.8%
Symfony\Component\HttpFoundation\Request::initialize 4 0.1% 398 6.1% 52 0.8%
get_declared_classes 1 0.0% 51 0.8% 51 0.8%
Symfony\Component\HttpFoundation\Request::getHost 8 0.3% 112 1.7% 51 0.8%
Note that the initial benchmark was done without xhprof loaded so that the performances are optimal. This is why you can see request time about 6ms with xhprof, while the average is in reality about 4.684ms. This is due to xhprof's own overhead.
The first thing to note is that the most time consuming operation (based on "Excl. Wall Time" which doesn't take into account the time spent in children calls) is: load::prod/classes.php with about 8%. This file is supposed to contain the most frequently used classes combined in one file to minimize stats calls when a PHP bytecode cache mechanism is used like APC or Zend Optimizer+. This is triggered by: $kernel->loadClassCache(); in web/index.php file.
Removing that call and benchmarking the page again, here is the result I get:
$ sudo ab -k -n 1000 http://ezpublish-community/eng/
[...]
Requests per second:    254.53 [#/sec] (mean)
Time per request:       3.929 [ms] (mean)
Time per request:       3.929 [ms] (mean, across all concurrent requests)
Transfer rate:          9573.73 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     4    4   0.2      4       5
Waiting:        3    4   0.2      4       5
Total:          4    4   0.2      4       5
The improvement from 213.50 to 254.53 requests/sec. by removing the call to loadClassCache() means that by including this file we are loading way more classes than what is necessary to serve a cached page. This can probably be tuned, but since we just have better performance by omitting this optimization, let's just continue without it.
Running xhprof again, those are the new profiling numbers.
Total Incl. Wall Time (µs): 6,061
Number of Function Calls: 2,784
Function Name Calls Calls% Incl. Wall Time (µs) IWall% Excl. Wall Time (µs) EWall%
Composer\Autoload\ClassLoader::findFile 23 0.8% 468 7.7% 394 6.5%
Symfony\Component\HttpFoundation\ResponseHeaderBag::set 41 1.5% 647 10.7% 261 4.3%
Symfony\Component\HttpFoundation\HeaderBag::set 59 2.1% 310 5.1% 245 4.0%
Symfony\Component\HttpFoundation\ServerBag::getHeaders 4 0.1% 239 3.9% 238 3.9%
Symfony\Component\HttpFoundation\HeaderBag::get 48 1.7% 208 3.4% 200 3.3%
run_init::composer/autoload_classmap.php 1 0.0% 149 2.5% 149 2.5%
Symfony\Component\HttpFoundation\HeaderBag::has 45 1.6% 135 2.2% 135 2.2%
Symfony\Component\HttpFoundation\Request::normalizeQueryString 12 0.4% 140 2.3% 129 2.1%
Symfony\Component\HttpFoundation\Request::prepareBaseUrl 4 0.1% 513 8.5% 114 1.9%
Symfony\Component\HttpFoundation\ParameterBag::get 92 3.3% 111 1.8% 110 1.8%
main() 1 0.0% 6,061 100.0% 106 1.7%
Symfony\Component\HttpFoundation\Response::prepare 4 0.1% 182 3.0% 90 1.5%
Symfony\Component\HttpFoundation\HeaderBag::__construct 8 0.3% 665 11.0% 88 1.5%
file_get_contents 6 0.2% 86 1.4% 86 1.4%
DateTime::__construct 12 0.4% 84 1.4% 84 1.4%
ComposerAutoloaderInit5b5f57c871b8c5b6e57138c3683a3315::getLoader 1 0.0% 372 6.1% 78 1.3%
Composer\Autoload\ClassLoader::loadClass 13 0.5% 1,022 16.9% 75 1.2%
Symfony\Component\HttpFoundation\HeaderBag::parseCacheControl 12 0.4% 128 2.1% 74 1.2%
Symfony\Component\HttpFoundation\Response::sendContent 1 0.0% 71 1.2% 71 1.2%
file_exists 23 0.8% 71 1.2% 71 1.2%
Symfony\Component\HttpFoundation\Request::initialize 4 0.1% 586 9.7% 69 1.1%
Symfony\Component\HttpKernel\HttpCache\Store::lookup 4 0.1% 1,380 22.8% 68 1.1%
Symfony\Component\HttpFoundation\Request::create 3 0.1% 414 6.8% 66 1.1%
Symfony\Component\HttpFoundation\Response::getAge 11 0.4% 339 5.6% 61 1.0%
Symfony\Component\HttpKernel\HttpCache\HttpCache::handle@1 2 0.1% 2,512 41.4% 60 1.0%
Symfony\Component\HttpFoundation\Request::isSecure 16 0.6% 90 1.5% 58 1.0%
DateTime::createFromFormat 8 0.3% 57 0.9% 57 0.9%
Symfony\Component\HttpFoundation\Request::getHost 8 0.3% 125 2.1% 56 0.9%
Symfony\Component\HttpFoundation\Request::prepareRequestUri 4 0.1% 281 4.6% 54 0.9%
preg_match_all 12 0.4% 54 0.9% 54 0.9%
Symfony\Component\HttpFoundation\HeaderBag::getCacheControlHeader 7 0.3% 66 1.1% 52 0.9%
Symfony\Component\HttpKernel\HttpCache\HttpCache::handle 1 0.0% 4,172 68.8% 50 0.8%
eZ\Publish\Core\MVC\Symfony\Cache\Http\LocationAwareStore::getPath 8 0.3% 99 1.6% 50 0.8%
Symfony\Component\HttpKernel\HttpCache\HttpCache::restoreResponseBody@1 2 0.1% 1,047 17.3% 49 0.8%
load::HttpFoundation/Request.php 1 0.0% 47 0.8% 47 0.8%
Those numbers shows us that autoloading is one of the part that takes the most time and made me remember about the web/index.php file containing some APC related commented code:
// Use APC for autoloading to improve performance:
// Change 'ezpublish' to a unique prefix in order to prevent cache key conflicts
// with other applications also using APC.
//
// ( Not needed when using `php composer.phar dump-autoload --optimize` )
/*
$loader = new ApcClassLoader( 'ezpublish', $loader );
$loader->register( true );
*/
Reading at this comment, I realize I should also give a try to the --optimize flag proposed by Composer. Let's try both approaches.

Approach 1: Activating ApcClassLoader

After uncommenting the relevant lines in web/index.php file, I run the benchmark again:
$ sudo ab -k -n 1000 http://ezpublish-community/eng/
[...]
Requests per second:    270.42 [#/sec] (mean)
Time per request:       3.698 [ms] (mean)
Time per request:       3.698 [ms] (mean, across all concurrent requests)
Transfer rate:          10171.41 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     3    4   0.2      4       5
Waiting:        3    3   0.2      3       4
Total:          3    4   0.2      4       5
Performing that simple change decreased the request time by about 6%, let's look at xhprof's data again:
Total Incl. Wall Time (µs): 5,696
Number of Function Calls: 2,549
Function Name Calls Calls% Incl. Wall Time (µs) IWall% Excl. Wall Time (µs) EWall%
Symfony\Component\HttpFoundation\ServerBag::getHeaders 4 0.2% 278 4.9% 269 4.7%
Symfony\Component\HttpFoundation\ResponseHeaderBag::set 41 1.6% 627 11.0% 249 4.4%
Symfony\Component\HttpFoundation\HeaderBag::set 59 2.3% 307 5.4% 241 4.2%
Symfony\Component\HttpFoundation\HeaderBag::get 48 1.9% 187 3.3% 187 3.3%
run_init::composer/autoload_classmap.php 1 0.0% 164 2.9% 164 2.9%
Symfony\Component\HttpFoundation\HeaderBag::has 45 1.8% 133 2.3% 129 2.3%
Symfony\Component\HttpFoundation\Request::normalizeQueryString 12 0.5% 132 2.3% 119 2.1%
main() 1 0.0% 5,696 100.0% 116 2.0%
Symfony\Component\HttpFoundation\Request::prepareBaseUrl 4 0.2% 469 8.2% 116 2.0%
Symfony\Component\HttpFoundation\ParameterBag::get 92 3.6% 111 1.9% 111 1.9%
Symfony\Component\HttpFoundation\HeaderBag::__construct 8 0.3% 644 11.3% 89 1.6%
Symfony\Component\HttpFoundation\Response::prepare 4 0.2% 167 2.9% 87 1.5%
file_get_contents 6 0.2% 87 1.5% 87 1.5%
DateTime::__construct 12 0.5% 86 1.5% 86 1.5%
ComposerAutoloaderInit155d41b24b236d0fd561c62d5063d8bb::getLoader 1 0.0% 400 7.0% 84 1.5%
Symfony\Component\HttpFoundation\HeaderBag::parseCacheControl 12 0.5% 129 2.3% 72 1.3%
Symfony\Component\ClassLoader\ApcClassLoader::loadClass 12 0.5% 633 11.1% 72 1.3%
Symfony\Component\HttpFoundation\Response::sendContent 1 0.0% 71 1.2% 71 1.2%
Symfony\Component\HttpKernel\HttpCache\Store::lookup 4 0.2% 1,313 23.1% 70 1.2%
Symfony\Component\HttpFoundation\Request::create 3 0.1% 452 7.9% 69 1.2%
Symfony\Component\HttpFoundation\Request::initialize 4 0.2% 561 9.8% 69 1.2%
Symfony\Component\HttpFoundation\Response::getAge 11 0.4% 341 6.0% 63 1.1%
load::HttpFoundation/Request.php 1 0.0% 59 1.0% 59 1.0%
Symfony\Component\HttpFoundation\Request::isSecure 16 0.6% 90 1.6% 58 1.0%
DateTime::createFromFormat 8 0.3% 58 1.0% 58 1.0%
Composer\Autoload\ClassLoader::findFile 2 0.1% 68 1.2% 57 1.0%
preg_match_all 12 0.5% 57 1.0% 57 1.0%
Symfony\Component\HttpFoundation\Request::getHost 8 0.3% 114 2.0% 55 1.0%
From those numbers, we can see that Composer's methods that appeared previously in the topmost part of the most costing operations have disappeared
Now is time to compare this with the suggested second approach.

Approach 2: Using Composer classmap generation

Composer has a feature that generates a classmap of all classes by running:
$ ./composer.phar dump-autoload --optimize
This command created a 662KiB vendor/composer/autoload_classmap.php file containing an array that is a hash composed of the class name as index and the path to the file containing the class definition as value. At the time I am writing this post, this array is composed of 4168 entries.
Benchmarking again to see how it improves the average request time:
$ sudo ab -k -n 1000 http://ezpublish-community/eng/
[...]
Requests per second:    197.95 [#/sec] (mean)
Time per request:       5.052 [ms] (mean)
Time per request:       5.052 [ms] (mean, across all concurrent requests)
Transfer rate:          7445.35 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     5    5   4.0      5     131
Waiting:        4    5   3.8      4     126
Total:          5    5   4.0      5     131
Although it should give us the most efficiant autoloading mechanism, it actually slows things down (from 254.53 reqs/second to 197.95). The reason being that even if the file is cached by APC, the PHP array containing the map with more than 4100 entries needs to be re-created at every single request. This is easily discovered looking at xhprof run report:
Total Incl. Wall Time (µs): 6,169
Number of Function Calls: 2,480
Function Name Calls Calls% Incl. Wall Time (µs) IWall% Excl. Wall Time (µs) EWall%
run_init::composer/autoload_classmap.php 1 0.0% 775 12.6% 775 12.6%
Symfony\Component\HttpFoundation\ServerBag::getHeaders 4 0.2% 260 4.2% 258 4.2%
Symfony\Component\HttpFoundation\ResponseHeaderBag::set 41 1.7% 616 10.0% 248 4.0%
Symfony\Component\HttpFoundation\HeaderBag::set 59 2.4% 303 4.9% 237 3.8%
Symfony\Component\HttpFoundation\HeaderBag::get 48 1.9% 187 3.0% 187 3.0%
Symfony\Component\HttpFoundation\HeaderBag::has 45 1.8% 122 2.0% 122 2.0%
Symfony\Component\HttpFoundation\Request::normalizeQueryString 12 0.5% 133 2.2% 118 1.9%
Symfony\Component\HttpFoundation\Request::prepareBaseUrl 4 0.2% 499 8.1% 115 1.9%
main() 1 0.0% 6,169 100.0% 112 1.8%
Symfony\Component\HttpFoundation\ParameterBag::get 92 3.7% 114 1.8% 111 1.8%
file_get_contents 6 0.2% 97 1.6% 97 1.6%
Symfony\Component\HttpFoundation\HeaderBag::__construct 8 0.3% 629 10.2% 86 1.4%
Symfony\Component\HttpFoundation\Response::prepare 4 0.2% 156 2.5% 85 1.4%
ComposerAutoloaderInit4fe043dafc1de5ca2f306c3fb5c6bd90::getLoader 1 0.0% 1,012 16.4% 82 1.3%
DateTime::__construct 12 0.5% 80 1.3% 80 1.3%
Symfony\Component\HttpFoundation\Response::sendContent 1 0.0% 77 1.2% 77 1.2%
Composer\Autoload\ClassLoader::loadClass 13 0.5% 593 9.6% 75 1.2%
Symfony\Component\HttpFoundation\HeaderBag::parseCacheControl 12 0.5% 129 2.1% 72 1.2%
Symfony\Component\HttpFoundation\Request::create 3 0.1% 434 7.0% 68 1.1%
Symfony\Component\HttpKernel\HttpCache\Store::lookup 4 0.2% 1,296 21.0% 66 1.1%
Symfony\Component\HttpFoundation\Request::initialize 4 0.2% 535 8.7% 66 1.1%
Symfony\Component\HttpFoundation\Response::getAge 11 0.4% 342 5.5% 63 1.0%
Symfony\Component\HttpKernel\HttpCache\HttpCache::handle@1 2 0.1% 2,384 38.6% 58 0.9%
DateTime::createFromFormat 8 0.3% 58 0.9% 58 0.9%
preg_match_all 12 0.5% 57 0.9% 57 0.9%
Symfony\Component\HttpFoundation\Request::isSecure 16 0.6% 89 1.4% 57 0.9%
Symfony\Component\HttpFoundation\Request::prepareRequestUri 4 0.2% 268 4.3% 54 0.9%
Symfony\Component\HttpFoundation\Request::getHost 8 0.3% 129 2.1% 53 0.9%
load::HttpFoundation/Request.php 1 0.0% 48 0.8% 48 0.8%
Knowing that most of the classes involved in the stack relies on PSR-0 let's try a simpler approach...

Simplifying the autoloading

PSR-0 gives us a nice "fully qualified classname to filename" approach but things gets a bit more complicated given that the different dependencies installed by Composer may have common namespaces (sub)paths and all of them starting from a different root directory.
Using Linux, I thought about leveraging my filesystem abilities and PSR-0 at the same time: let's create a tree made of symbolink links to recreate a straightforward "fully qualified classname to filename" strategy again:
$ mkdir vendor/PSR-0
$ cd vendor/PSR-0
$ ln -s ../kriswallsmith/assetic/src/Assetic/ ../ezsystems/ezpublish-kernel/eZ/ ../ezsystems/demobundle/EzSystems/ ../monolog/monolog/src/Monolog/ ../psr/log/Psr/ ../sensio/generator-bundle/Sensio/ ../twig/twig/lib/Twig/ ../qafoo/rmf/src/main/Qafoo/ .
$ mkdir Symfony
$ cd Symfony
$ ln -s ../../symfony/symfony/src/Symfony/Bridge/ ../../../vendor/symfony-cmf/routing/Symfony/Cmf/ ../../symfony/symfony/src/Symfony/Component/ .
$ mkdir Bundle
$ cd Bundle
$ ln -s ../../../symfony/assetic-bundle/Symfony/Bundle/AsseticBundle/ ../../../symfony/symfony/src/Symfony/Bundle/FrameworkBundle/ ../../../symfony/monolog-bundle/Symfony/Bundle/MonologBundle/ ../../../symfony/symfony/src/Symfony/Bundle/SecurityBundle/ ../../../symfony/swiftmailer-bundle/Symfony/Bundle/SwiftmailerBundle/ ../../../symfony/symfony/src/Symfony/Bundle/TwigBundle/ ../../../symfony/symfony/src/Symfony/Bundle/WebProfilerBundle/ .
... and modify the autoloader file (ezpublish/autoload.php) to contain only:
spl_autoload_register(
    function ( $className )
    {
        if (
            strpos( $className, "Symfony\\" ) === 0 ||
            strpos( $className, "Sensio\\" ) === 0 ||
            strpos( $className, "eZ\\" ) === 0 ||
            strpos( $className, "EzSystems\\" ) === 0 ||
            strpos( $className, "Monolog\\" ) === 0 ||
            strpos( $className, "Psr\\" ) === 0 ||
            strpos( $className, "Twig_" ) === 0 ||
            strpos( $className, "Assetic\\" ) === 0
        )
        {
            @include __DIR__ . "/../vendor/PSR-0/" . str_replace( array( "_", "\\" ), "/", $className ) . ".php";
        }
    },
    true
);

require __DIR__ . "/../ezpublish_legacy/autoload.php";
Now that this is in place, let's benchmark it:
$ sudo ab -k -n 1000 http://ezpublish-community/eng/
[...]
Requests per second:    286.38 [#/sec] (mean)
Time per request:       3.492 [ms] (mean)
Time per request:       3.492 [ms] (mean, across all concurrent requests)
Transfer rate:          11286.49 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     3    3   0.2      3       6
Waiting:        3    3   0.2      3       5
Total:          3    3   0.2      3       6
Using this technique, we gained yet another 16 requests/sec over ApcClassLoader that was the best results so far. Let's inspect xhprof once again:
Function Name Calls Calls% Incl. Wall Time (µs) IWall% Excl. Wall Time (µs) EWall%
Symfony\Component\HttpFoundation\ResponseHeaderBag::set 42 1.7% 669 11.8% 267 4.7%
Symfony\Component\HttpFoundation\ServerBag::getHeaders 4 0.2% 262 4.6% 259 4.6%
Symfony\Component\HttpFoundation\HeaderBag::set 60 2.4% 307 5.4% 243 4.3%
Symfony\Component\HttpFoundation\HeaderBag::get 48 1.9% 204 3.6% 204 3.6%
Symfony\Component\HttpFoundation\HeaderBag::has 45 1.8% 134 2.4% 129 2.3%
{closure} 15 0.6% 713 12.6% 126 2.2%
Symfony\Component\HttpFoundation\Request::normalizeQueryString 12 0.5% 131 2.3% 119 2.1%
Symfony\Component\HttpFoundation\ParameterBag::get 92 3.6% 118 2.1% 118 2.1%
Symfony\Component\HttpFoundation\Request::prepareBaseUrl 4 0.2% 479 8.5% 116 2.1%
main() 1 0.0% 5,646 100.0% 110 1.9%
DateTime::createFromFormat 8 0.3% 103 1.8% 103 1.8%
Symfony\Component\HttpFoundation\Response::prepare 4 0.2% 171 3.0% 97 1.7%
DateTime::__construct 12 0.5% 96 1.7% 96 1.7%
Symfony\Component\HttpFoundation\HeaderBag::__construct 8 0.3% 678 12.0% 92 1.6%
file_get_contents 6 0.2% 88 1.6% 88 1.6%
Symfony\Component\HttpFoundation\HeaderBag::parseCacheControl 12 0.5% 137 2.4% 71 1.3%
Symfony\Component\HttpFoundation\Request::create 3 0.1% 434 7.7% 70 1.2%
Symfony\Component\HttpKernel\HttpCache\Store::lookup 4 0.2% 1,347 23.9% 70 1.2%
Symfony\Component\HttpFoundation\Request::initialize 4 0.2% 549 9.7% 68 1.2%
Symfony\Component\HttpFoundation\Response::getAge 11 0.4% 411 7.3% 66 1.2%
preg_match_all 12 0.5% 66 1.2% 66 1.2%
Symfony\Component\HttpFoundation\Response::sendContent 1 0.0% 63 1.1% 63 1.1%
Symfony\Component\HttpFoundation\HeaderBag::getCacheControlHeader 7 0.3% 75 1.3% 62 1.1%
Symfony\Component\HttpKernel\HttpCache\HttpCache::handle@1 2 0.1% 2,500 44.3% 61 1.1%
Symfony\Component\HttpFoundation\Request::isSecure 16 0.6% 93 1.6% 60 1.1%
Symfony\Component\HttpFoundation\Request::prepareRequestUri 4 0.2% 259 4.6% 53 0.9%
load::HttpFoundation/Request.php 1 0.0% 52 0.9% 52 0.9%
Symfony\Component\HttpFoundation\Request::getHost 8 0.3% 115 2.0% 51 0.9%
It looks like from the latest technique used, we finally get rid of the autoloading performance impact. Maybe Composer could automate the creation of such a PSR-0 tree? Well, if you care, this has been asked already ;-) https://github.com/composer/composer/issues/1529
From the latest xhprof report, it seems that Symfony HttpFoundation components are the next bottlenecks.

Some craziness: s/Symfony/Cymfony/

Now, what if... those Symfony components were built in C? Well, the idea sounds a bit crazy at first but I started creating a simple PHP extension written in C (that I named "Cymfony") that implements natively the Symfony\\Component\\HttpFoundation\\HeaderBag and Symfony\\Component\\HttpFoundation\\ParameterBag classes. Those were simple enough to be written quickly while appearing in the xhprof report.
Let's activate the extension and run yet another benchmark:
$ sudo ab -k -n 1000 http://ezpublish-community/eng/
[...]
Requests per second:    328.93 [#/sec] (mean)
Time per request:       3.040 [ms] (mean)
Time per request:       3.040 [ms] (mean, across all concurrent requests)
Transfer rate:          12963.84 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     3    3   0.2      3       5
Waiting:        3    3   0.2      3       4
Total:          3    3   0.2      3       5
This is a pretty good result! By reimplementing classes natively, with 100% compatibility, I have been able to jump from 286.38 request/sec to 328.93 which is about a 15% boost. This percentage is very close to the original amount of time spent in the classes in their PHP implementation:
Symfony\Component\HttpFoundation\HeaderBag::set 4.3%
Symfony\Component\HttpFoundation\HeaderBag::get 3.6%
Symfony\Component\HttpFoundation\HeaderBag::has 2.3%
Symfony\Component\HttpFoundation\ParameterBag::get 2.1%
Symfony\Component\HttpFoundation\HeaderBag::__construct 1.6%
Symfony\Component\HttpFoundation\HeaderBag::parseCacheControl 1.3%
Symfony\Component\HttpFoundation\HeaderBag::getCacheControlHeader 1.1%
17.4%
Let's see if xhprof confirms those methods are not being the top most consuming parts:
Function Name Calls Calls% Incl. Wall Time (µs) IWall% Excl. Wall Time (µs) EWall%
Symfony\Component\HttpFoundation\ResponseHeaderBag::set 42 2.3% 436 9.2% 283 6.0%
Symfony\Component\HttpFoundation\ServerBag::getHeaders 4 0.2% 255 5.4% 253 5.3%
Symfony\Component\HttpFoundation\Request::prepareBaseUrl 4 0.2% 396 8.3% 129 2.7%
Symfony\Component\HttpFoundation\Request::normalizeQueryString 12 0.7% 134 2.8% 125 2.6%
{closure} 13 0.7% 591 12.4% 119 2.5%
Symfony\Component\HttpFoundation\Response::prepare 4 0.2% 123 2.6% 112 2.4%
main() 1 0.1% 4,756 100.0% 111 2.3%
file_get_contents 6 0.3% 95 2.0% 95 2.0%
DateTime::__construct 12 0.7% 89 1.9% 89 1.9%
Symfony\Component\HttpFoundation\HeaderBag::set 42 2.3% 80 1.7% 80 1.7%
Symfony\Component\HttpFoundation\Request::create 3 0.2% 380 8.0% 77 1.6%
Symfony\Component\HttpFoundation\HeaderBag::__construct 8 0.4% 400 8.4% 76 1.6%
Symfony\Component\HttpFoundation\Request::initialize 4 0.2% 421 8.9% 70 1.5%
Symfony\Component\HttpKernel\HttpCache\Store::lookup 4 0.2% 1,121 23.6% 69 1.5%
Symfony\Component\HttpFoundation\Response::getAge 11 0.6% 257 5.4% 67 1.4%
Symfony\Component\HttpFoundation\Request::prepareRequestUri 4 0.2% 197 4.1% 67 1.4%
Symfony\Component\HttpFoundation\HeaderBag::getDate 8 0.4% 66 1.4% 66 1.4%
Symfony\Component\HttpFoundation\Request::isSecure 16 0.9% 63 1.3% 63 1.3%
Symfony\Component\HttpFoundation\Response::sendContent 1 0.1% 63 1.3% 63 1.3%
Symfony\Component\HttpKernel\HttpCache\HttpCache::handle@1 2 0.1% 2,049 43.1% 59 1.2%
load::HttpFoundation/Request.php 1 0.1% 55 1.2% 55 1.2%
Symfony\Component\HttpFoundation\Request::getHost 8 0.4% 90 1.9% 55 1.2%
Symfony\Component\HttpFoundation\Response::getMaxAge 10 0.6% 95 2.0% 54 1.1%
Symfony\Component\HttpKernel\HttpCache\HttpCache::restoreResponseBody@1 2 0.1% 885 18.6% 51 1.1%
Symfony\Component\HttpKernel\HttpCache\HttpCache::handle 1 0.1% 3,460 72.8% 49 1.0%
load::HttpKernel/Kernel.php 1 0.1% 49 1.0% 49 1.0%
eZ\Publish\Core\MVC\Symfony\Cache\Http\LocationAwareStore::getPath 8 0.4% 87 1.8% 48 1.0%
The rewritten classes are effectively more efficient and doesn't take as much processing time. This is encouraging to try reimplementing classes like Symfony\\Component\\HttpFoundation\\ResponseHeaderBag, Symfony\\Component\\HttpFoundation\\ServerBag or Symfony\\Component\\HttpFoundation\\Request. Those are, however, much more time consuming to implement in C.

Conclusion

In my previous blog post I was suggesting using a PSR-0 tree already, but I lacked some stats, this issue is now solved.
Spending some time tracking the bottlenecks can drastically improve the throughput, autoloading being frequently one of them. Hopefully, we do have Xdebug and xhprof that are two great tools to profile a PHP application.
The Cymfony project that I started as a proof-of-concept might, in fact, interest some people working with Symfony 2 and that want to optimize some parts of the framework just by installing a PHP extension. However, speeding up the path of delivering cached page is probably not the best use case because if someone is able to install a PHP extension, he is probably able to install a decent reverse proxy which would speed up the throughput much more! What are the classes that might be worth being optimized in Symfony 2 according to you?

Saturday, January 26, 2013

Composer: Speeding up class autoloading

The Use Case

Those of you using Composer have probably tried speeding up the PHP autoloading mechanism using the option:
--optimize-autoloader (-o): Convert PSR-0 autoloading to classmap to get a faster autoloader. This is recommended especially for production, but can take a bit of time to run so it is currently not done by default.

The Goal

While it can certainly make your autoloading faster it can be further optimized if you are using PSR-0-like autoloading mechanism and have quite a lot of classes.

The Problem

The problem with the classmap strategy and the nature of PHP is that there is no (easy) way to have a persistent variable across requests containing the classmap. Even if you are using APC, a PHP file returning a big array will always take some CPU cycles and memory because the array still needs to be recreated from the cached opcodes. This can even take a big portion of your request's response time when you have hundreds or thousands of classes like it is the case with eZ Publish 5 being based on Symfony, where about 2 600 classes are involved.

The Solution

I came across a quite simple solution that can surely be automatized from inside Composer itself.
Given that most dependencies of eZ Publish 5 are using the PSR-0 standard, I created a vendor/PSR-0 directory containing symbolic links to all relevant elements in such a way that there is a direct match between a namespaced class and the file system from a unique root directory:

$ tree vendor/PSR-0/
vendor/PSR-0/
├── Assetic -> ../kriswallsmith/assetic/src/Assetic/
├── eZ -> ../ezsystems/ezpublish/eZ/
├── EzSystems -> ../../src/EzSystems/
├── Monolog -> ../monolog/monolog/src/Monolog/
├── Psr -> ../psr/log/Psr
├── Sensio -> ../sensio/generator-bundle/Sensio/
├── Symfony
│   ├── Bridge -> ../../symfony/symfony/src/Symfony/Bridge/
│   ├── Bundle
│   │   ├── AsseticBundle -> ../../../symfony/assetic-bundle/Symfony/Bundle/AsseticBundle/
│   │   ├── FrameworkBundle -> ../../../symfony/symfony/src/Symfony/Bundle/FrameworkBundle/
│   │   ├── MonologBundle -> ../../../symfony/monolog-bundle/Symfony/Bundle/MonologBundle/
│   │   ├── SecurityBundle -> ../../../symfony/symfony/src/Symfony/Bundle/SecurityBundle/
│   │   ├── SwiftmailerBundle -> ../../../symfony/swiftmailer-bundle/Symfony/Bundle/SwiftmailerBundle/
│   │   ├── TwigBundle -> ../../../symfony/symfony/src/Symfony/Bundle/TwigBundle/
│   │   └── WebProfilerBundle -> ../../../symfony/symfony/src/Symfony/Bundle/WebProfilerBundle/
│   ├── Cmf -> ../../../vendor/symfony-cmf/routing/Symfony/Cmf/
│   └── Component -> ../../symfony/symfony/src/Symfony/Component/
└── Twig -> ../twig/twig/lib/Twig/

Once this has been setup, instead of using the usual Composer's autoload.php file, a custom autoloader is created for loading classes directly from that tree:

spl_autoload_register(
    function ( $className )
    {
        if (
            strpos( $className, "Symfony\\" ) === 0 ||
            strpos( $className, "Sensio\\" ) === 0 ||
            strpos( $className, "eZ\\" ) === 0 ||
            strpos( $className, "EzSystems\\" ) === 0 ||
            strpos( $className, "Monolog\\" ) === 0 ||
            strpos( $className, "Psr\\" ) === 0 ||
            strpos( $className, "Twig_" ) === 0 ||
            strpos( $className, "Assetic\\" ) === 0
        )
        {
            if (
                file_exists(
                    $path = __DIR__ . "/../vendor/PSR-0/" . str_replace( array( "_", "\\" ), "/", $className ) . ".php"
                )
            )
            {
                require $path;
            }
        }
        else
        {
            return;
        }
    },
    true
);

Conclusion

Once I did that change on a typical eZ Publish site configured with ezdemo package, my laptop was able to serve 227 reqs/second instead of the initial 176 reqs/second that is a 29% boost.

Action point

Composer team: any chance to be able to generate this automagically? :-)

Thursday, May 27, 2010

Readable PHP code #2 Make your API handle more!

Context

APIs are often designed to operate with scalars. As described in the following example, the function addContact() operates on a single element:

<?php
class User {
    protected 
$contacts = array();

    function 
addContact($contact) {
        
$this->contacts[] = $contact;
    }
}
?>

In a context where many contacts have to be added using the above API, code usually looks like:

<?php
// Some contacts
$contacts = array("Paul""John""Maria");

$user = new User();
// Looping over contacts to add them
foreach ($contacts as $contact) {
    
$user->addContact($contact);
}
?>

Inner looping

A way to avoid repeating this looping everywhere would be to design the API to work with array of contacts:

<?php
class User {
    protected 
$contacts = array();

    
function addContacts($contacts) {
        
foreach ($contacts as $contact) {
            
$this->contacts[] = $contact;
        }
    }

}
?>

This might make the code, where loops are in use, somewhat clearer:

<?php
// Some contacts
$contacts = array("Paul""John""Maria");

$user = new User();
$user->addContacts($contacts);
?>

With the benefit of speeding the processing a little bit as only one function call is issued!

The side effect is that adding only one contact is not as elegant:

<?php
$contact "Julia";

$user = new User();
$user->addContacts(array($contact));
?>

A nice PHP trick

The good news is that PHP provides a nice array cast operator: (array) which will transform a scalar value into an array. As described on the manual page about arrays:

"For any of the types: integer, float, string, boolean and resource, converting a value to an array results in an array with a single element with index zero and the value of the scalar which was converted. In other words, (array)$scalarValue is exactly the same as array($scalarValue)."

Previous example can be transformed to play nicely with both scalars and arrays:

<?php
class User {
    protected 
$contacts = array();

    function 
addContacts($contacts) {
        foreach (
(array) $contacts as $contact) {
            
$this->contacts[] = $contact;
        }
    }
}
?>

Function addContacts() can now be used the following way with scalars:

<?php
$contact 
"Julia";

$user = new User();
$user->addContacts($contact);
?>

OOPs!

This is a nice trick to define APIs to be used with both scalars and arrays, however this will not work when using objects! PHP is able to cast an objects as an array, but this will give you an access to its properties which is not the intended purpose.

If your contacts are objects you will have to modify the addContacts() function to something like:

<?php
class User {
    protected 
$contacts = array();

    function 
addContacts($contacts) {
        
if (is_array($contacts)) {
            foreach (
$contacts as $contact) {
                
$this->contacts[] = $contact;
            }
        } else {
            
$this->contacts[] = $contacts;
        }
    
}
}
?>

This may not be as elegant as the array casting method. However it will enable your API to work seemlessly with both scalars and arrays when objects are used.

The benefit of working with an array capable API is that you might sometimes optimize the operations. For example, in the case of retrieving or deleting elements from a database, you might want to use the "WHERE id IN (...)" syntax to match multiple elements rather than one by one. In the X examples that has been used in this article, an interesting optimization is to use the native array_merge() function which avoids reinventing the wheel by looping over elements using a foreach construct and adding elements one by one:

<?php
class User {
    protected 
$contacts = array();

    function 
addContacts($contacts) {
        if (
is_array($contacts)) {
            
$this->contacts array_merge($this->contacts$contacts);
        
} else {
            
$this->contacts[] = $contacts;
        }
    }
}
?>

Conclusion

In this article we have seen the advantages of creating an API which can handle multiple elements at once. If there is a benefit in terms of speed (which heavely depends on your business logic), don't forget that code readability is of higher importance too! Hopefully, this tip should improve both

For those who mind about performance, doing:

<?php
// Adding 5.000.000 contacts
$user->addContacts(range(15e6));
?>

is about 40% faster than:

<?php
foreach (range(15e6) as $contact) {
    
$user->addContact($contact);
}
?>

Thanks to Paul Dragoonis, Paul Borgermans and Jérôme Renard for reviewing this article