Published by nick on 01 Aug 2008 at 12:50 pm
PHP Performance tip: require versus require_once
One of the big performance-oriented complaints about PHP is that it doesn’t do well with large frameworks that include a lot of files. Symfony and MediaWiki are two that I’ve had this problem with.
Why is it slow to load a lot of files in PHP?
Let’s take a closer look.
Quick note: In this post I’ll assume you are already using a PHP accelerator, such as APC, Turck MMCache or eAccelerator. If not, you should be. My personal pick is APC, mostly because it’s the preferred one at Yahoo!, which has the largest installation of PHP, and the author of PHP works there. The lead maintainer of APC also works there, so I feel good knowing that APC is well supported. There are rumors that PHP 6 will have this accelerator built in, and that it will be based on APC’s code.
With that PHP accelerator plug out of the way, let’s get back to business.
Normally when PHP does a require to include a file, it does a stat to see if the file has changed, and if not, loads it from the APC cache. Here’s what that looks like at the C level:
* stat64("./classes/Class1.php", {st_mode=S_IFREG|0644, st_size=2057, ...}) = 0
Tip: Want to know how to look at what code is doing at the C level? Check out this tutorial on using strace to debug web apps
Nice, simple, clean: one stat per file. Note: with APC, you can set apc.stat to off, and this will skip the above stat call as well. The downside: you have to restart Apache whenever you change your code.
Now let’s take a look at what happens when you use require_once instead of require:
* lstat64("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
* lstat64("/home/webuser", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
* lstat64("/home/webuser/src", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
* lstat64("/home/webuser/src/mediawiki", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
* lstat64("/home/webuser/src/mediawiki/include_test", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
* lstat64("/home/webuser/src/mediawiki/include_test/classes", {st_mode=S_IFDIR|0755, st_size=20480, ...}) = 0
* lstat64("/home/webuser/src/mediawiki/include_test/classes/Class1.php", {st_mode=S_IFREG|0644, st_size=2058, ...}) = 0
That’s one lstat for each directory in the path, and it happens for every single file you include. With require_once, PHP must call realpath (at the C level) to learn the canonical path of the file. Otherwise, it can’t tell whether require_once '../../../mydir/Class.php'; refers to the same file as require_once '../mydir/Class.php';
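To make the canonicalization step concrete, here is a minimal sketch. It assumes a throwaway directory tree under the system temp directory; the names rp_demo_*, a/b, mydir and Class.php are made up for illustration. Two different spellings of the same file only compare equal after realpath() has resolved them.

```php
<?php
// Build a small throwaway tree: <tmp>/rp_demo_*/a/b and <tmp>/rp_demo_*/mydir/Class.php
$base = sys_get_temp_dir() . '/rp_demo_' . uniqid();
mkdir($base . '/a/b', 0777, true);
mkdir($base . '/mydir', 0777, true);
file_put_contents($base . '/mydir/Class.php', "<?php\n");

// Two different spellings of the same file, as written from two directories.
$fromDeep    = $base . '/a/b/../../mydir/Class.php';
$fromShallow = $base . '/a/../mydir/Class.php';

// As plain strings they differ, so string comparison cannot spot the duplicate.
$sameString = ($fromDeep === $fromShallow);

// realpath() resolves "..", "." and symlinks into one canonical name.
$canonDeep    = realpath($fromDeep);
$canonShallow = realpath($fromShallow);
$sameFile     = ($canonDeep !== false && $canonDeep === $canonShallow);

echo $sameString ? "same string\n" : "different strings\n";
echo $sameFile ? "same file\n" : "different files\n";
```

Resolving the canonical name is what costs the per-directory lstat calls shown in the trace above.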
Note that it must also do this for every directory in your include_path, so if that isn’t set up well, the problem gets even worse. Each of these stats is a system call that takes time: more work for your servers and slower responses for your users.
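The include_path search can be pictured with a simplified sketch. This is not PHP’s internal lookup; probeIncludePath() is a hypothetical helper, and the directory names are invented. It only shows how a file that lives in the last directory forces a probe of every directory before it.

```php
<?php
// Hypothetical helper: list the locations a relative include would be
// probed at, in include_path order, stopping at the first existing file.
function probeIncludePath($relative, $includePath)
{
    $tried = array();
    foreach (explode(PATH_SEPARATOR, $includePath) as $dir) {
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $relative;
        $tried[] = $candidate;
        if (is_file($candidate)) {
            break; // found: later directories are never touched
        }
    }
    return $tried;
}

// Demo: three directories, and the file lives only in the last one.
$base = sys_get_temp_dir() . '/ip_demo_' . uniqid();
foreach (array('first', 'second', 'third') as $d) {
    mkdir($base . '/' . $d, 0777, true);
}
file_put_contents($base . '/third/Class1.php', "<?php\n");

$path = implode(PATH_SEPARATOR, array(
    $base . '/first',
    $base . '/second',
    $base . '/third',
));
$tried = probeIncludePath('Class1.php', $path);

echo count($tried), " locations probed\n";
```

Under this simplified model, listing the directory that holds most of your files first keeps the number of failed probes down.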
Theory: The extra stats required for require_once and include_once introduce a lot of overhead for applications that include a lot of files.
A real-world test: At Wikia, we had a common include file that was loading all of our MediaWiki extensions. It had 113 calls to require_once and 172 calls to include_once. Changing these to require and include respectively made a significant difference.
First, strace showed that serving a page took 2848 syscalls, down from 4782 (-40%). Next I ran ab for more testing, and found that the average page request time went down to 36.5 ms from 46.5 ms (-22%), and the server was able to serve 27.1 requests per second, up from 21.4 (+27%). View the complete output from ab
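The percentage changes can be rechecked from the raw figures quoted above. The sketch below is plain arithmetic; percentChange() is a hypothetical helper name, not part of any library.

```php
<?php
// Percentage change from $before to $after, rounded to one decimal place.
function percentChange($before, $after)
{
    return round(($after - $before) / $before * 100, 1);
}

$syscalls = percentChange(4782, 2848); // syscalls per page
$latency  = percentChange(46.5, 36.5); // average request time in ms
$rps      = percentChange(21.4, 27.1); // requests per second

printf(
    "syscalls: %+.1f%%\nrequest time: %+.1f%%\nrequests/sec: %+.1f%%\n",
    $syscalls,
    $latency,
    $rps
);
```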
Conclusion: require_once does not perform as well as require. Don’t use require_once unless you need it. require will save system calls and deliver pages faster to end users. This also applies to the include/include_once counterparts.
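Before swapping one for the other, it helps to see what the duplicate check actually buys you. The following is a small sketch using throwaway files in the temp directory; makeCounterFile() and the counter names are invented for the demo. Each file simply bumps a global counter when it runs.

```php
<?php
// Write a throwaway include file that bumps a global counter each time it runs.
function makeCounterFile($counterName)
{
    $file = sys_get_temp_dir() . '/' . $counterName . '_' . uniqid() . '.php';
    file_put_contents($file, '<?php $GLOBALS["' . $counterName . '"]++;');
    return $file;
}

$GLOBALS['plainLoads'] = 0;
$GLOBALS['onceLoads']  = 0;

$plainFile = makeCounterFile('plainLoads');
$onceFile  = makeCounterFile('onceLoads');

// require: no duplicate check, so the file body runs on every call.
require $plainFile;
require $plainFile;

// require_once: the second call is skipped because the path is already recorded.
require_once $onceFile;
require_once $onceFile;

echo "require ran the file {$GLOBALS['plainLoads']} times\n";
echo "require_once ran the file {$GLOBALS['onceLoads']} times\n";
```

If the included file declared a class or function instead of bumping a counter, the second plain require would fail with a redeclaration error. That is the case where require_once is still needed.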
It would be great if someone would write up a tool that walked through your code base and made recommendations for these types of performance tweaks. Hmm….
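As a starting point for such a tool, here is a rough sketch built on PHP’s tokenizer extension. countOnceCalls() and scanTree() are hypothetical helpers written for this post, not an existing tool. They only count the statements; deciding which ones are safe to change is still up to you.

```php
<?php
// Hypothetical helper: count require_once / include_once statements in PHP source.
// Uses the tokenizer, so occurrences inside strings and comments are ignored.
function countOnceCalls($source)
{
    $counts = array('require_once' => 0, 'include_once' => 0);
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            continue; // single-character tokens such as ";" or "("
        }
        if ($token[0] === T_REQUIRE_ONCE) {
            $counts['require_once']++;
        } elseif ($token[0] === T_INCLUDE_ONCE) {
            $counts['include_once']++;
        }
    }
    return $counts;
}

// Hypothetical helper: scan every .php file under $dir and report per-file counts.
function scanTree($dir)
{
    $report = array();
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === 'php') {
            $path = $file->getPathname();
            $report[$path] = countOnceCalls(file_get_contents($path));
        }
    }
    return $report;
}

// Small in-memory sample: only the two real statements should be counted.
$sample = '<?php
require_once "a.php";
include_once "b.php";
// require_once "commented-out.php";
$s = "require_once in a string";
require "c.php";
';
$counts = countOnceCalls($sample);
echo $counts['require_once'], ' require_once, ', $counts['include_once'], " include_once\n";
```

Calling scanTree() on a source directory returns an array keyed by file path, which could then be sorted to find the files with the most *_once calls.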
Clint Byrum on 01 Aug 2008 at 2:46 pm #
You don’t mention the realpath cache and/or its effect (added in php 5.2.0 …). Did you try just setting the realpath cache size bigger?
http://us2.php.net/manual/en/ini.core.php#ini.realpath-cache-size
16K isn’t much room if you have hundreds of files, and lots of directories in your include_path.
nick on 01 Aug 2008 at 4:03 pm #
Good point Clint - I had assumed that everyone would already have realpath_cache_size set higher.
I talked to Rasmus about this recently at OSCON, and he said that realpath_cache definitely helps (especially for NFS), but that it can’t cache *misses*, so PHP still has to stat the files and directories that don’t exist, namely the ones it goes through while searching your include_path.
So it’s important to have realpath_cache_size set higher for large apps, but the above benchmarks are still valid.
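For anyone who wants to check how full the cache is, here is a quick sketch. It assumes a PHP version that has realpath_cache_size() and realpath_cache_get(), which were added in PHP 5.3.2, later than the versions discussed in this thread. The cache is per process, so run it inside a web request to see what the web server sees.

```php
<?php
// Configured limit (a string such as "16K") and entry lifetime in seconds.
$limit = ini_get('realpath_cache_size');
$ttl   = ini_get('realpath_cache_ttl');

// Resolve a path so the cache has a chance to hold at least one entry.
realpath(sys_get_temp_dir());

// Bytes currently used by the realpath cache, and the cached entries themselves.
$used    = realpath_cache_size();
$entries = realpath_cache_get();

printf(
    "limit=%s ttl=%ss used=%d bytes entries=%d\n",
    $limit,
    $ttl,
    $used,
    count($entries)
);
```

If the used figure sits close to the configured limit, that is a hint the limit is too small for the number of files and directories being resolved.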
Tony on 07 Aug 2008 at 5:35 am #
Nick:
How does this affect performance if we used absolute paths for require_once()?
How did you solve the conflicts of multiple files ‘require’ the same library?
file1.php - require('lib1.php');
file2.php -
require('lib1.php');
require('file1.php');
nick on 07 Aug 2008 at 6:57 am #
Great question Tony. It seems like more followup testing is in order to come up with a "best practice" way of including files. Other things to consider:
1) realpath_cache_size (as Clint pointed out)
2) include vs require?
3) PHP Autoloading
4) Roll your own auto-loading (Although Rasmus warned me that this is the worst, because it uses the less efficient type of caching with APC)
5) Behaviour of APC vs other accelerators
Hmm. Maybe I shouldn’t do another write-up…
And as far as resolving conflicts, in the case above I chose that particular file because I knew it was only called once, so I could be 99% sure that it wouldn’t load all those files again. It was a "load all these extensions" type file. It will be much more work to walk through the rest of the application.
One idea I had was to use something like this at the top of the files that were being included:
if (class_exists('ClassInFile')) return;
or this:
if (function_exists('functionInFile')) return;
That way if you accidentally re-included it, the file wouldn’t be reparsed.
Hopefully this highlights the penalty with require_once though, and developers will think twice the next time, and just use require.