Archive for the 'php' Category

Published by nick on 23 Sep 2008

Good bye OpenX. Hello Google Ad Manager.

Websites need ads. It’s one of the things that make the internet go ’round. That and porn.

If a startup wants a free solution for serving ads, there has really only been one choice for many years: OpenX, formerly known as phpAdsNew. OpenX has been in use at Wikia for quite some time. After hitting some brick walls with scalability, having downtime/slowness issues, and getting frustrated with basic functionality that doesn’t work without taking down the server, I decided it was time to try something new.

I looked into Google Ad Manager over the past few days. It seems like it can do the job, and last night I wrote all the code. Today I switched all of the wikia.com websites from OpenX for serving Spotlight Ads to Google Ad Manager.

Here are the compelling reasons I found for switching.

  • OpenX is crap: It is possible to write high-scale web applications in PHP/MySQL. I’ve done it, multiple times. OpenX has not. Sorry for being a bit arrogant here, but I will happily engage an OpenX architect and question numerous design decisions. As an example: logging impressions to a relational database in real time is a horrible idea. Horrible. It will never scale. Telling people that the right way to solve this problem is by logging on the app servers? Even worse.
  • Google’s infrastructure: Even if OpenX wasn’t horrible, I still don’t want to have to worry about buying servers, system administration time, and bandwidth for my ad infrastructure. I put more faith in Google’s and Yahoo!’s infrastructure than anything a startup can build.
  • It’s easier to use: I found the interface and code setup far more intuitive than OpenX. So have the 4 other people that I’ve been working with to load ads. They love how simple Google Ad Manager is. That being said, there are a couple of less-than-intuitive things with Google Ad Manager, so it wasn’t completely painless. Maybe Apple needs to come out with iAdManager? :)
  • It’s free: OK. Did you guys hear that? Free. Free hosting of the graphics. Free server infrastructure. Estimates are that this will save Wikia.com $5000 a month in bandwidth and servers.

Is this the death of OpenX? No. There are still some things that Google Ad Manager can’t do. There are also a bunch of technical weirdos who think Google has too much power, so they will continue to use OpenX out of fear.

However, Google just flexed their muscle, and they pulled off a great first product. Good work, Google.

And if someone knows how to short OpenX stock, let me know. ;-)

Published by nick on 01 Aug 2008

PHP Performance tip: require versus require_once

One of the big performance-oriented complaints with PHP is that it doesn’t do well with large frameworks that have a lot of included files. Symfony and MediaWiki are two that I’ve had this problem with.

Why is it slow to load a lot of files in PHP?

Let’s take a closer look.

Quick note: In this post I’ll assume you are already using a PHP accelerator, such as APC, Turck MMCache, or eAccelerator. If not, you need to be. My personal pick is APC, mostly because it’s the preferred one at Yahoo!, which has the largest installation of PHP, and the author of PHP works there. The lead maintainer of APC also works there, so I feel good knowing that APC is well supported. There are rumors that PHP 6 will have this accelerator built in, and that it will be based on APC’s code.

With that PHP accelerator plug out of the way, let’s get back to business.

Normally when PHP does a require to include a file, it does a stat to see if the file has changed, and if not, loads it from the APC cache. Here’s what that looks like at the C level:

 * stat64("./classes/Class1.php", {st_mode=S_IFREG|0644, st_size=2057, ...}) = 0

Tip: Want to know how to look at what code is doing at the C level? Check out this tutorial on using strace to debug web apps.

Nice, simple, clean, one stat per file. Note: with APC, you can set apc.stat to off, and this will skip the above stat call as well. The downside: You have to restart Apache whenever you change your code.
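To make the trade-off concrete, here is a toy sketch of a cache that checks a file's modification time before trusting its cached copy. This is not APC's actual implementation; the ToyFileCache class, its $stat flag, and the counters are all made up for illustration. It only mimics the idea of "one stat per file, or none if you turn the check off".

```php
<?php
// Toy illustration only: NOT how APC is implemented internally.
// ToyFileCache, $stat, $statCalls and $reads are hypothetical names.
final class ToyFileCache
{
    /** @var array<string, array{mtime: int, body: string}> */
    private $entries = [];
    /** @var bool Whether to re-check the file's mtime on every hit */
    private $stat;
    /** @var int Number of freshness checks performed on cache hits */
    public $statCalls = 0;
    /** @var int Number of times the file was actually read from disk */
    public $reads = 0;

    public function __construct(bool $stat = true)
    {
        $this->stat = $stat;
    }

    public function load(string $path): string
    {
        if (isset($this->entries[$path])) {
            if (!$this->stat) {
                // Check disabled: trust the cache blindly (may be stale).
                return $this->entries[$path]['body'];
            }
            // Check enabled: one stat to see whether the file changed.
            clearstatcache(true, $path);
            $this->statCalls++;
            if (filemtime($path) === $this->entries[$path]['mtime']) {
                return $this->entries[$path]['body'];
            }
        }
        // Cache miss or stale entry: read the file and remember its mtime.
        $this->reads++;
        $this->entries[$path] = [
            'mtime' => filemtime($path),
            'body'  => file_get_contents($path),
        ];
        return $this->entries[$path]['body'];
    }
}
```

With the check disabled, the cache never touches the filesystem after the first load, which is also why edits go unnoticed until the cache is thrown away (in APC's case, by restarting Apache).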

Now let’s take a look at what happens when you use require_once instead of require:

 * lstat64("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
 * lstat64("/home/webuser", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
 * lstat64("/home/webuser/src", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
 * lstat64("/home/webuser/src/mediawiki", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
 * lstat64("/home/webuser/src/mediawiki/include_test", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
 * lstat64("/home/webuser/src/mediawiki/include_test/classes", {st_mode=S_IFDIR|0755, st_size=20480, ...}) = 0
 * lstat64("/home/webuser/src/mediawiki/include_test/classes/Class1.php", {st_mode=S_IFREG|0644, st_size=2058, ...}) = 0

That’s one stat for each directory. It does this for every single file you include. With require_once, PHP must call realpath (at the C level) to know what the actual path of the file is. Otherwise, it won’t know if require_once '../../../mydir/Class.php'; is the same as require_once '../mydir/Class.php';
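You can see the same problem from userland with PHP's own realpath() function. The sketch below builds a throwaway directory tree (the rp_demo_ directory and file names are made up for this example) and shows that two different spellings of a path only compare equal once they have been canonicalized.

```php
<?php
// Throwaway tree for the demo: <tmp>/rp_demo_xxx/mydir/Class.php and <tmp>/rp_demo_xxx/a/b
$root = sys_get_temp_dir() . '/rp_demo_' . uniqid();
mkdir($root . '/mydir', 0777, true);
mkdir($root . '/a/b', 0777, true);
file_put_contents($root . '/mydir/Class.php', '<?php' . "\n");

// Two different spellings that point at the same file.
$spellingOne = $root . '/a/b/../../mydir/Class.php';
$spellingTwo = $root . '/a/../mydir/Class.php';

// Comparing the raw strings says "different files".
$sameString = ($spellingOne === $spellingTwo);

// Comparing the canonical paths says "same file". Resolving them is
// what costs the extra per-directory stats shown in the strace output.
$sameFile = (realpath($spellingOne) === realpath($spellingTwo));

var_dump($sameString, $sameFile);
```

A plain string comparison would be free, but it would give the wrong answer here, which is why the _once variants have to pay for the path resolution.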

Note that it also must do this for every directory in your include_path, so if you don’t have that set up correctly, the problem gets even worse. Each one of these stats is a system call that takes time. More work for your servers and slower responses for your users.
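Here is a small sketch of the include_path lookup, using stream_resolve_include_path() to ask PHP where a relative name would be found. The directory layout and the DemoConfig.php file are invented for the example. The file lives only in the second include_path entry, so a relative include has to probe the first entry before it finds a match, while an absolute path skips the search.

```php
<?php
// Hypothetical layout: an empty directory listed first, the real one second.
$base = sys_get_temp_dir() . '/ip_demo_' . uniqid();
mkdir($base . '/empty', 0777, true);
mkdir($base . '/classes', 0777, true);
file_put_contents($base . '/classes/DemoConfig.php', '<?php' . "\n" . 'return "loaded";' . "\n");

$previous = set_include_path($base . '/empty' . PATH_SEPARATOR . $base . '/classes');

// Relative name: PHP walks the include_path entries in order until one matches.
$resolved = stream_resolve_include_path('DemoConfig.php');

// Absolute path: nothing to search, PHP goes straight to the file.
$value = require $base . '/classes/DemoConfig.php';

// Put the original include_path back.
set_include_path($previous);
```

In real code the usual way to get an absolute path is to build it from the including file's own location, for example with dirname(__FILE__), and to keep include_path short with the most-used directory first.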

Theory: The extra stats required for require_once and include_once introduce a lot of overhead for applications that include a lot of files.

A real-world test: At Wikia, we had a common include file that was loading all of our MediaWiki extensions. It had 113 calls to require_once and 172 calls to include_once. By changing these to require and include respectively, the results were significant.

First, strace revealed that there were 2848 syscalls to serve a page, down from 4782 (-40%). Next I went to ab for more testing, and found that the average page request time went down to 36.5 ms from 46.5 ms (-22%), and the server was able to serve 27.1 requests per second, up from 21.4 (+27%). View the complete output from ab

Conclusion: require_once does not perform as well as require. Don’t use require_once unless you need it. require will save system calls and deliver pages faster to end users. This also applies to the include/include_once counterparts.
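One common way to stop needing require_once for class files is an autoloader, since PHP only calls it when a class is not defined yet. The sketch below is one possible approach and not what we did at Wikia (we simply swapped the calls). The DemoWidget class, the temp directory, and the "Demo" prefix rule are all invented for the example.

```php
<?php
// Hypothetical class file written to a temp directory for the demo.
$classDir = sys_get_temp_dir() . '/al_demo_' . uniqid();
mkdir($classDir, 0777, true);
file_put_contents(
    $classDir . '/DemoWidget.php',
    '<?php' . "\n" . 'class DemoWidget { public $name = "widget"; }' . "\n"
);

$loadCount = 0;

// The autoloader runs only when a class is still undefined, so a plain
// require is enough: the same file can never be loaded twice this way.
spl_autoload_register(function ($class) use ($classDir, &$loadCount) {
    // Assumption for this sketch: only classes prefixed "Demo" live in $classDir.
    if (strpos($class, 'Demo') !== 0) {
        return;
    }
    $loadCount++;
    require $classDir . '/' . $class . '.php';
});

$first = new DemoWidget();   // triggers the autoloader, file is required
$second = new DemoWidget();  // class already defined, autoloader not called
```

The path handed to require is absolute, so this also sidesteps the include_path search.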

It would be great if someone would write up a tool that walked through your code base and made recommendations for these types of performance tweaks. Hmm….
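A first step toward such a tool could be as small as this sketch, which uses PHP's tokenizer to find require_once and include_once statements in a string of source code. The findOnceIncludes() name and the sample source are made up. It only reports where the calls are; deciding whether each one is safe to change is still up to you.

```php
<?php
/**
 * Return one entry per require_once / include_once statement in $source.
 * Hypothetical helper for illustration; each entry has 'line' and 'keyword'.
 */
function findOnceIncludes(string $source): array
{
    $hits = [];
    foreach (token_get_all($source) as $token) {
        // Tokens are arrays of [id, text, line]; single characters are plain strings.
        if (is_array($token) && in_array($token[0], [T_REQUIRE_ONCE, T_INCLUDE_ONCE], true)) {
            $hits[] = ['line' => $token[2], 'keyword' => strtolower($token[1])];
        }
    }
    return $hits;
}

// Made-up sample input. The tokenizer ignores the mention inside the comment.
$sample = <<<'PHP'
<?php
require_once 'a.php';
// require_once in a comment is ignored
include 'b.php';
include_once 'c.php';
PHP;

$hits = findOnceIncludes($sample);
foreach ($hits as $hit) {
    echo 'line ' . $hit['line'] . ': ' . $hit['keyword'] . "\n";
}
```

Using the tokenizer instead of a regular expression means comments and strings that merely mention require_once don't produce false hits. Walking a whole code base would just mean feeding each .php file's contents to the same function.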