Archive for the 'software' Category

Published by nick on 15 Jun 2009

What software developers can learn from BattleBots

I attended Robogames in San Francisco this past weekend with my 3 sons. We had a great time watching the robots try to kill each other. I let out a Tim Allen grunt at several points, especially when we saw one of the robots with a flame thrower.

I had some time for reflection in between matches, and my pattern-recognition-heavy brain kicked into gear to find out what was similar between the engineering efforts behind robot building and software engineering. Not everything lines up, but I did notice that some of my thoughts on simplicity and robustness carried over to the world of attack robots. In no particular order:

  • Staying moving is more important than having a good weapon. While it did occasionally happen that a kick ass weapon ended a match (and when it did, it was supremely cool!), it was certainly not the majority. Instead of weapons, most of the time the other robot was defeated by getting "stuck" in one way or another, either by being flipped upside down or stuck on an obstacle.

    How does this translate to the software world? If you have a great piece of software, but it doesn’t work, it won’t be successful. Never let any new feature compromise stability and robustness.

  • Simplicity wins - The more moving parts, the more things that can go wrong and break. It was easy to spot the robots that were over-engineered by the uber geeks. I laughed when I saw these robots quickly die because something simple went wrong. My favorite example was a group of "rocket scientists" that built a robot with two ginormous spinning wheels as weapons. Each wheel was turned by an elaborate contraption of gears and pulleys. You know the end of the story. Resist the temptation to over-engineer, and pride yourself on the simplest solution that accomplishes the goals. Or, to quote Albert Einstein,

    Everything should be made as simple as possible, but not simpler.

  • Test, Test, Test - You could really tell the difference between the "mature" robots that had seen a couple of fights and the robots whose owners were using them in the ring for the first time. Thorough testing should be done before a robot goes into battle, and before a product goes into the real world.

  • Quality construction and materials. This was almost as important as design. Flimsy aluminum and other cheap materials quickly gave way under battle, and poorly engineered welds or connections suffered the same fate. Same goes in software - spaghetti code may look like it’s going to work, but won’t stand up to a fierce battle.

It’s always fun when we can learn from the convergence of two different sciences.

Published by nick on 24 Apr 2009

Why on earth doesn’t javascript have a json_encode?!

Every major language now has tools for encoding data as JSON, a format that is natively read, understood, and used by Javascript. PHP’s json_encode works great; I use it all the time for data transport. The reasons for using XML are getting harder and harder to come by. JSON is much easier to work with.

Every major language has a built-in json_encode, except Javascript!

WTF? Irony at its best.

I’m using Douglas Crockford’s JSON.js in the meantime, but Firefox, Internet Explorer, Safari developers - please include support for this in a future release.

Update Jun 29 2009: I was listened to. :) Firefox 3.1+ and IE 8+ now have native JSON support. More info
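
For anyone who has not used a json_encode-style function, here is a minimal sketch of the round trip. It is written in Python rather than PHP or Javascript, and the `payload` data is made up for illustration. Python's standard `json.dumps` plays the role of PHP's json_encode and `json.loads` goes the other way; native browser support exposes the same pair as `JSON.stringify` and `JSON.parse`.

```python
import json

# Hypothetical data to hand to the browser; dicts, lists, strings,
# numbers and booleans all encode cleanly.
payload = {"user": "nick", "tags": ["php", "javascript"], "active": True}

# Encode: Python object -> JSON text (the job PHP's json_encode does).
encoded = json.dumps(payload)
print(encoded)

# Decode: JSON text -> Python object, which should equal the original.
decoded = json.loads(encoded)
print(decoded == payload)
```

The encoded string is plain text, so it can be dropped into an HTTP response and parsed on the other side without any XML tooling.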

Published by nick on 24 Feb 2009

PHP vs Java vs C/C++ for web applications

From Incremental Operations Blog, this call stack shows the layers involved in a [typical?] Java stack.

To be fair, this is not necessarily Java itself, but poorly written Java code. But based on my experience, this type of excessive architecture is accepted best practice in the Java world, and the architecture astronauts seem to gravitate to this technology.

The original pipe dream of Java was "build once, run anywhere". Since that hasn’t exactly materialized, Java is the bastard step child of programming. It doesn’t really fit in anywhere. It is neither high performance nor robust (C/C++), nor easy to program in (PHP/Python/Ruby). It’s awkwardly stuck in the middle, and doesn’t do either well. If you need performance that exceeds native PHP/Python capabilities (rare in the typical work place), use a C/C++ extension for the heavy lifting, and if that’s not enough, your app is at the top 1% of performance demand, and you need to use C/C++ directly.

I’ve heard the defenders of Java claim that it’s faster than C/C++, but fundamentally that’s not possible, since it’s a layer on top of C/C++. Yes, JIT this and caching that, but these add complexity, which violates my #1 rule of software design, and if you added those same JIT and caching layers to C/C++, it would be even faster.

I will give Java the win in 2 areas of web based programming:

  1. Where development time and performance do not matter, and data integrity is the absolute most important factor. For example, stock trading and banking sites. If I were asked to build E*trade.com, I would use Java and Oracle instead of PHP and MySQL. It would take 5 times longer to do everything, hardware/software costs would be 10x more, and the web site would be slower, but it would be the most robust solution.
  2. Where development time and performance do not matter, and there are advantages in maintaining advanced "state" information with transactions and rollbacks. For example, online poker sites and ticketmaster.com (advanced reservationing). I’m sure someone’s done that using Ajax, but I wouldn’t trust money flowing over such a system, and I’d recommend Java.

For the other 99.9% of web applications, scripting languages or C/C++ are a better choice. The complexity that Java introduces is despicable, and in my opinion, choosing Java does a disservice to your company in terms of cost (both development time and hardware).

Show me a web application that scales well in Java, and I’ll rewrite it in PHP in half the time, and it will be twice as fast with one more "9" in availability. If it’s still not fast enough, it needs to be done in C.

I am not always popular with this argument. Quite a few of my developer peers, whom I respect, have strong pro-Java arguments. I have a bet with one of these Java ninnies: I think that Java will be less prevalent in 5 years than it is now, because of its excels-at-nothing nature. There is a bottle of expensive tequila riding on this, so I expect to be right. We’ll see. ;-)

Published by nick on 13 Jan 2009

Ultimate vimrc file - Good for php, bash, ruby and others

I’ve built this one up for a few years, and now it’s time to share. Notable features:

  • Syntax highlighting
  • Test compile for syntax errors. This means that every time you write the file, it will check the file for syntax errors and alert you immediately. This saves much back and forth with development. It works with the following languages:
    1. php
    2. bash
    3. perl
    4. httpd.conf
    5. xml
    6. ruby
    7. puppet
    8. javascript (if you have jslint installed)
  • Tab completion of php functions (if you download Rasmus’s function list and put it in your home directory. curl -o ~/.phpfunclist.txt -v http://lerdorf.com/funclist.txt)

    Ok, enough teasing. The ultimate vimrc file can be found here
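
To give a feel for the check-on-save idea without reproducing the vimrc itself, here is a rough sketch in Python. Everything in it is my own assumption for illustration: the `CHECKERS` table and the `lint_command` helper are hypothetical, and the commands listed are common command-line syntax checks, not necessarily the ones the vimrc runs.

```python
import os

# Assumed syntax-check commands per file extension. These are widely
# available CLI checks, not a copy of what the vimrc actually does.
CHECKERS = {
    ".php": ["php", "-l"],
    ".sh": ["bash", "-n"],
    ".pl": ["perl", "-c"],
    ".rb": ["ruby", "-c"],
    ".xml": ["xmllint", "--noout"],
}

def lint_command(path):
    """Return the argv list that would syntax-check path, or None."""
    ext = os.path.splitext(path)[1].lower()
    checker = CHECKERS.get(ext)
    return checker + [path] if checker else None

print(lint_command("index.php"))  # ['php', '-l', 'index.php']
print(lint_command("notes.txt"))  # None
```

An editor hook would run the returned command after every write and show any error output, which is what saves the back and forth.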

Published by nick on 23 Sep 2008

Good bye OpenX. Hello Google Ad Manager.

Websites need ads. It’s one of the things that make the internet go ’round. That and porn.

If a startup wants to have a free solution for serving ads, there has really only been one choice for many years: OpenX, formerly known as phpAdsNew. OpenX has been in use at Wikia for quite some time. After hitting some brick walls with scalability, having downtime/slowness issues, and getting frustrated with basic functionality that wouldn’t work without taking down the server, I decided it was time to try something new.

I looked into Google Ad Manager over the past few days. It seems like it can do the job, and last night I wrote all the code. Today I switched all of the wikia.com websites from OpenX for serving Spotlight Ads to Google Ad Manager.

Here are the compelling reasons I found for switching.

  • OpenX is crap - It is possible to write high scale web applications in PHP/MySQL. I’ve done it, multiple times. OpenX has not. Sorry for being a bit arrogant here, but I will happily engage an OpenX architect and question numerous design decisions. As an example: Logging impressions to a relational database in real time is a horrible idea. Horrible. It will never scale. Telling people that the right way to solve this problem is by logging on the app servers? Even worse.
  • Google’s infrastructure - Even if OpenX wasn’t horrible, I still don’t want to have to worry about buying servers, system administration time, and bandwidth for my ad infrastructure. I put more faith in Google’s and Yahoo!’s infrastructure than anything a startup can build.
  • It’s easier to use - I found the interface and code setup far more intuitive than OpenX. So have the 4 other people that I’ve been working with to load ads. They love how simple Google Ad Manager is. That being said, there are a couple of less-than-intuitive things with Google Ad Manager, so it wasn’t completely painless. Maybe Apple needs to come out with iAdManager? :)
  • It’s Free - Ok. Did you guys hear that? Free. Free hosting of the graphics. Free server infrastructure. Estimates are that this will save Wikia.com $5000 a month in bandwidth and servers.

    Is this the death of OpenX? No. There are still some things that Google Ad Manager can’t do. There is also a bunch of technical weirdos that think Google has too much power, so they will continue to use OpenX out of fear.

    However - Google just flexed their muscle, and they pulled off a great first product. Good work Google.

    And if someone knows how to short OpenX stock, let me know. ;-)

Published by nick on 01 Aug 2008

PHP Performance tip: require versus require_once

One of the big performance oriented complaints with PHP is that it doesn’t do well with large frameworks that have a lot of included files. Symfony and Mediawiki are two that I’ve had this problem with.

Why is it slow to load a lot of files in PHP?

Let’s take a closer look.

Quick note: In this post I’ll assume you are already using a PHP Accelerator, such as APC, or Turk MMCache or eaccelerator. If not, you need to be. My personal pick is APC, mostly because it’s the preferred one at Yahoo!, which has the largest installation of PHP, and the author of PHP works there. The lead maintainer of APC also works there, so I feel good knowing that APC is well supported. There are rumors that PHP 6 will have this accelerator built in, and that it will be based on APC’s code.

With that PHP accelerator plug out of the way, let’s get back to business.

Normally when php does a require to include a file, it does a stat to see if the file has changed, and if not, loads it from the APC cache. Here’s what that looks like at the C level:

 * stat64("./classes/Class1.php", {st_mode=S_IFREG|0644, st_size=2057, ...}) = 0

Tip: Want to know how to look at what code is doing at the C level? Check out this tutorial on using strace to debug web apps

Nice, simple, clean, one stat per file. Note: with APC, you can set apc.stat to off, and this will skip the above stat call as well. The downside: You have to restart apache whenever you change your code.

Now let’s take a look at what happens when you use require_once instead of require:

 * lstat64("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
 * lstat64("/home/webuser", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
 * lstat64("/home/webuser/src", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
 * lstat64("/home/webuser/src/mediawiki", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
 * lstat64("/home/webuser/src/mediawiki/include_test", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
 * lstat64("/home/webuser/src/mediawiki/include_test/classes", {st_mode=S_IFDIR|0755, st_size=20480, ...}) = 0
 * lstat64("/home/webuser/src/mediawiki/include_test/classes/Class1.php", {st_mode=S_IFREG|0644, st_size=2058, ...}) = 0

That’s one stat for each directory. It does this for every single file you include. With require_once, php must call realpath (at the C level) to know what the actual path of the file is. Otherwise, it won’t know if require_once '../../../mydir/Class.php'; is the same as require_once '../mydir/Class.php';

Note that it also must do this for every directory in your include_path, so if you don’t have that set up correctly, this is exacerbated even more. Each one of these stats is a system call that takes time. More work for your servers and slower responses for your users.
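
The path problem is easy to see in a few lines. This is a sketch in Python rather than PHP, and the directory layout and helper names (`canonical`, `directories_checked`) are hypothetical. It shows two different relative spellings resolving to one canonical path, and lists the path prefixes that have to be examined along the way, which mirrors the lstat64 lines in the trace above.

```python
import os

def canonical(include_path, including_dir):
    """Resolve a relative include against the directory of the including file."""
    return os.path.realpath(os.path.join(including_dir, include_path))

def directories_checked(abs_path):
    """List every path prefix that is examined while resolving abs_path."""
    parts = [p for p in abs_path.split("/") if p]
    return ["/" + "/".join(parts[: i + 1]) for i in range(len(parts))]

# Hypothetical layout: two files in different directories include one class.
a = canonical("../../../mydir/Class.php", "/srv/app/a/b/c")
b = canonical("../mydir/Class.php", "/srv/app/x")
print(a == b)  # True: same file, so it must only be loaded once

# One entry per directory level, like one lstat per line in the trace.
for prefix in directories_checked("/home/webuser/src/mediawiki"):
    print(prefix)
```

The deeper the file sits in the tree, the longer that list gets, and it is walked again for every include.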

Theory: The extra stats required for require_once and include_once introduce a lot of overhead for applications that include a lot of files.

A real world test — At Wikia, we had a common include file that was loading all of our Mediawiki extensions. It had 113 calls to require_once and 172 calls to include_once. By changing these to require and include respectively, the results were significant.

First, strace revealed that there were 2848 syscalls to serve a page, down from 4782 (-40%). Next I went to ab for more testing, and found that the average page request time went down to 36.5ms, from 46.5ms (-22%), and the server was able to serve 27.1 requests per second, up from 21.4 (+27%). View the complete output from ab
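
For anyone who wants to check the arithmetic, this small Python sketch recomputes the relative changes from the before and after figures quoted in this post. The helper name `pct_change` is my own.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100

# Figures quoted in the post: (label, before, after).
measurements = [
    ("syscalls per page", 4782, 2848),
    ("avg request time (ms)", 46.5, 36.5),
    ("requests per second", 21.4, 27.1),
]

for label, before, after in measurements:
    print(f"{label}: {pct_change(before, after):+.1f}%")
# syscalls per page: -40.4%
# avg request time (ms): -21.5%
# requests per second: +26.6%
```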

Conclusion: require_once does not perform as well as require. Don’t use require_once unless you need it. require will save system calls and deliver pages faster to end users. This also applies to the include/include_once counterparts.

It would be great if someone would write up a tool that walked through your code base and made recommendations for these types of performance tweaks. Hmm….

Published by nick on 20 Jun 2008

Howto - Simple backups for Linux using rsnapshot

Backups are something you must master to be a great system administrator.

You’ve probably found this because you were looking for a simple backup solution. Yes, you’ve seen Amanda. And Bacula, but they aren’t simple. Amanda and Bacula are great products if you need all of their features, but if you are like me, you don’t want to spend time on your backups. You just want something that works.

My choice: rsnapshot. rsnapshot is a perl script that wraps around rsync. Its most beautiful feature: it uses hard links when it can, so if you are backing up the same unchanged file more than once, it just creates another link to it. This means backups only take up more space when files change. I’ve heard that this is how Apple’s Time Machine works. I’m now using rsnapshot in multiple production environments. Here’s a quick guide to setting up reliable, robust, efficient, self-rotating backups in just a few minutes with rsnapshot.

  1. Install rsnapshot, either by downloading/compiling the source code, or using this RPM for linux
  2. Edit /etc/rsnapshot.conf for your settings. Warning: The config file makes a distinction between tabs and spaces. Make sure you use tabs! Pay special attention to these settings:
    1. snapshot_root — this is where your backups will be stored.
    2. The backup section will define which directories you want backed up.
      backup /etc/ localhost/
      backup /root/ localhost/
      backup /home/ localhost/
    3. rsnapshot handles the rotation of backups for you! The interval section will define how many daily, weekly, etc. backups are kept. Example:
      interval hourly 6
      interval daily 7
      interval weekly 4
      interval monthly 3
  3. As a test, you can run: rsnapshot -v -t hourly and this will parse the config file, and show you the commands it will run when it runs hourly.
  4. After you are done tweaking the config file, it’s time to add the crontab entries for the various backups. Scheduling is a bit tricky. Here’s mine as an example:
    # There is a pid file that will prevent two from running at the same time.
    # This is why hourly starts after the others. Hourly should be skipped when daily/weekly/monthly is running.
    19 */3 * * * nice rsnapshot -v hourly
    18 1 * * * rsnapshot -v daily
    17 2 * * 0 rsnapshot -v weekly
    16 3 1 * * rsnapshot -v monthly

That’s it! You’re ready to go. Your backups will be stored in the directory you set as snapshot_root.
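
If the hard-link trick is new to you, this short Python sketch shows the idea on its own. It is not rsnapshot's code, and the directory and file names are made up: one file is written once, then "backed up" into a second snapshot directory as a hard link, and both names end up pointing at the same data on disk.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Two hypothetical snapshot directories, as a rotation might create them.
    old_dir = os.path.join(root, "hourly.1")
    new_dir = os.path.join(root, "hourly.0")
    os.makedirs(old_dir)
    os.makedirs(new_dir)

    old_copy = os.path.join(old_dir, "hosts")
    with open(old_copy, "w") as f:
        f.write("127.0.0.1 localhost\n")

    # The file has not changed, so the new snapshot gets a hard link, not a copy.
    new_copy = os.path.join(new_dir, "hosts")
    os.link(old_copy, new_copy)

    # Same inode means the data exists once; the link count is now 2.
    same_inode = os.stat(old_copy).st_ino == os.stat(new_copy).st_ino
    link_count = os.stat(new_copy).st_nlink
    print(same_inode, link_count)  # True 2
```

Deleting the older snapshot only removes one name. The data stays until the last link to it is gone, which is why rotation is safe.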

Published by nick on 02 Jun 2008

Debugging web apps with strace

Want to be an advanced debugger? My #1 Superman debugging tool is Linux’s strace. If you have ever run into problems where a user complains that the site is slow, and you can’t figure out why, you may want to give strace a try.

From http://sourceforge.net/projects/strace/:

strace is a system call tracer, i.e. a debugging tool which prints out a trace of all the system calls made by another process/program.

In other words, strace tells you what a program is doing, at the C function call level. This is great for finding the problems where a page just "hangs" for no apparent reason. Let’s walk through what it takes to set up strace on Apache in a LAMP environment, with some real world examples that I’ve run into.

First, you’ll need to install strace, if it isn’t already installed. My favorite method is just yum install strace, but if you want to, you can download and compile it yourself.

Next, you will need a place where you can test the slow page. For the rest of this article, we will assume you have a development environment that is all your own, where you can start/stop Apache at will, and no one else will be using it. Note: If a separate development environment isn’t available, I suggest running another Apache on a different port, say 81 instead of 80. This way you can still work on the production site without affecting end users.

Environment set up? Good. Let’s get down to debugging.

  1. Start Apache in "Debug Mode" with the -X option. This has Apache start one process, instead of a bunch of children, and then all the requests will go through one process.

    httpd -X

  2. In another terminal window, find the process id for the listening Apache that you just started. ps auxw | grep httpd should do the trick.
  3. Once you have the process id, attach strace with the -p option:

    strace -p $processidofapacheprocess

  4. Go to your browser and go to the url that is hanging. While it is running, watch the output from strace in your terminal window. You’ll see a ton of system calls stream by, but the important thing to look for is when it stops. What is it doing?
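
When the output scrolls by too fast to read, it can help to save it and summarize it afterwards. The sketch below is my own illustration in Python, not part of strace. It assumes the trace was captured as text (strace's -o option writes to a file), tallies calls by name, and reports the last call if it never returned, which is the one the process is stuck in. The sample lines come from the NFS example further down.

```python
import re
from collections import Counter

# A syscall line starts with a name and "(", e.g.: flock(25, LOCK_EX
CALL_RE = re.compile(r"^\s*(\w+)\(")
# A finished call ends with ") = <return value>".
RETURN_RE = re.compile(r"\)\s*=\s*\S")

def summarize(lines):
    """Return (counts per syscall name, last call that never returned or None)."""
    counts = Counter()
    pending = None
    for line in lines:
        match = CALL_RE.match(line)
        if not match:
            continue  # skip anything that is not a syscall line
        counts[match.group(1)] += 1
        # Only the most recent call can still be pending in a single process.
        pending = None if RETURN_RE.search(line) else line.strip()
    return counts, pending

sample = [
    'fcntl(24, F_SETFL, O_RDWR) = 0',
    'open("/home/phpsessions/sess_079113645a3da0fe50f68e4ce6ed58d2", O_RDWR|O_CREAT, 0600) = 25',
    'flock(25, LOCK_EX',
]
counts, pending = summarize(sample)
print(sum(counts.values()), "calls")
print("stuck in:", pending)  # stuck in: flock(25, LOCK_EX
```

The same tally is a quick way to compare syscall totals before and after a change.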

I’ve used this approach to find several "Superman" level problems (problems that other people had spent at least a day, and sometimes weeks, trying to figure out). Here are some examples.

  1. Sendmail hanging via PHP - The reported problem was that certain pages were slow (30-300 seconds). Load on the machines seemed fine, but certain requests were painfully slow. strace revealed that the PHP script was waiting for sendmail to come back with a response. Upon looking further, sendmail was doing a reverse dns lookup that was timing out, which resulted in a 30+ second delay. Problem resolved by reconfiguring sendmail.
  2. PHP pages slow on an NFS server - The reported problem was a development environment with pages that were slow to load. strace revealed that the pages were hanging at a flock call on a file in a directory that was mounted via NFS. Here’s the actual output from strace:

    …pages of output snipped…
    fcntl(24, F_SETFL, O_RDWR) = 0
    sendto(24, "incr toys:stats:request_with_ses"…, 40, MSG_DONTWAIT, NULL, 0) = 40
    poll([{fd=24, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN}], 1, 500) = 1
    recvfrom(24, "76\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4
    open("/home/phpsessions/sess_079113645a3da0fe50f68e4ce6ed58d2", O_RDWR|O_CREAT, 0600) = 25
    flock(25, LOCK_EX

    So we can see here that the file /home/phpsessions/sess_079113645a3da0fe50f68e4ce6ed58d2 has been opened, and the flock call is hanging. Turns out NFS doesn’t deal well with flock. When we saw this, there was a big smack on the forehead. Why on earth were the sessions being stored via NFS anyway? Especially for a development server, where only one box needed to store it. To solve the problem, we changed the session.save_path in the php configuration file to a directory that was not on NFS.

  3. Memcached hanging - Again, certain requests were hanging, causing pages to be slow to load. Again, strace to the rescue! Turns out PHP was hanging when talking to memcached. Once this was determined, we also ran strace on memcached, and found a bug with the particular memcached client we were using via PHP. We upgraded the memcached client to the latest version, and the problem was solved.

In all of the above cases, the problem could have been found through other means, but strace made it much easier and faster to figure out where the slowdowns were.

There are other helpful uses of strace. In addition to finding hanging web pages, I’ve also used strace to find why/where Apache was segfaulting. Just run strace and look at the last thing it did. It should give you an indication of why the script stopped when it did.

Also, I’ve used strace as a troubleshooting tool to find out where most of the time is being spent by analyzing the entire request.

Good luck in your adventures with strace. It’s been a big help for me. Feel free to leave a comment with your findings.

Published by nick on 25 Apr 2008

Debugging rules

My co-worker, Artur Bergman, gave a talk today at the Web 2.0 Expo.

He highlighted these rules of debugging, and they were great, so I wanted to share.

  1. Understand the system
  2. Make it fail
  3. Quit thinking and look
  4. Divide and conquer
  5. Change one thing at a time
  6. Keep an audit trail
  7. Check the plug
  8. Get a fresh view
  9. If you didn’t fix it, it ain’t fixed.

Brilliant! See http://www.debuggingrules.com/

Published by nick on 24 Mar 2008

Code reviews before every commit?

First, a little cartoon about measuring code quality.

How true this is! See the original article here:
http://www.osnews.com/story/19266/WTFs_m

All developers talk about code reviews, but most of the time, it’s one of those things that there is never enough time for, and always takes a back seat to other priorities, which means it’s easily left off.

I work part time at Wikinvest.com, and we have a policy that requires that all svn commits are code reviewed first. When this policy was first introduced, I thought it was draconian and would slow things down.

I kicked and screamed inside, thinking that it was unnecessary and was going to hamper productivity. I was told that this was common practice everywhere, which I knew wasn’t true. I went along — begrudgingly. Since I’m part time, it’s not my place to try to shape development policies.

However, now that I’ve done this for a few months, I’ve come to feel differently. Here are the advantages I’ve found in having every commit code reviewed.

  1. Fewer bugs - this is obvious. More mistakes are caught. The live site has fewer bugs. Duh. The evidence: Wikinvest.com has fewer live site bugs than any place I’ve worked. The time savings from this benefit alone more than make up for the total time spent on reviews.
  2. Higher initial code quality - Why? Hacks will be reduced. If you know that your code is going to be reviewed by someone else, you’ll take that extra time to write it well. Your name and reputation are on the line, and it will be presented to someone else for review. You know this as you are writing the code, and you’ll be encouraged to treat the project like a high quality craftsman would. Comments will be clearer. Edge cases will be more thought through. Logical flow will be enhanced. In general, developers will take more pride in their work.
  3. Enforced cross training - Every new piece of functionality, feature or fix has at least one other person that has seen it, and taken the time to understand it. This ensures better code coverage across the team in case someone leaves or goes on vacation. Development managers, this will solve one of your biggest problems! When someone changes something on their own without letting anyone know, and the live site breaks, now you have at least one other person that might say "I think I know what it might be."
  4. Developers learn from each other - I can attest that my code quality has improved because of code reviews. I’ve learned better practices, both by seeing others’ code closely, and by having others critique my code. I’ve been in several code reviews where I’ve told people about a better approach because I knew about some special function in PHP or I had previously solved the problem in a more elegant way. Big thumbs up here.
  5. Coding standards and conventions are better enforced - Even with the best intentions, we can miss the way the rest of the group is doing something. Perhaps we shouldn’t have that particular variable hard coded in the script, because someone else has already moved it to a configuration file. Or perhaps there is a new standard way for getting a database connection, but you missed the e-mail that announced the new practice. Code review to the rescue! It’s always good to go along with what the group is doing and adopt standards. Get two sets of eyes on the code to make sure.

The extra time spent code reviewing every commit is easily offset by the benefits above, and I’m sold that this is the right approach.

Here are some best practices to consider if you are implementing this in your development environment.

  1. Set up a process for getting the reviews done and commits to happen quickly.
  2. Spread it around! Everyone should review everyone else’s code. Discourage silos. Remember, one of the key benefits is to have developers learn from each other. If two buddies always review each other’s code, you aren’t getting all the benefits from spreading knowledge across the team.
  3. Discourage any exceptions. The only exceptions are "trivial" changes like spacing or comments.
  4. The svn commit message should contain the name of the person that did the code review.

Instituting code reviews before every commit won’t be a popular approach for everyone. People will kick and scream, like I did. The chief argument will probably be time. "It will slow us down too much". Yes, commits will be slower. But overall development will be much more efficient, and there will be fewer bugs. Would you rather spend a little time on code reviews now, or more time on bugs later?

In doubt? Give it a try and see what happens to the code quality and developer engagement.
