Archive for the 'software' Category

Published by nick on 20 Jun 2008

Howto - Simple backups for Linux using rsnapshot

Backups are something you must master to be a great system administrator.

You’ve probably found this because you were looking for a simple backup solution. Yes, you’ve seen Amanda and Bacula, but they aren’t simple. They are great products if you need all of their features, but if you are like me, you don’t want to spend time on your backups; you just want something that works.

My choice: rsnapshot. rsnapshot is a Perl script that wraps around rsync. Its most beautiful feature: it uses hard links when it can, so if a file is unchanged since the previous backup, it just creates a link instead of storing a second copy. This means backups only take up more space when files change. I’ve heard that this is how Apple’s Time Machine works. I’m now using rsnapshot in multiple production environments. Here’s a quick how-to guide for setting up reliable, robust, efficient, self-rotating backups in just a few minutes with rsnapshot.
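To see the hard-link trick in isolation, here is a minimal Python sketch. It is not rsnapshot's own code, and the file names are made up for illustration; it only shows that two directory entries can share one copy of the data.

```python
import os
import tempfile


def snapshot_with_hard_link(source_path, snapshot_path):
    """Create snapshot_path as a hard link to source_path (no data is copied)."""
    os.link(source_path, snapshot_path)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical names standing in for the same file in two snapshots.
    original = os.path.join(root, "hourly.1-passwd")
    linked = os.path.join(root, "hourly.0-passwd")
    with open(original, "w") as handle:
        handle.write("root:x:0:0:root:/root:/bin/bash\n")

    snapshot_with_hard_link(original, linked)

    first, second = os.stat(original), os.stat(linked)
    same_inode = first.st_ino == second.st_ino  # both names point at one copy of the data
    link_count = second.st_nlink                # how many names that copy has
    print(same_inode, link_count)
```

Because both names share one inode, the second "backup" costs a directory entry rather than another copy of the file.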

  1. Install rsnapshot, either by downloading and compiling the source code, or by using this RPM for Linux.
  2. Edit /etc/rsnapshot.conf for your settings. Warning: The config file makes a distinction between tabs and spaces. Make sure you use tabs! Pay special attention to these settings:
    1. snapshot_root — this is where your backups will be stored.
    2. The backup section will define which directories you want backed up.
      backup	/etc/	localhost/
      backup	/root/	localhost/
      backup	/home/nick/	localhost/
    3. rsnapshot handles the rotation of backups for you! The interval section will define how many daily, weekly, etc. backups are kept. Example:
      interval	hourly	6
      interval	daily	7
      interval	weekly	4
      interval	monthly	3
  3. As a test, you can run: rsnapshot -v -t hourly and this will parse the config file, and show you the commands it will run when it runs hourly.
  4. After you are done tweaking the config file, it’s time to add the crontab entries for the various backups. Scheduling is a bit tricky. Here’s mine as an example:
    # There is a pid file that will prevent two from running at the same time.
    # This is why hourly starts after the others. Hourly should be skipped when daily/weekly/monthly is running.
    19 */3 * * * nice rsnapshot -v hourly
    18 1 * * * rsnapshot -v daily
    17 2 * * 0 rsnapshot -v weekly
    16 3 1 * * rsnapshot -v monthly

That’s it! You’re ready to go. Your backups will be stored in the directory you set as snapshot_root.
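Since the tabs-versus-spaces rule is the easiest thing to get wrong, a quick check can help. This is a rough sketch of my own, not part of rsnapshot: it only flags settings that contain no tab at all, and the snapshot_root path in the sample is hypothetical. The rsnapshot -t test run above remains the real validation.

```python
def lines_without_tabs(config_text):
    """Return 1-based line numbers of settings that contain no tab character."""
    suspects = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments need no separator
        if "\t" not in line:
            suspects.append(number)
    return suspects


sample = (
    "# sample fragment\n"
    "snapshot_root\t/backup/snapshots/\n"
    "interval\thourly\t6\n"
    "backup /etc/ localhost/\n"  # spaces instead of tabs
)
print(lines_without_tabs(sample))
```

To check a real file, pass in the result of open("/etc/rsnapshot.conf").read() instead of the sample string.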

Published by nick on 02 Jun 2008

Debugging web apps with strace

Want to be an advanced debugger? My #1 Superman debugging tool is Linux’s strace. If you have ever run into problems where a user complains that the site is slow, and you can’t figure out why, you may want to give strace a try.

From http://sourceforge.net/projects/strace/:

strace is a system call tracer, i.e. a debugging tool which prints out a trace of all the system calls made by another process/program.

In other words, strace tells you what a program is doing at the system call level: every request it makes to the kernel to open a file, read from a socket, take a lock and so on. This is great for finding the problems where a page just “hangs” for no apparent reason. Let’s walk through what it takes to set up strace on Apache in a LAMP environment, with some real world examples that I’ve run into.

First, you’ll need to install strace, if it isn’t already installed. My favorite method is just yum install strace, but if you want to, you can download and compile it yourself.

Next, you will need a place where you can test the slow page. For the rest of this article, we will assume you have a development environment that is all your own, where you can start/stop Apache at will, and no one else will be using it. Note: If a separate development environment isn’t available, I suggest running another Apache on a different port, say 81 instead of 80. This way you can still work on the production site without affecting end users.

Environment set up? Good. Let’s get down to debugging.

  1. Start Apache in “Debug Mode” with the -X option. This has Apache start one process, instead of a bunch of children, and then all the requests will go through one process.

    httpd -X

  2. In another terminal window, find the process id for the listening Apache that you just started. ps auxw | grep httpd should do the trick.
  3. Once you have the process id, attach strace with the -p option:

    strace -p $processidofapacheprocess

  4. Go to your browser and go to the url that is hanging. While it is running, watch the output from strace in your terminal window. You’ll see a ton of system calls stream by, but the important thing to look for is when it stops. What is it doing?

I’ve used this approach to find several “Superman” level problems (problems that other people spent at least a day trying to figure out what was going on — sometimes weeks). Here are some examples.

  1. Sendmail hanging via PHP - The reported problem was that certain pages were slow (30-300 seconds). Load on the machines seemed fine, but certain requests were painfully slow. strace revealed that the PHP script was waiting for sendmail to come back with a response. Upon looking further, sendmail was doing a reverse dns lookup that was timing out, which resulted in a 30+ second delay. Problem resolved by reconfiguring sendmail.
  2. PHP pages slow on an NFS server - The reported problem was a development environment with pages that were slow to load. strace revealed that the pages were hanging at a flock call to a directory that was mounted via NFS. Here’s the actual output from strace:

    …pages of output snipped…
    fcntl(24, F_SETFL, O_RDWR) = 0
    sendto(24, "incr toys:stats:request_with_ses"..., 40, MSG_DONTWAIT, NULL, 0) = 40
    poll([{fd=24, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN}], 1, 500) = 1
    recvfrom(24, "76\r\n", 8192, MSG_DONTWAIT, NULL, NULL) = 4
    open("/home/phpsessions/sess_079113645a3da0fe50f68e4ce6ed58d2", O_RDWR|O_CREAT, 0600) = 25
    flock(25, LOCK_EX

    So we can see here that the file /home/phpsessions/sess_079113645a3da0fe50f68e4ce6ed58d2 has been opened, and the flock call is hanging. Turns out NFS doesn’t deal well with flock. When we saw this, there was a big smack on the forehead. Why on earth were the sessions being stored via NFS anyway? Especially for a development server, where only one box needed to store it. To solve the problem, we changed the session.save_path in the php configuration file to a directory that was not on NFS.

  3. Memcached hanging - Again, certain requests were hanging, causing pages to be slow to load. Again, strace to the rescue! Turns out PHP was hanging when talking to memcached. Once this was determined, we also ran strace on memcached, and found a bug with the particular memcached client we were using via PHP. We upgraded the memcached client to the latest version, and the problem was solved.
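The flock behaviour from the second example can be reproduced on a local disk. Here is a minimal Python sketch, assuming a Linux box; the temporary file merely stands in for a PHP session file. It uses the non-blocking flag so that it reports the conflict instead of hanging the way the PHP request did.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(handle):
    """Try to take an exclusive flock without waiting; return True on success."""
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


# Temporary file standing in for a session file.
fd, session_path = tempfile.mkstemp(prefix="sess_")
os.close(fd)

first = open(session_path, "r+")
second = open(session_path, "r+")
got_first = try_exclusive_lock(first)    # the first request takes the lock
got_second = try_exclusive_lock(second)  # without LOCK_NB this call would wait
print(got_first, got_second)

first.close()
second.close()
os.remove(session_path)
```

Without LOCK_NB, the second call simply waits until the lock is released, which is exactly what a stalled flock(25, LOCK_EX line in strace looks like.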

In all of the above cases, the problem could have been found through other means, but strace made it much easier and faster to figure out where the slowdowns were.

There are other helpful uses of strace. In addition to finding hanging web pages, I’ve also used strace to find why/where Apache was segfaulting. Just run strace and look at the last thing the process did. It should give you an indication of why the script stopped when it did.

Also, I’ve used strace as a troubleshooting tool to find out where most of the time is being spent by analyzing the entire request.
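For that kind of timing analysis, strace's -T option appends the time spent in each call in angle brackets, and -o writes the trace to a file (for example strace -T -p $processidofapacheprocess -o trace.txt). The Python sketch below is my own helper rather than anything shipped with strace, and the sample trace lines are invented for illustration; it just sorts the calls by duration.

```python
import re

# Matches lines such as: flock(25, LOCK_EX) = 0 <30.004512>
CALL_WITH_DURATION = re.compile(r"^(?P<name>\w+)\(.*<(?P<seconds>\d+\.\d+)>\s*$")


def slowest_calls(trace_text, limit=3):
    """Return (seconds, line) pairs for the slowest system calls, slowest first."""
    timed = []
    for line in trace_text.splitlines():
        match = CALL_WITH_DURATION.match(line.strip())
        if match:
            timed.append((float(match.group("seconds")), line.strip()))
    return sorted(timed, reverse=True)[:limit]


# Invented sample in the shape of `strace -T` output.
sample_trace = """\
open("/home/phpsessions/sess_example", O_RDWR|O_CREAT, 0600) = 25 <0.000031>
flock(25, LOCK_EX) = 0 <30.004512>
read(25, "", 8192) = 0 <0.000012>
"""
for seconds, line in slowest_calls(sample_trace, limit=1):
    print(seconds, line)
```

Lines that do not end in a duration, such as signals or unfinished calls, are skipped rather than guessed at.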

Good luck in your adventures with strace. It’s been a big help for me. Feel free to leave a comment with your findings.

Published by nick on 25 Apr 2008

Debugging rules

My co-worker, Artur Bergman, gave a talk today at the Web 2.0 Expo.

He highlighted these rules of debugging, and they were great, so I wanted to share.

  1. Understand the system
  2. Make it fail
  3. Quit thinking and look
  4. Divide and conquer
  5. Change one thing at a time
  6. Keep an audit trail
  7. Check the plug
  8. Get a fresh view
  9. If you didn’t fix it, it ain’t fixed.

Brilliant! See http://www.debuggingrules.com/

Published by nick on 24 Mar 2008

Code reviews before every commit?

First, a little cartoon about measuring code quality.

How true this is! See the original article here:
http://www.osnews.com/story/19266/WTFs_m

All developers talk about code reviews, but most of the time, it’s one of those things that there is never enough time for, and always takes a back seat to other priorities, which means it’s easily left off.

I work part time at Wikinvest.com, and we have a policy that requires all svn commits to be code reviewed first. When this policy was first introduced, I thought it was draconian and would slow things down.

I kicked and screamed inside, thinking that it was unnecessary and was going to hamper productivity. I was told that this was common practice everywhere, which I knew wasn’t true. I went along — begrudgingly. Since I’m part time, it’s not my place to try to shape development policies.

However, now that I’ve done this for a few months, I’ve come to feel differently. Here are the advantages I’ve found in having every commit code reviewed.

  1. Fewer bugs - this is obvious. More mistakes are caught, so the live site has fewer bugs. The evidence: Wikinvest.com has fewer live site bugs than any place I’ve worked. The time saved on fixing bugs alone more than makes up for the total time spent on code reviews.
  2. Higher initial code quality - Why? Hacks will be reduced. If you know that your code is going to be reviewed by someone else, you’ll take that extra time to write it well. Your name and reputation are on the line, and your work will be presented to someone else for review. You know this as you are writing the code, and you’ll be encouraged to treat the project like a high quality craftsman would. Comments will be clearer. Edge cases will be more thought through. Logical flow will be enhanced. In general, developers will take more pride in their work.
  3. Enforced cross training - Every new piece of functionality, feature or fix has at least one other person that has seen it, and taken the time to understand it. This ensures better knowledge of the code across the team in case someone leaves or goes on vacation. Development managers: this will solve one of your biggest problems! When someone changes something on their own without letting anyone know, and the live site breaks, now you have at least one other person that might say “I think I know what it might be.”
  4. Developers learn from each other - I can attest that my code quality has improved because of code reviews. I’ve learned better practices, both by seeing others’ code closely, and by having others critique my code. I’ve been in several code reviews where I’ve told people about a better approach because I knew about some special function in PHP or I had previously solved the problem in a more elegant way. Big thumbs up here.
  5. Coding standards and conventions are better enforced - Even with the best intentions, we can miss the way the rest of the group is doing something. Perhaps we shouldn’t have that particular variable hard coded in the script, because someone else has already moved it to a configuration file. Or perhaps there is a new standard way for getting a database connection, but you missed the e-mail that announced the new practice. Code review to the rescue! It’s always good to go along with what the group is doing and adopt standards. Get two sets of eyes on the code to make sure.

When you weigh these advantages, the extra time spent reviewing every commit is easily offset, and I’m sold that this is the right approach.

Here are some best practices to consider if you are implementing this in your development environment.

  1. Set up a process for getting the reviews done and commits to happen quickly.
  2. Spread it around! Everyone should review everyone else’s code. Discourage silos. Remember, one of the key benefits is to have developers learn from each other. If two buddies always review each other’s code, you aren’t getting all the benefits from spreading knowledge across the team.
  3. Discourage any exceptions. The only exceptions are “trivial” changes like spacing or comments.
  4. The svn commit message should contain the name of the person that did the code review.
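The last rule is easy to check mechanically. As a rough Python sketch, assuming a "Reviewed by: name" line in the commit message (that format is my own invention here, not a Subversion or Wikinvest convention), a helper could pull out the reviewer like this:

```python
import re

# Hypothetical convention: a line of the form "Reviewed by: <name>".
REVIEWER_LINE = re.compile(r"^reviewed by:[ \t]*(\S.*)$", re.IGNORECASE | re.MULTILINE)


def reviewer_from_message(message):
    """Return the reviewer named in a commit message, or None if there is none."""
    match = REVIEWER_LINE.search(message)
    return match.group(1).strip() if match else None


print(reviewer_from_message("Fix session path\nReviewed by: alice"))
print(reviewer_from_message("Quick fix, no review"))
```

A Subversion pre-commit hook could feed the pending log message to a check like this and reject the commit when it returns None.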

Instituting code reviews before every commit won’t be a popular approach for everyone. People will kick and scream, like I did. The chief argument will probably be time: “It will slow us down too much”. Yes, commits will be slower. But overall development will be much more efficient, and there will be fewer bugs. Would you rather spend a little time on code reviews now, or more time on bugs later?

In doubt? Give it a try and see what happens to the code quality and developer engagement.

Published by nick on 09 Jan 2008

Checking for connection speed in Linux

Sure, you have a Gigabit card, but how can you check what speed you are actually running at in Linux? All your networking equipment must be Gigabit for it to work. There are two tools for this: /usr/sbin/ethtool and /sbin/mii-tool.

    sudo /usr/sbin/ethtool eth0
    Password:
    Settings for eth0:
        Supported ports: [ MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Current message level: 0x000000ff (255)
        Link detected: yes
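If you want to check the speed from a script, for example across a rack of machines, the output is easy to parse. Here is a minimal Python sketch; the helper name is my own, and the sample is a trimmed copy of the output above.

```python
def parse_ethtool(output):
    """Turn the 'Key: value' lines of ethtool output into a dictionary."""
    settings = {}
    for line in output.splitlines():
        key, separator, value = line.strip().partition(":")
        if separator and value.strip():
            settings[key.strip()] = value.strip()
    return settings


sample_output = """\
Settings for eth0:
    Speed: 1000Mb/s
    Duplex: Full
    Auto-negotiation: on
    Link detected: yes
"""
link = parse_ethtool(sample_output)
print(link["Speed"], link["Duplex"])
```

Header lines and the continuation lines of the link mode lists carry no "Key: value" pair, so the sketch skips them.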

Enjoy.

Published by nick on 04 Jan 2008

Ultimate Squid Init Script for Stop Start Restart

 I took some time to improve upon someone else’s squid init script for Linux. I’m sharing it here:

    http://www.techyouruniverse.com/squid-init-script/

   Notable improvements: 

  • Added clearcache method for clearing the squid cache. Surprisingly, this isn’t built into squid, and when you restart squid, you only clear the cache that is in memory. This clears the cache that is on disk too. Usage:
    /etc/rc.d/init.d/squid clearcache      

  • Added configtest method for checking syntax of config file, just like the one for httpd.
  • Optimized restart/stop for minimum downtime. Notably, the stop was set to allow http connections to die gracefully. I wasn’t happy with this because it caused 10-15 seconds of downtime whenever squid was restarted. Changed to use squid -k interrupt instead of squid -k shutdown, so squid closes the connections immediately instead of waiting for them to finish. Shut down takes 1-2 seconds now instead of 10-15.
  • Cleaned up config variables
  • Cleaned up formatting of output

This script has been used in production for a while, but if you find any issues, please leave a comment so that I can address them. Other comments and suggestions are appreciated. By the way, if you are the original author of the script, I’ll gladly give credit; just leave a comment.

Published by nick on 29 Oct 2007

Leopard Success

For anyone waiting to see how the early adopters do, I went to the Mac OS X Leopard release at the Apple Store in San Francisco and bought myself a Family Pack. I’m happy to report that the Leopard upgrade has been successful on 3 different computers now.

  • 1 MacBook Pro
  • 1 iBook G4
  • 1 iMac G4

Very stable on all 3, and it seems faster to me. Maybe this is just the “fresh install” effect though.

Important software that I can say works without any compatibility issues:

  • Quicksilver
  • Microsoft Office
  • Cisco VPN
  • Firefox
  • Camino
  • Adium
  • smc fan control

My key “wow” features:

  1. Spaces kicks ass. This was my biggest reason for upgrading.
  2. Mail 3.0 supports to-do lists that are stored on the IMAP server, so I can access them from anywhere.
  3. Safari 3.0 has the coolest “find” feature I’ve ever seen in a browser.

Published by nick on 26 Oct 2007

Installing Trac with subversion on Cent OS 5

Trac is a decent bug tracker that is getting a lot of attention these days. It’s very simple to use, which I like. It includes subversion integration so that you can walk through your source code tree and view diffs from one version to another. Handy.

However, Trac’s installation process has a lot of room for improvement in the simplicity department. Here are the steps I went through to install trac on Cent OS 5. I captured them so that others may find it easier to get Trac up and running. Enjoy.

If these instructions work for you, please leave a comment! If they don’t, please leave a comment!

  1. Install python and its goodies.
    1. Use yum to install the base package: yum install python
    2. Install mod_python. This is so Apache can run the python scripts as a module:
      yum install mod_python
    3. Install MySQL-python so that python can interact with mysql
      1. Download and untar MySQL-python from http://sourceforge.net/projects/mysql-python
      2. cd $mysqlpythonsourcedir
      3. python setup.py build && python setup.py install
  2. Install some devel packages that are needed for compiling trac and svn integration:
    yum install neon neon-devel python-devel swig
  3. Install Clearsilver, a templating package that trac depends on.
    1. Download and untar clearsilver from http://www.clearsilver.net/downloads/
    2. Compile: cd $clearsilversourcedir; ./configure && make && make install
  4. Install trac!
    1. Download and untar trac from http://trac.edgewall.org/wiki/TracDownload
    2. cd $tracsourcedir; python ./setup.py install
    3. Initialize your first trac project:
      trac-admin $pathtoyourtracproject initenv
    4. Tell Apache about trac. Add the following lines to your Apache configuration file:
        <Location /trac>
          SetHandler mod_python
          PythonHandler trac.web.modpython_frontend
          PythonOption TracEnv $pathtoyourtracproject
          PythonOption TracUriRoot /trac
        </Location>
      

Finally, restart httpd and try it out by going to:

http://$yourhost/trac/

If it doesn’t work, check in your webserver error log for errors:
tail -f /var/log/httpd/error_log

Again, please leave a comment so that you may help others with the same problems you have.

Published by nick on 17 Oct 2007

mysql replication - always have more than one slave

Inevitably mysql replication gets borked. Some sql statement will fail, which stalls mysql replication. While this can be mitigated by restricting access on the slave, anyone who has used mysql replication in production knows that it can, and will, happen. So plan for it.

You will see the failed statement when you run SHOW SLAVE STATUS. Tip: Use \G instead of the semicolon at the end of SQL statements; it produces more readable output.

You can try to skip the failed statement with:

mysql> STOP SLAVE;
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
mysql> START SLAVE;

But that can’t solve all problems, as that sql statement you just skipped may have been needed to keep the data in sync. So what do you do? It’s time to reset the data on the slave. That’s right, wipe it and grab a fresh copy from the master. It’s the only way.

To grab a fresh copy of the data from the master, you have several options.

  1. Run a mysqldump on the master with --master-data and reimport on slave. Note that while the mysqldump is running on the master, tables are locked and the database is unusable.
  2. Stop mysql on the master and take a snapshot of all the data using tar or a filesystem snapshot
  3. Use ibbackup. Note that ibbackup costs money and it will only work for InnoDB files, so you have to do something else for your MyISAM tables.

The problem with all of these solutions? They require downtime on the master database. That may be fine for some situations, but everywhere I’ve worked having downtime on the master data isn’t a good thing. So this is where you wish that you had thought ahead, and you had another slave that you could grab the master snapshot from. Aha!

Always have more than one slave. Put it on a piece of crap box somewhere. It’s another copy of your data for redundancy, and it gives you another place besides the master to grab the data from in the event that you need a snapshot of the data.

How do you rebuild a slave from another slave you ask? I’ll write up a step by step guide some day, but here’s a rough overview off the top of my head:

  1. Stop mysql on the working slave
  2. tar up all the files. This is everything in the “data” directory and the ibdata* files if you are using InnoDB.
  3. Make a copy of the master.info file
  4. Start mysql on the working slave
  5. Stop mysql on the broken slave.
  6. Untar all the data files.
  7. Edit the my.cnf on the broken slave, and add "skip_slave_start=1" in the [mysqld] section.
  8. Start mysql on broken slave. Note that replication won’t be running yet because of the skip_slave_start line you added in the previous step.
  9. Get the position from the master.info, and use “CHANGE MASTER TO” to reset the log file position on the broken slave to match the one on the working slave
  10. Run START SLAVE on the broken slave, then check slave status with SHOW SLAVE STATUS to see if it worked.
  11. Remove the skip_slave_start from the my.cnf file so the slave will start next time mysql is restarted.
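Step 9 is the fiddly part, so here is a rough Python sketch of it. It assumes the master.info layout I am used to (first line a line count, then the master log file name, then the position); check the file on your own server before trusting that assumption. The sample values are made up.

```python
def change_master_statement(master_info_text):
    """Build a CHANGE MASTER TO statement from the text of a master.info file."""
    lines = master_info_text.splitlines()
    # Assumed layout: line 1 is a line count, line 2 the log file, line 3 the position.
    log_file = lines[1].strip()
    log_position = int(lines[2].strip())
    return (
        "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d;"
        % (log_file, log_position)
    )


# Made-up sample; a real file has more lines (host, user, password and so on).
sample_master_info = "15\nmysql-bin.000042\n107\nmaster.example.com\nrepl\n"
print(change_master_statement(sample_master_info))
```

Run the printed statement on the broken slave before START SLAVE, using the master.info copy you saved from the working slave in step 3.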

Published by nick on 25 Sep 2007

JSON vs XML vs serialize() for data

Who is this Jason guy and what does he want with our data? JSON stands for JavaScript Object Notation, and I think there are some pretty compelling reasons to use it all the time instead of PHP’s serialize() function, and maybe even to replace XML under some circumstances.

Why use JSON instead of XML for Asynchronous Javascript requests for your favourite web application? Well, why not? After all, it’s the simplest approach.

We’re going to build a widget for looking up the cities within 5 miles of a user supplied zip code. We want the user to enter their zip code into a form, and without reloading the page, use Javascript to go get data from our web service, and display it on the page. Let’s look at using JSON for the data type versus using XML.

JSON for web service

Server side:


$locations=getLocationsForZip($zip);
echo json_encode ($locations);

Client side:


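// fetch() stands in for whatever helper returns the response body as a string
var json=fetch(url);
// JSON.parse is safer than eval(), which would run any script in the response
var locations=JSON.parse(json);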

XML approach

Server side:


$locations=getLocationsForZip($zip);
if (!empty($locations)){
  echo '<?xml version="1.0" encoding="utf-8"?>' . "\n";
  echo "<locations>\n";
  // Assumes each $location is an array with city, state and country keys.
  foreach ($locations as $location){
    echo "<location>\n";
    echo "<city>" . htmlspecialchars(utf8_encode($location['city'])) . "</city>\n";
    echo "<state>" . htmlspecialchars(utf8_encode($location['state'])) . "</state>\n";
    echo "<country>" . htmlspecialchars(utf8_encode($location['country'])) . "</country>\n";
    echo "</location>\n";
  }
  echo "</locations>\n";
}

Client side:


var xml=fetch(url);
// TODO: write nasty code for parsing XML into DOM
// More nasty code to iterate through the DOM object and put it into a usable array.

Now, that makes the most sense for Asynchronous Javascript requests, but going further, does it make sense to do it even for normal data transport mechanisms?

Language Neutral Data Storage

We need to store data on the file system or in a database, and it needs to be programming language independent. Historically, XML has been the obvious choice for this task. I’ve seen some people use serialized PHP (yuck). You could roll your own pipe delimited or CSV kludge. I think JSON is the best choice.

  1. JSON is faster to work with than XML, both in construction and parsing. Those of us who have tried to build scalable applications using XML/XSL have learned... not to.
  2. JSON is language neutral, and built in! Every major language now has JSON encode/decode capabilities.
  3. JSON is simpler. Just run json_encode() via PHP, parse the result in Javascript, and you’re done. See above examples.
  4. JSON is more compact than XML. Less data on the disk, less data over the wire.
  5. JSON is defined as Unicode text, so it handles encoding issues for you (for the most part). PHP’s serialize() does NOT, and this will cause problems for you when you internationalize.
  6. JSON maintains structure and objects, unlike pipe delimited or CSV hacks.
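To back up the "language neutral", "simpler" and "more compact" points from another language, here is a small Python sketch using only the standard library. The location data is made up, and a single record is an illustration, not a benchmark.

```python
import json
import xml.etree.ElementTree as ET

# Made-up sample data in the shape of the zip code example.
locations = [{"city": "San Francisco", "state": "CA", "country": "US"}]

# JSON: one call each way, and the structure survives the round trip.
as_json = json.dumps(locations)
round_tripped = json.loads(as_json)

# XML: the same data needs explicit element building.
root = ET.Element("locations")
for location in locations:
    node = ET.SubElement(root, "location")
    for field, value in location.items():
        ET.SubElement(node, field).text = value
as_xml = ET.tostring(root, encoding="unicode")

print(len(as_json), len(as_xml))
```

Reading the XML back into a list of dictionaries would take another loop, while json.loads already returned one.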

The more I use JSON for building my web applications, the more I’m finding that my reasons for using XML are fading away. Fast. XML is more difficult to deal with.

If you really feel compelled to use XML, do the rest of us a favor and make your webservice support both formats. Tip: Build a RESTful webservice and allow for a .json extension in your url in addition to .xml.

Props to Chris Cowan for helping me to see the light, and Douglas Crockford for evangelizing the use of Javascript.

-Nick
