Tuesday, November 27, 2012

5 Tips For Faster Loading Web Sites

Recently I came across Aaron Hopkins' Optimizing Page Load Time article. It explains in depth how you should optimize your web pages for a faster browsing experience. It's full of brilliant theory, but not so much practical advice for the average blogger. So here is a list of 5 things you can do to optimize your web pages without having to redo your site design or set up distribution caches around the planet. These tips helped me reduce the load time of this web site by about 70% on average.
Some of these tips require some control over your web server, like being able to edit .htaccess or even the Apache server config. Where that is the case, they are marked accordingly. The others can be done on any shared hosting plan.

Stylesheets and Javascripts

Hopkins states that many small external objects like images or stylesheets hurt your performance more than a few big objects with the same total size. This is mostly due to the overhead of multiple TCP connections, but also to the fact that the average user is connected via DSL and has a decent download rate but only limited upload bandwidth. Many browser requests for small objects therefore create a bottleneck in the upstream. And unless you serve static content from a subdomain, cookies are sent to the server with every request - even for static files.
But it gets even worse. While most browsers load up to four images in parallel, stylesheets and javascripts are not loaded in parallel: the browser waits until the first stylesheet has finished loading, then requests the second one, and so on. In my tests, Firefox didn't actually load any images before all stylesheets and javascripts were done loading. A test with a simple HTML page over a DSL connection shows that one big stylesheet of 50 KB can cut load time in half compared to five stylesheets of 10 KB each (look further down in this article for a detailed chart):
  • 5 Stylesheets (10KB each): 1100ms
  • 1 Stylesheet (50KB): 500ms
So here's my first tip: use one single stylesheet and javascript file instead of many small ones. I know one big file is harder to maintain, but once your site goes into production, you shouldn't have to change these files often anyway. Some sites like slashdot.org or digg.com reference dozens of .css and .js files from their front page. Slashdot takes about twenty seconds for me to load from a clear browser cache. Don't go this mad with your CSS unless you already have enough regular readers. For someone new to your site coming over from Google, this will be a major turn-off.
If your stylesheets are static files and you would like to keep them separate for easier maintenance, you can bundle them into one request dynamically:
<?php
# File css.php
readfile("stylesheet1.css");
readfile("stylesheet2.css");
?>
As Kevin Palms points out, you will have to set the header information manually in PHP like this:
# File css.php
header('Content-type: text/css');
# ...
Save this code as a file called something like css.php and reference it from your HTML:
<link rel="stylesheet" type="text/css" href="/css.php" />

Caching is your friend

Needs Apache module mod_expires and .htaccess
Many webmasters don't like their pages being cached because they fear losing control over their content and not being able to track statistics. They put meta statements into the head of their HTML documents that tell proxies and browsers not to cache the page. But caching works on a lower level: it would be silly to download and read a file first just to find out whether the cached copy should have been used. Proxies and browsers will try to cache your pages regardless of those meta tags. By setting up decent caching rules with HTTP headers, you can at least gain some control over it.
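If a page is generated by PHP and you want to control its lifetime from the script itself, you can send the caching headers directly; a minimal sketch (the 30-second lifetime is just an example):
<?php
# Sketch: allow browsers and proxies to cache this dynamic page for 30 seconds.
# Must run before any output is sent.
$lifetime = 30;
header('Cache-Control: max-age=' . $lifetime);
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + $lifetime) . ' GMT');
?>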
For static files, the easiest way to send these headers is Apache's mod_expires module; you need it loaded on your server for the directives below to work. If you have access to your server's config files, check that the following line in the load modules section is not commented out:
LoadModule expires_module modules/mod_expires.so
In your .htaccess or, preferably, your virtual host container, insert something like this:
ExpiresActive On
ExpiresByType text/html "access plus 30 seconds"
ExpiresByType text/css "access plus 2 weeks"
ExpiresByType text/javascript "access plus 2 weeks"
ExpiresByType image/png "access plus 1 month"
Modify this to your needs: for every file type you would like to cache, insert an extra statement. On my server, I don't want text/html to be cached for long, because those pages are dynamic anyway and I want to see how often they are requested. I use compression for these files, but we'll talk about that later.
If a file has changed, use a new file name. You can also trick browsers into thinking the URL has changed by adding a throwaway query string to static files, like this: "stylesheet.css?new". If your weblog software features some kind of HTTP header manipulation for caching, turn it off so it doesn't interfere with your settings. I wasted a lot of time trying to figure out why caching on a Textpattern website wasn't working right until I noticed that a setting called "Send Last-Modified Header" was turned on in the admin preferences. Textpattern was forcing the Last-Modified header from within PHP for pages like css.php?n=default, which resulted in unnecessary conditional GET requests. If you set a decent expiration date for your files, there's no need for conditional GETs.
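You can automate that trick in PHP by using the file's modification time as the query string, so the URL only changes when the file actually changes. A rough sketch (asset_url() is a made-up helper name, not part of any framework):
<?php
# Sketch: append the file's last modification time as a cache buster.
function asset_url($path) {
    # $path is relative to the document root, e.g. "/stylesheet.css"
    return $path . '?' . filemtime($_SERVER['DOCUMENT_ROOT'] . $path);
}
?>
<link rel="stylesheet" type="text/css" href="<?php echo asset_url('/stylesheet.css'); ?>" />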
For more information about caching, head over to Mark Nottingham's Caching Tutorial.

Compress text files

Needs Apache 2, mod_deflate, mod_headers and access to the server config
So now your static content is being cached properly, but it still needs to be downloaded at least once. And then there's your dynamic HTML. Compression can save you a huge amount of bandwidth and deliver pages faster. In the stylesheet example above, compression decreased the download size from 50 KB to about 13 KB. Here is a complete graph for the same test, run with and without compression, for an uncompressed download size of ~50 KB total:
[Chart: page load time with and without compression]
Compression will not only save bandwidth, but also make pages load faster. And because pages are served faster, your server will be able to deliver more pages in the same time. The difference can be even bigger for users with low bandwidth. Keep in mind that a slow client on dial-up from New Zealand will keep one Apache child process (or thread, depending on your MPM) busy while it is downloading the content.
To make this happen, you will need to have two Apache modules loaded:
LoadModule headers_module modules/mod_headers.so
LoadModule deflate_module modules/mod_deflate.so
The following directives don't work in .htaccess. You need to place them in your server config, e.g. in your virtual host container:
# Compress some text file types
AddOutputFilterByType DEFLATE text/html text/css text/xml application/x-javascript

# Deactivate compression for buggy browsers
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
# Set header information for proxies
Header append Vary User-Agent
The first line says which file types should be compressed. Add the uncompressed file types you host on your server. It makes no sense to compress already compressed files like images or archives; compressing them again will only increase your server's load. The next three lines exclude some known buggy browsers from the benefits of compression. The last line tells proxies not to deliver the same content to all user agents, because some of them just can't handle it.
The downside of mod_deflate is the CPU time consumption. You may have to tweak the compression level to meet your needs:
# n = 1..9, with 9 being the highest compression level. The default is 6.
DeflateCompressionLevel n
For more in depth information see the Apache 2 mod_deflate Benchmark by Frank Schoep.
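If you are on a shared host and can't enable mod_deflate at all, you can still compress the output of your own PHP scripts with PHP's output buffering; a minimal sketch (requires the zlib extension):
<?php
# Sketch: compress this script's output if the browser accepts gzip.
# Must be called before any output is sent.
ob_start('ob_gzhandler');
?>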

Compress your images the right way

Saving your images with Photoshop's "Save for Web" doesn't mean they are optimized. There is more to image optimization than saving photos as JPEG and navigation elements as PNG. Even lossless formats like PNG can often be compressed to 50% of their original size without any quality loss if you use the right software.
I have successfully used the freely available OptiPNG. OptiPNG optimizes PNG files and can also convert GIF and other formats to compressed PNG. It comes with source code for Unix or as a precompiled binary for Windows; build instructions for Unix are found in the README. It is a command line tool and can be run with wildcards to optimize all images in a given directory in one go, for example with something like optipng -o7 *.png.

Apache fine tuning

Needs access to .htaccess or the server config
If you have access to your server's configuration, there are some more settings you can play with. For one, disable host name lookups; this will save time on every request. You will no longer have the host names of clients available in CGI scripts and in the access_log, but there are better tools for that anyway when post-processing your log files. Preferably in your server config, or on a directory level in your .htaccess, turn it off like this:
HostnameLookups Off
In one last step, configure persistent connections in Apache's config. Persistent connections let a browser request multiple files over one TCP connection, reducing the overhead of TCP handshakes and speeding up requests a lot. These are the default settings with Apache 2.2:
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
The defaults are actually fine, but it's a good idea to experiment with these values under your server's regular load conditions. MaxKeepAliveRequests is the maximum number of requests served over one persistent connection. More is better, but badly behaved spambots that request many pages at once may tie up all available connections if the number is set too high.
KeepAliveTimeout is the time after which an idle persistent connection is dropped. The optimal value depends on your server load and the clickiness of your web site. If your visitors make many clicks in a short time, you can tune this value so that one visitor always clicks within one persistent connection. Note that this is only advisable if your overall traffic is rather low. If you expect high traffic peaks, don't set this value too high, or you might end up with no connections left to serve new incoming requests.
