Website Speed Optimization

Latency and the ping tool

Fact Two: Once you have bad latency you're stuck with it.

If you want to transfer a large file over your modem it might take several seconds, or even minutes. The less data you send, the less time it takes, but there's a limit. No matter how small the amount of data, for any particular network device there's always a minimum time that you can never beat. That's called the latency of the device. For a typical Ethernet connection the latency is usually about 0.3ms (milliseconds -- thousandths of a second). For a typical modem link the latency is usually about 100ms, about 300 times worse than Ethernet.

If you wanted to send ten characters over your 33kbit/sec modem link you might think the total transmission time would be:

80 bits / 33000 bits per second = 2.4ms

but it isn't. It takes 102.4ms, because of the 100ms latency introduced by the modems at each end of the link.

If you want to send a large amount of data, say 100K, then that takes about 25 seconds, and the 100ms latency isn't very noticeable. But if you want to send a smaller amount of data, say 100 bytes, then the latency is more than the transmission time.
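The arithmetic above can be sketched in a few lines (a minimal illustration of the point, not a modem model):

```python
# A sketch of the arithmetic above: total delivery time is the fixed
# latency plus the transmission (serialization) time.
def transfer_time_ms(payload_bytes, bandwidth_bps, latency_ms):
    """Total time in milliseconds to deliver a payload over one link."""
    transmission_ms = payload_bytes * 8 / bandwidth_bps * 1000
    return latency_ms + transmission_ms

# Ten characters over a 33kbit/sec modem: latency dominates (about 102.4ms).
print(transfer_time_ms(10, 33_000, 100))
# 100K over the same link: transmission dominates (about 24 seconds).
print(transfer_time_ms(100_000, 33_000, 100) / 1000)
```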

Why would you care about this? Why do small pieces of data matter? For most end-users it's the time it takes to transfer big files that annoys them, not small files, so they don't even think about latency when buying products. In fact if you look at the boxes modems come in, they proudly proclaim "14.4 kbps", "28.8 kbps" and "33.6 kbps", but they don't mention the latency anywhere. What most end-users don't know is that in the process of transferring those big files their computers have to send back and forth hundreds of little control messages, so the performance of small data packets directly affects the performance of everything else they do on the network.

Now, imagine the same scenario as before. You live in a world where the only network connection you can get to your house is a modem running over a telephone line. Your modem has a latency of 100ms, but you're doing something that needs lower latency. Maybe you're trying to do computer audio over the net. 100ms may not sound like very much, but it's enough to cause a noticeable delay and echo in voice communications, which makes conversation difficult. Maybe you're trying to play an interactive game over the net. The game only sends tiny amounts of data, but that 100ms delay is making the interactivity of the game decidedly sluggish.

What can you do about this?

Nothing.

You can compress the data, but it doesn't help. It was already small to start with, and that 100ms latency is still there.

You can get 80 phone lines in parallel, and send one single bit over each phone line, but that 100ms latency is still there.

Once you've got yourself a device with bad latency there's absolutely nothing you can do about it (except throw out the device and get something else).

Fact Three: Current consumer devices have appallingly bad latency.

A typical Ethernet card has a latency of less than 1ms. The Internet backbone as a whole also has very good latency. Here's a real-world example:

  • The distance from Stanford to Boston is 4320km.
  • The speed of light in vacuum is 300 x 10^6 m/s.
  • The speed of light in fibre is roughly 66% of the speed of light in vacuum.
  • The speed of light in fibre is 300 x 10^6 m/s * 0.66 = 200 x 10^6 m/s.
  • The one-way delay to Boston is 4320 km / 200 x 10^6 m/s = 21.6ms.
  • The round-trip time to Boston and back is 43.2ms.
  • The current ping time from Stanford to Boston over today's Internet is about 85ms:
    [cheshire@nitro]$ ping -c 1 lcs.mit.edu
    PING lcs.mit.edu (18.26.0.36): 56 data bytes
    64 bytes from 18.26.0.36: icmp_seq=0 ttl=238 time=84.5 ms
  • So: the hardware of the Internet can currently achieve within a factor of two of the speed of light.
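The back-of-the-envelope figures above can be reproduced with a short sketch, using the rounded 200 x 10^6 m/s figure for the speed of light in fibre:

```python
# Lower bound on round-trip time between two points connected by fibre,
# limited only by the speed of light (no routers, no queuing).
C_FIBRE = 200e6   # m/s, roughly 66% of the speed of light in vacuum

def min_rtt_ms(distance_km):
    """Speed-of-light lower bound on round-trip time, in milliseconds."""
    one_way_s = distance_km * 1000 / C_FIBRE
    return 2 * one_way_s * 1000

print(min_rtt_ms(4320))   # Stanford to Boston: 43.2ms at best
```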

So the Internet is doing pretty well. It may get better with time, but we know it can never beat the speed of light. In other words, that 85ms round-trip time to Boston might reduce a bit, but it's never going to beat 43ms. The speed's going to get a bit better, but it's not going to double. We're already within a factor of two of the theoretical optimum. I think that's pretty good. Not many technologies can make that claim.

Compare this with a modem. Suppose you're 18km from your ISP (Internet Service Provider). At the speed of light in fibre (or the speed of electricity in copper, which is about the same) the latency should be:

18000 m / (200 x 10^6 m/s) = 0.09ms -- roughly 0.1ms

The latency over your modem is actually over 100ms. Modems are currently operating at a level that's 1000 times worse than the speed of light. They have a long way to go before they get close to what the rest of the Internet is achieving.

Of course no modem link is ever going to have a latency of 0.1ms. I'm not expecting that. The important issue is the total end-to-end transmission delay for a packet -- the time from the moment the software makes the call to send the packet to the moment the last bit of the packet arrives at the destination and the packet is delivered to the software at the receiving end. The total end-to-end transmission delay is made up of fixed latency (including the speed-of-light propagation delay), plus the transmission time. For a 36 byte packet the transmission time is 10ms (the time it takes to send 288 bits at a rate of 28800 bits per second). When the actual transmission time is about 10ms, working to make the latency 0.1ms would be silly. All that's needed is that the latency should not be so huge that it completely overshadows the transmission time. For a modem that has a transmission rate of 28.8kb/s, a sensible latency target to aim for is about 5ms.
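As a sketch of that breakdown, the question "is the latency good enough?" is just a ratio of fixed latency to transmission time:

```python
# Transmission time: how long it takes to clock a packet's bits out
# at the link rate. Fixed latency is everything else.
def transmission_ms(payload_bytes, rate_bps):
    """Time in ms to serialize one packet at the given link rate."""
    return payload_bytes * 8 / rate_bps * 1000

tx = transmission_ms(36, 28_800)   # 10ms for a 36-byte packet
# At 100ms of fixed latency, the delay is 10x the transmission time;
# at the 5ms target, the latency no longer overshadows it.
print(tx, 100 / tx, 5 / tx)
```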

Fact Four: Making limited bandwidth go further is easy.

If you know you have limited bandwidth, there are many techniques you can use to reduce the problem.

Compression

If you know you have limited bandwidth, compression is one easy solution.

You can apply general purpose compression, such as gzip, to the data.
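As a rough illustration, Python's zlib module implements the same DEFLATE compression that gzip uses:

```python
# General-purpose compression trades CPU time for bandwidth.
import os
import zlib

# Repetitive text compresses dramatically...
text = b"the quick brown fox jumps over the lazy dog " * 100
packed = zlib.compress(text)
print(len(text), len(packed))

# ...but random (or already-compressed) data gains nothing.
noise = os.urandom(len(text))
print(len(zlib.compress(noise)) >= len(noise))
```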

Even better, you can apply data-specific compression, because that gets much higher compression ratios. For example, still pictures can be compressed with JPEG, Wavelet compression, or GIF. Moving pictures can be compressed with MPEG, Motion JPEG, Cinepak, or one of the other QuickTime codecs. Audio can be compressed with uLaw, and English text files can be compressed with dictionary-based compression algorithms.

All of these compression techniques trade off use of CPU power in exchange for lower bandwidth requirements. There's no equivalent way to trade off use of extra CPU power to make up for poor latency.

All modern modems have compression algorithms built-in. Unfortunately, having your modem do compression is nowhere near as good as having your computer do it. Your computer has a powerful, expensive, fast processor in it. Your modem has a feeble, cheap, slow processor in it. There's no way your modem can compress data as well or as quickly as your computer can. In addition, in order to compress data, your modem has to hold on to the data until it has a block that's big enough to compress effectively. That adds latency, and once added, there's no way for you to get rid of latency. Also, the modem doesn't know what kind of data you are sending, so it can't use the superior data-specific compression algorithms. In fact, since most images and sounds on Web pages are compressed already, the modem's attempts to compress the data a second time are futile, and just add more latency without giving any benefit.

This is not to say that having a modem do compression never helps. In the case where the host software at the endpoints is not very smart, and doesn't compress its data appropriately, then the modem's own compression can compensate somewhat for that deficiency and improve throughput. The point is that modem compression only helps dumb software, and it actually hurts smart software by adding extra delay. For someone planning to write dumb software this is no problem. For anyone planning to write smart software this should be a big cause for concern.

Bandwidth-conscious code

Another way to cope with limited bandwidth is to write programs that take care not to waste bandwidth.

For example, to reduce packet size, wherever possible Bolo uses bytes instead of 16-bit or 32-bit words.
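The saving is easy to see with Python's struct module (the three fields here -- x, y, direction -- are made up for illustration, not Bolo's real packet format):

```python
# Packing the same three small values as bytes versus 32-bit words.
import struct

compact = struct.pack("BBB", 200, 120, 17)   # three unsigned bytes
wide = struct.pack("III", 200, 120, 17)      # three 32-bit words
print(len(compact), len(wide))               # 3 bytes versus 12 bytes
```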

For many kinds of interactive software like games, it's not important to carry a lot of data. What's important is that when the little bits of data are delivered, they are delivered quickly. Bolo was originally developed running over serial ports at 4800 bps and could support 8 players that way. Over 28.8 modems it can barely support 2 players with acceptable response time. Why? A direct-connect serial port at 4800 bps has a transmission delay of 2ms per byte, and a latency that is also 2ms. To deliver a typical ten byte Bolo packet takes 22ms. A 28800 bps modem has transmission delay of 0.28ms per byte, but a latency of 100ms, 50 times worse than the 4800 bps serial connection. Over the 28.8 modem, it takes 103ms to deliver a ten byte packet.
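Using the same rounded per-byte figures as the paragraph above, the comparison works out like this (a sketch, not Bolo's actual networking code):

```python
# Per-packet delivery time: fixed latency plus per-byte transmission time.
def delivery_ms(payload_bytes, per_byte_ms, latency_ms):
    """Time in ms to deliver one packet over a link."""
    return latency_ms + payload_bytes * per_byte_ms

print(delivery_ms(10, 2.0, 2))     # 4800 bps serial port: 22ms
print(delivery_ms(10, 0.28, 100))  # 28.8kb/s modem: about 103ms
```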

Send less data

A third way to cope with limited bandwidth is simply to send less data.

If you don't have enough bandwidth to send high resolution pictures, you can use lower resolution.

If you don't have enough bandwidth to send colour images, you can send black and white images, or send images with dramatically reduced colour detail (which is actually what NTSC television does).

If you don't have enough bandwidth to send 30 frames per second, you can send 15fps, or 5fps, or fewer.

Of course these tradeoffs are not pleasant, but they are possible. You can either choose to pay more money to run multiple circuits in parallel for more bandwidth, or you can choose to send less data to stay within the limited bandwidth you have available.

If the latency is not good enough to meet your needs you don't have the same option. Running multiple circuits in parallel won't make your latency any better, and sending less data won't improve it either.

Caching

One of the most effective techniques throughout all areas of computer science is caching, and that is just as true in networking.

If you visit a web site, your Web browser can keep a copy of the text and images on your computer's hard disk. If you visit the same site again, all your Web browser has to do is check that the copies it has stored are up to date -- i.e. check that the copies on the Web server haven't been changed since the date and time the previous copies were downloaded and cached on the local disk.

Checking the date and time a file was last modified is a tiny request to send across the network. This kind of request is so small that the throughput of your modem makes no difference -- latency is all that matters.
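The check described here is HTTP's conditional GET: the browser sends the cached copy's date in an If-Modified-Since header, and the server answers "304 Not Modified" (with no body) when nothing has changed. A toy sketch of the server-side decision follows; revalidate is a hypothetical helper for illustration, not a real API:

```python
# Decide the response to a conditional GET by comparing HTTP dates.
from email.utils import parsedate_to_datetime

def revalidate(if_modified_since, last_modified):
    """Return the status line for a conditional GET."""
    if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(if_modified_since):
        return "304 Not Modified"   # cached copy is still current
    return "200 OK"                 # file changed: send the new copy

cached = "Mon, 01 Jan 1996 10:00:00 GMT"
print(revalidate(cached, cached))                            # still fresh
print(revalidate(cached, "Tue, 02 Jan 1996 09:00:00 GMT"))   # changed
```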

Recently companies have started providing CDROMs of entire Web sites to speed Web browsing. When browsing these Web sites, all the Web browser has to do is check the modification date of each file it accesses to make sure that the copy on the CDROM is up to date. It only has to download files that have changed since the CDROM was made. Since most of the large files on a Web site are images, and since images on a Web site change far less frequently than the HTML text files, in most cases very little data has to be transferred.

Since for the most part the Web browser is only doing small modification date queries to the Web server, the performance the user experiences is entirely dominated by the latency of the connection, and the throughput is virtually irrelevant.