I'm helping "my friend Jeff":http://jeffjlin.typepad.com/ with some of the preparations for the release of the next "Harvey Danger album":http://harveydanger.com. The current site runs on a shared hosting account with "a reputable provider":http://www.pair.com, but the band will need a big upgrade to support a promotional strategy that calls for distributing some big (50MB+) media files. This is where I come in.
To make a long story short, thanks to Moore's law and Ebbers's fraud, servers and bandwidth are pretty cheap these days. Even so, it makes sense to get the most out of what you have. In this regard, "lighttpd":http://www.lighttpd.net/ looks like it might be a better bet than Apache for serving up big files that take a while to download.
Hosting packages in the $150/month range come with allowances of as much as 2TB of data transfer, and the machines typically spec'd are, by my estimates, capable of dishing out that much data and more.
2TB would allow over 30K downloads/month, averaging almost 4Mbps and about 1 download/minute. Of course, nothing is average, and my initial assumption was that peak traffic would be 10x the average, or as much as 40Mbps and 10 downloads/minute. Further consideration made me doubt that assumption.
Normal traffic patterns on harveydanger.com show a ~10x difference between average and peak hours, but with promotion, those numbers could change quickly. A link or two on a well-read site might bring thousands of eager downloaders in the space of an hour.
Rather than trying to build (and pay) for such big spikes in demand, we decided that slow downloads were acceptable when a flood of requests overtaxed the server's 100Mbps network connection. Lots of failed downloads, on the other hand, would be completely unacceptable. People might leave empty-handed and never come back. Even worse, they might keep retrying, causing the situation to snowball further.
So, today I set out to see what we could expect from big spikes in traffic by setting up a series of tests.
I figured that as few as 50 simultaneous downloaders on a mix of DSL and cable modem connections would be enough to saturate the network connection on a server. Once that happened, the number of pending downloads would start stacking up, which would really start putting a strain on server resources. The question was: how big a strain?
For the web server, I used my home file server, a modest box with a ~500MHz processor and 512MB of memory hooked to a 100Mbps switch. The servers we’d be contracting for would come with 2x as much memory and a ~4x faster CPU, but I figured it would be a good way to get a baseline for things like memory consumption and CPU load.
To simulate the client connections, I relied on my desktop machine, an Athlon 64 with 1GB of memory hooked to the same switch.
First off, I installed "Apache2":http://httpd.apache.org/docs/2.0/ using the "mpm_worker":http://httpd.apache.org/docs/2.0/mod/worker.html module, which spawns a limited number of long-lived processes and handles connections by allocating threads from the processes in the pool. It should be the most memory-efficient way to deploy Apache.
Getting the server going was easy, since there was a pre-assembled package for "Ubuntu":http://www.ubuntulinux.org/. Getting the load-testing client nailed down wasn't as straightforward. I started out using "JMeter":http://jakarta.apache.org/jmeter/, but quickly ran into problems.
JMeter has worked well for me in the past, but the 50MB+ file I was downloading was too much for it. Even with only one or two concurrent users, it would run out of memory and crash after just a few minutes of operation. I bumped up the available memory for the JVM running JMeter, but even with 1GB allocated, it wasn't enough to keep up with even a modest load, never mind 50+ simultaneous requests.
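For anyone retracing these steps: the heap settings live in JMeter's start-up script, so the bump amounted to something like the following in jmeter.bat (the variable name and values here are from memory and may differ between versions):

bc. rem give the JMeter JVM a bigger heap (illustrative values)
set HEAP=-Xms512m -Xmx1024m

Even so, holding entire 50MB responses in memory for every simulated user adds up fast, which is presumably why it kept falling over.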
Next up, I tried compiling "Apache Flood":http://httpd.apache.org/test/flood/, but it wouldn't build properly, and I gave up. Then came "Siege":http://www.joedog.org/siege/, which compiled on Ubuntu running in a Virtual PC without too much hassle. It worked well enough to start running some tests, but then things went south. I shut the Virtual PC down in order to give it more memory, but it wouldn't boot back up again.
I decided to download "Cygwin":http://www.cygwin.com/ and try building Siege and running it directly on WinXP. After a few false starts installing missing packages, I got it to build. It even seemed to use a lot less CPU than it had running under Linux in a Virtual PC.
Sure enough, the little server was able to saturate the 100Mbps ethernet connection with 50 concurrent requests. I tried bumping the concurrency up further, but Siege started throwing all sorts of network errors and crashing. I was despairing of ever running a real load test when I realized the problem went away if I just started multiple copies of Siege and didn't let any single one of them spawn more than 50 requests. Even this had its difficulties, though. I often had to kill Siege once requests started completing, because it seemed to make Windows XP lock up and stop passing TCP/IP traffic until I killed the process.
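For reference, each Siege instance was started with something like this (the IP and filename are placeholders; -b turns off the delay between requests, -c sets the concurrency, and -t the duration):

bc. siege -b -c 50 -t 20M http://192.168.0.10/media/test-track.mp3

Stacking up additional batches of 50 was just a matter of opening more Cygwin windows and repeating the command.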
Apache did a decent job with resource utilization. With 300 concurrent requests, 25 threads per process, and 250 maximum connections, Apache had 10 processes running, each with about 600KB resident in RAM. With 200 concurrent requests limited to 150 connections, it had 6 processes of 1.5MB each. 6-10MB isn't bad at all. Even with 1000 concurrent requests you'd probably need less than 100MB for Apache, which would still leave lots of memory for caching the limited number of large media files.
CPU utilization was pretty bad though. There was no idle time and the load average was 50+. Even basic operations at the command line seemed slow.
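For the curious, the worker settings for the first of those runs looked roughly like this (Apache 2.0 directive names; the spare-thread values are just the stock defaults):

bc. <IfModule worker.c>
    StartServers         2
    ServerLimit         10
    MaxClients         250
    ThreadsPerChild     25
    MinSpareThreads     25
    MaxSpareThreads     75
    MaxRequestsPerChild  0
</IfModule>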
Next, I tried Apache2 with the prefork module, which uses a pool of processes, each of which serves only a single connection at a time. This is very similar to the way the still widely deployed Apache 1.3 works. As expected, this consumes a lot of memory. When Apache was set to allow 200 active processes and I hit it with 250 concurrent requests, it was running 840KB/process, or over 160MB total. Serving more active requests would only push that number higher. CPU utilization and load averages were similar to those with the threaded version of Apache.
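The equivalent prefork settings were along these lines (again Apache 2.0 directive names, with the spare-server numbers left at their defaults):

bc. <IfModule prefork.c>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          200
    MaxRequestsPerChild   0
</IfModule>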
From here, I turned to the specialists and went looking for HTTP servers that prioritize resource efficiency for basic file serving over flexibility and dynamic content.
First up was "thttpd":http://www.acme.com/software/thttpd/ from Acme Software. The “t” stands for (among other things) throttling: thttpd allows you to restrict the bandwidth available for serving all or some of the files on a website (a nice feature that could come in handy). I obtained the source code, compiled it, and ran it.
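I didn't make use of the throttling in these tests, but for the record, the rules go in a separate file passed to thttpd with -t, one shell-style pattern and a bytes-per-second cap per line, something like:

bc. # cap total mp3 traffic at ~4Mbps (the number is bytes/sec)
**.mp3    500000

The server then gets started with something along the lines of @thttpd -p 80 -d /var/www -t /etc/thttpd/throttle.conf@.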
The results were impressive. Memory utilization was higher than with the threaded Apache, settling in at 41MB or so for a single process, but the CPU utilization was much better: there was ~20% idle time even when serving 100-150 simultaneous requests, and the load average was never more than ~2.
The last stop was "lighttpd":http://www.lighttpd.net/, which offers much of the resource efficiency of thttpd while offering better support for dynamic content (not needed for this application, but not a liability either). I downloaded the latest release and got it compiled and running with minimal effort. I didn't pay as much attention to its behavior under lower loads, but it quickly became clear that it was well suited to high concurrency.
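For plain file serving the config barely needs anything; a minimal sketch looks something like this (the paths are placeholders, and the network-backend line is optional since lighttpd picks a sensible default on its own):

bc. server.document-root   = "/var/www/harveydanger"
server.port            = 80
# prefer the kernel sendfile() path on Linux
server.network-backend = "linux-sendfile"

Everything else was left at the defaults.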
With 250 simultaneous requests it used only 2.8MB of RAM. Load hovered at ~2 and the CPU was, as usual, pegged. Bumping up to 400 requests pushed RAM use to 4.4MB. Even with 500 users, it continued humming along, and the server remained relatively responsive.
My conclusion is that Apache with mpm_worker would probably be suitable for this application, but I’m inclined to go with lighttpd since it makes the absolute best use of memory of all the options tested.
One thing I need to investigate further is the huge hit in throughput I noticed whenever relatively heavyweight, CPU-hungry Perl-based monitoring tasks fired up.
(Comment from a friend of Jeff Lin's, repasted here)
Definitely if you’re serving static mp3s you want to use as tiny a webserver as possible. Supporting resuming downloads is good, too. Make sure that it’s compiled with (and running on a computer that supports) the sendfile API: http://www.linuxgazette.com/issue91/tranter.html . You might want to consider bittorrent (though I doubt most of your fans will understand this, the internet-savvy will and it’ll save you a huge amount of bandwidth costs).
At 50KB/sec, a 60MB album takes more than 20 minutes (once you factor in TCP overhead, etc.). This means that once your initial 250 connections are in use, everyone else is gonna have to sit there twiddling their thumbs; you'll only push out 750 copies of it in an hour.
But far worse (at least in LJ’s experience) is that there are many users who are much slower than 50k/sec, and they’re going to be tying up CPU resources and RAM as your computer spoonfeeds them the bytes at baby speeds.
This is probably more than you want to know but also fascinating: http://www.danga.com/words/2004_oscon/
Last I heard, it’s difficult (perhaps impossible?) for a PC to saturate a 100mbit connection — you can’t push (properly checksummed, etc) data at the net card fast enough. Other things (CPU, RAM) are far more likely to be the bottleneck.
Listen to Evan, he knows what he’s talking about; he TA’d our Networking class — twice.
Thanks for the comments, Evan.
My assumption is that any competitive webserver will support the sendfile API, given how long it's been around. Perhaps that's an optimistic assumption, but it appears that Apache2 and lighttpd both use it when available. I'm less sure about thttpd without looking at the source, but given its performance and the fact that it (like the others) was spending almost no CPU time in user mode, I'd guess it does as well.
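If you'd rather not trust the defaults, Apache 2.0 has an explicit knob for it (added in 2.0.44 and on by default where the OS supports it):

bc. EnableSendfile On

lighttpd's equivalent is the @server.network-backend@ setting shown in the config sketch above.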
This page by Dan Kegel (http://www.kegel.com/c10k.html) has a nice review of strategies for high-concurrency web serving, including key innovations at the OS level. It provided the initial inspiration for the author of lighttpd.
I'm not sure I understand your comment about the inability of a PC to saturate a 100mbit connection. My instrumentation might leave a bit to be desired, but I was seeing (and believing) ~99% utilization once I got into the >10 concurrent user range. Other people have reported similar results with similarly modest commodity PC hardware. Smaller files might present a problem, but I doubt we'll have anything smaller than 55MB on these servers.
My main aim in these trials was to get an idea of resource consumption under the high levels of concurrency that could result from a network-saturating surge and/or a bunch of users with slow connections. It looks to me like lighttpd is the most appropriate for the task. I did a run this evening with 1050 concurrent connections and its memory consumption was only ~20MB. It's actually got me wondering if I can find a good provider that will provision GigE for a reasonable price.
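One caveat worth noting for runs like that: the default per-process file descriptor limit on Linux is commonly 1024, so pushing past ~1000 connections may require raising it on both the server and the client boxes, e.g.:

bc. # raise the fd limit in the shell that launches the server
ulimit -n 4096

lighttpd also has its own @server.max-fds@ setting that should be kept in line with the system limit.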
I appreciate the point about slow users being able to stifle downloads even when plenty of bandwidth is available; high-concurrency support could be even more important when we aren't bandwidth-constrained on the server end.
I shared your skepticism about the value of BitTorrent for this application (beyond the PR value), and was treating it as a checklist item to be attended to after everything else is well in hand, but I'm warming to it now. My ill-founded gut feeling is that it might account for 25% of downloads.
I strongly agree on the value of HTTP resume support for this application, so that we don't take an extra bandwidth (and resource) hit for connections that fail under load. I'd not really considered it until I ran across it in someone's webserver shootout. lighttpd appears to support it.
In case it isn’t obvious, I’m leaning strongly towards lighttpd at this point.