The Clods Attack RSSCloud

3 Replies

Earlier this summer, Dave Weiner seemed to decide that he wasn’t going to convince anyone else to provide an alternative to Twitter by making RSS more realtime, so he set out to do it himself. In late July, he described his plans on RSSCloud.org, and just a few days ago he announced the first client-side implementation in his River2 RSS aggregator and just today, Matt Mullenweg of Automattic announced that all WordPress.com hosted blogs are enabled to give realtime notifications via RSSCloud (there is also an RSSCloud plugin for self-hosted WordPress).

Predictable, this progress has just turned up the volume on the nay-sayers. Some of their criticisms are reasonable, but once again the Internet reminds me how willing people are to speak with authority that is exceeded only by their ignorance (I leave it to the reader to decide what my arrogance to ignorance ratio is). I should know better than to wade in and engage with such people, but I did it anyway after seeing some of the comments on one critical post. Having made the effort, I thought I might as well repost it on my own blog. The comment that put me over the edge was from someone calling himself “Roger” flinging a criticism of those (Dave Weiner, presumably) who’d failed to learn the lessons of the past:

Roger (not Rogers), as our self-anointed historian, could you please recount the the casualties of the first great RSS aggregator invasion that you think fell in vain?

I remember the fear. I don’t remember much though in the way of casualties. People adjusted the default retry intervals their aggregators shipped with and implemented the client side of Etag’s and proper HTTP HEAD requests, server side software followed suit. The rough edges weren’t fixed overnight, but that was fine, since RSS didn’t take off overnight, and even with Automattic’s support, RSSCloud isn’t going to either.

As for those worried about all the poor webservers just getting hammered every time an update notification goes out to the cloud, is that really an issue? I mean, event driven webservers like nginx or lighttpd can retire something like 10K requests a second on relatively modest hardware and support thousands of concurrent at a time out of a few MB of memory. Yes, that throughput is for static files, but just how often is that RSS feed changing? Even if your RSS feed is served dynamically, you can put nginx in front of apache as a reverse proxy or whatever and set up a rule to cache requests to whatever your RSS feed URL is for 1s.

As for the strain caused by delivering the notifications themselves, the same techniques that have made it possible to serve thousands of requests a second from a modest server are applicable for sending notifications, thought, at this point. Someone just has to write a cloud server that uses them. They can probably start with the software script-kiddies use to send spam 🙂

The critique of the firewall issues, etc, are the only ones that make sense to me. It seems like that needs to be turned around to use one of the “comet” techniques.

I do hope Dave will reconsider the aggregator-cloud interface. Inspiration might be found in the ideas behind “comet,” which allows clients to open a connection to a server and then allows the server to push notifications back to the client. Most of the focus in Comet is on clients running as javascript in web browsers, but the techniques for pushing information back from the server seem applicable. The overall approach avoids a lot of firewall and security issues with having a webserver running on the client as it does in the current RSSCloud proposal, and simplifies the cloud servers need to age out clients (it still has to deal with sockets that go dead).

A Tempermental Chef, or Something More APT for Configuration Management

7 Replies

If this makes sense to you, HELP, otherwise, nothing to see here, move along.

I’m trying to simplify the task of configuring and maintaining Linux servers at work and I want to build on some existing configuration management system to do so. We are using Ubuntu Linux distribution, and I was thinking of just building on the APT package management tools they’ve borrowed (among other things) from Debian, but I decided to look for something distro agnostic.

I’ve spent a lot of time and frustration the last day trying to get the server working for Chef, a new system written in Ruby. I spent time scanning their bugtracker and asking for help in their IRC group to no avail. It still doesn’t work, and I have no more idea why than when I started.

I’m really doubting my decision:

Chef has only been packaged for the bleeding edge version of Ubuntu. Um yeah, great, I really want to use beta software on my SERVERS.
The installation documemtation advises that I dowload and install Ruby Gems from a tarball becuase the version in the Ubuntu repositories isn’t to their liking. Great, I have to install extra shit by hand before I can use the software I want to use so I don’t have to install shit by hand. That’s efficient, right?
Chef relies on OpenID for authentication. Sweet! I can use my MySpace account to manage my servers! Well, I could, if only I could figure out the appropriate URL for the myspace authentication endpoint (and I was batshit insane). As for how I integrate OpenID authentication with anything else I’m using, I’m sure it will be easy and obvious what to do, in a year or two.
Oh yeah, I forgot the most important thing: It doesn’t work. At least it doesn’t work for me. I’ve installed all the prerequisites, I’ve run their “installer,” and I can even get to the login page of “chef-server” but when I actually try to log in, it falls down and goes BOOM. I get a generic error page warning me about a socket error. I tried to diagnose it myself to no avail, there wasn’t anything in the log files because…
Chef server truncates its log files willy nilly. It actually writes a fair amount of info to its log file, but you’d never know by looking at it after the fact, because after every request, it ends up as a zero-length file. Useful, huh? The trick is to ‘tail -F’ the file before restarting chef-server. This prints the output as it is written to the file, and reopens the file each time it gets truncated, which happens multiple times during the request.
- For what it is worth, I figured out what was wrong here, for some bizzarre reason, the hosts file on the machine was only readable by root, which casued lookups for localhost to fail when chef-server was trying to connect to the couchDB server.

Now, to be fair, the Chef site makes it clear in a nice green sidebar that Chef is young and a work in progress. I knew that when I started with it. I didn’t expect it to be production ready, but I thought it was far enough along to start working with. Clearly, I’m reconsidering that.

I’m also reconsidering the assumption that sent me to Chef in the first place, that it was desirable, at this point, not to take a dependancy on a specific Linux distribution by trying to build off of APT, the package distribution and management system at the heart of Debian and Ubuntu. The truth is APT is awesome. One of the reasons given for creating Chef was that Puppet, an earlier Ruby-based configuration management system choked on dependancy management. I haven’t seen that complaint about APT, not lately, in fact, that’s one of the things they love most about Debian and Ubuntu, people love it so much that say things like “I want apt to bear my children,” or words to that effect.

So, my thought is that I create my own apt repository. I’ll create derivates of the ubuntu packages I need custom versions of, and I’ll create configuration packages derived from their configuration packages whenever possible. Machine and role specific packages can be used to manage rollouts and/or I can different repository tiers for different classes of servers, in much the same way that Debian and Ubuntu have different tiers for testing, stable, unstable, etc. I’m sure I’ll run in to headaches on the way, but it least they will be headaches that other people have suffered and I can learn from their experience.

Tips on Using Ganeti to Manage a KVM based Virtual Machine Cluster on Ubunty Jaunty Jackelope 9.04

6 Replies

Update: I apologize for not updating this post. I struggled with this for quite a while before making real progress, which I’ll try to detail. A few key points:

debootstrap doesn’t install a bootloader, so even if you are using kvm, you need to specify a kernel on the parent/host and a root disk device (on in the vm) as part of the config. Make sure that the kernel matches the modules installed by debootstrap, or you’ll have lots of other problems.
The default use of virtio for the disk interface causes problems with the kvm version that ships with ubuntu. The virtual machines bios may not detect it. Specify IDE for less hassle.

I’ve hacked up the ganeti-os-debootstrap scripts to use ubuntu’s vmbuilder script to create ubuntu VMs that do have a boot loader. I need to do a little cleanup and then I’ll share my work.

————————————

We are using a number of virtual machines to support the efforts at work. We’ve been running these on VMWare server on some Linux servers for the past year, but I’m looking at moving on from there to something that is based on more open software. I wanted to share some of the reasons behind the choices I made, and how I got over some of the obstacles I encountered with my choice of Ubuntu Jaunty Jackelope (9.04) for my OS, KVM for virtualization and Ganeti to manage the virtual machines. This won’t be exhaustive, but hopefully it will help other people.

I’d been eying Ganeti, a package for managing multiple Xen or KVM virtual machines running on a cluster of hosts. I was particularly intrigued because Ganeti went so far as managing redundant storage via DRBD. Still, I took a look at Eucalyptus because it implements significant portions of the Amazon Web Services API for provisioning system instances and perisistent storage. I was even more intrigued when I discovered that they supported both S3 (a key-value store) and EBS (a block-based storage layer). I ended up choosing Ganeti though. Eucalyptus required me to configure a shared highly available storage layer, something that Ganeti largely handled for me. More importantly a limitation in some of the software they integrated to provide EBS, meant that I couldn’t run instances that used EBS volumes on the same machine that was providing the EBS storage service, which wasn’t acceptable for the small 2-4 host cluster I planned on building.

I also had the choice of the Xen or LVM hypervisors. I chose LVM because it is supposed to be better supported by Ubuntu, and, in the long run, looks like it will become the favored choice of Redhat as well since.

Installing Ganeti:

There is a version of Ganeti packaged for Ubuntu, but it is an older version that doesn’t support the features that most interest me that are only available in v2.0, so I I worked from the Ganeti 2.0 installation document. I ran into a few problems because it is skewed towards using the Xen hypervisor and Debian, while I wanted to use the KVM hypervisor on Ubuntu.

The first issue I hit was in trying to install the DRBD prerequisite. DRBD mirrors block devices over the network, providing an important piece of the fault tolerance and high availability puzzle that Ganeti builds on. Ganeti requires a more recent version of DRBD. Earlier versions of Ubuntu and Debian package this version, but Jaunty only has a package for an earlier version of DRBD. Stranger still, it has utilities for managing the more recent version. With a little digging, I found that the modules for DRBD8 are actually packaged with the server kernel. So, my first problem was no problem at all.

Initializing and Running Ganeti:

The next issue I hit is with changes Jaunty made to the default python path, and the fact that the implications of those changes hadn’t propagated everywhere they needed to go. The result is that once I installed Ganeti, I got a python import error when trying to run ‘gnt-cluster init.’ My solution was to move ‘ganeti’ from ‘site-packages’ to ‘dist-packages.’

The next problem I ran into is that I wasn’t using Xen. I knew enough the first time through not to bother creating symlinks for a Xen instance kernel, but I didn’t really know what to do instead. In trying to figure that out, I realized that I should have specified that the default hypervisor be kvm when I initialized the cluster. Even though Xen wasn’t installed, it defaulted to Xen. So, I had to destroy the cluster and initialize a new one:

gnt-cluster init --default-hypervisor=kvm myclustername

Default Kernel for New Instances:

When I first tried to create a new instance, I got this:

gnt-instance add -t plain -s1G -n vmhost3 -o debootstrap vm1.office.alki.comFailure: command execution error:

Hypervisor parameter validation failed on node vmhost3.office.alki.com: Instance kernel '/boot/vmlinuz-2.6-kvmU' not found or not a fi

# gnt-instance add -t plain -s1G -n vmhost3 -o debootstrap vm1.ournet.net

Failure: command execution error:
Hypervisor parameter validation failed on node vmhost3: Instance kernel '/boot/vmlinuz-2.6-kvmU' not found or not a file

It looks like the solution to this problem is to adapt the instructions for creating symlinks for a Xen instance kernel, and link /boot/vmlinuz-2.6-kvmU to my current server kernel. I have a feeling that I’ll be using a more stripped down kernel once I figure out how this all fits together.

OS Support Files for Creating New Instances:

The ganeti install cover installing the OS support files, but it seems the default configuration option puts the files in ‘/usr/local/share/ganeti/os,’ rather than ‘/srv/ganeti/os.’ The README that comes with the support files suggests more appropriate configuration options:

./configure --prefix=/usr --localstatedir=/var \
    --sysconfdir=/etc \
    --with-os-dir=/srv/ganeti/os
  make && make install

That seems to do the trick: gnt-os list includes debootstrap, and creating a new instance seems to work as expected and I type this, it seems to be starting up!

Connecting a Console to a Running Instance:

When I first ran ‘gnt-instance console instancename’ I got an error that /usr/bin/socat was missing. Installing it with ‘aptitude install socat’ but the console doesn’t seem responsive, and a kvm process has been using 100% of one core for about 5 minutes now.

Accessing an Instances Disks:

As part of my debugging, I wanted to try to access the disk image of the instance to see if the log files showed anything. This was a challenge in and of itself. From the Ganeti documentation, I thought that running ‘gnt-instance activate-disks instancename’ would give me the name of a device I could mount, but doing so generated an error”

mount: wrong fs type, bad option, bad superblock on /dev/mapper/xenvg-a505f631--72fe--4100--a7e5--b3efae6d8082.disk0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

A little digging and I learned that the virtual disk actually had partitions, which needed to be mapped before I could mount the partition.

# gnt-instance activate-disks vm1
vmhost3.office.alki.com:disk/0:/dev/xenvg/a505f631-72fe-4100-a7e5-b3efae6d8082.disk0
# kpartx -av /dev/xenvg/a505f631-72fe-4100-a7e5-b3efae6d8082.disk0
add map xenvg-a505f631--72fe--4100--a7e5--b3efae6d8082.disk0p1 (252:7): 0 2088449 linear /dev/xenvg/a505f631-72fe-4100-a7e5-b3efae6d8082.disk0 1
# mount -t ext3 /dev/mapper/xenvg-a505f631--72fe--4100--a7e5--b3efae6d8082.disk0p1 /mnt/kvm-image
# ls /mnt/kvm-image/
bin  boot  dev  etc  home  lib  lost+found  media  mnt  opt  proc  root  sbin  selinux  srv  sys  tmp  usr  var

From checking the log directory, it is clear that whatever is going on, it’s never getting to the point where it can write to a log file.

Hmmmm, maybe it has something to do with the fact that there is no kernel or initrd? Could that be it, maybe? Hmmm.

UPDATE: As of this writing, I still don’t have an instance running successfully. I’m going to spend a little more time trying to get it to work and then probably cut bait in use the basic vm management tools ubuntu provides.

The Ganeti community seems pretty thin. The Google group has had undealt-with spam for the last few days, and an appeal for help I posted hasn’t drawn any response. I found an IRC group on Freenode, but there are only two other people in it, and they may well be dead. It’s too bad, because it seems like cool software. I guess the other option is to try using Xen instead of KVM, and/or try using the packaged version in the universe repository.

Google’s ChromeOS Doesn’t Have to be Popular to Matter

Ditching Webfaction for a Linode VPS

1 Reply

A few months back I moved our websites off of Pair.com to Webfaction in search of better performance.

I still haven’t cancelled my account with Pair, but I’m already planning on leaving webfaction. The performance is there, the reliability is less than I’d like.

Our sites were down for something around 12 hours a few months back. I couldn’t even get to their support website to file a trouble-ticket and I never saw any explanation or even acknowledgement of the problem. More often, I’ll get a “bad gateway” error because the backend apache server running my sites failed and hadn’t restarted yet. This has hit my wife particularly hard because it keeps happening in the middle of writing posts for her blog, and she’s lost work.

So, I’m going to get a virtual private server from Linode.com. I avoided this in the first place because I didn’t want to deal with system administration, but truth is, it isn’t going to be that much work, and its going to be more expensive. The upside is that I’ll have complete control to tweak things.

Update: Shortly after posting this, someone at Webfaction saw my post and emailed me, offering to help with the bad-gateway problem. I’ve been giving it a try. It seems potentially better, but it has eats more aggressively into my memory allocation. We also went back and forth about the downtime I had at the end of may, but I still don’t have an explanation I’m satisfied with. I am satisfied though that they are going to make their support offering more robust. I’ll see how it goes. I’d already paid for a month of service at Linode, so I’ve been tinkering with getting a VPS set up to see how it performs.

Tools For Tracking Breaking News on Social Media

GeekFun

What I'm Thinking About

The Clods Attack RSSCloud

A Tempermental Chef, or Something More APT for Configuration Management

Tips on Using Ganeti to Manage a KVM based Virtual Machine Cluster on Ubunty Jaunty Jackelope 9.04

Google’s ChromeOS Doesn’t Have to be Popular to Matter

Ditching Webfaction for a Linode VPS

Tools For Tracking Breaking News on Social Media

Share this:

Share this:

Share this:

Share this:

Share this:

Share this: