Monthly Archives: September 2002

Spam Detection A refinement of

Spam Detection: A refinement of an earlier approach to filtering spam using statistical methods. Thanks to Jason for the pointer.

The original article inspired me to create a Java implementation. Since I haven’t written any of the analysis code yet, it should be easy to incorporate some of the proposed improvements.

Philosophy in re RSS 2.0

If you are the developer of a news aggregator such as NetNewsWire, Straw or Aggie, or the one built into Radio UserLand, you can upgrade your software to take advantage of a new feature in RSS 2.0, item-level globally unique identifiers, or guids. This section explains how to do that.

Now that’s what I’m talking about!
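
For what it's worth, here is a minimal sketch of how an aggregator might use guids to avoid showing the same item twice. It is in Python using only the standard library. The sample feed, the `new_items` function and the `seen` set are all made up for illustration and are not taken from any of the aggregators named above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, invented for this sketch.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.com/</link>
    <description>Sample feed for illustration</description>
    <item>
      <title>First post</title>
      <link>http://example.com/2002/09/first.html</link>
      <guid isPermaLink="true">http://example.com/2002/09/first.html</guid>
    </item>
    <item>
      <title>Second post</title>
      <guid isPermaLink="false">example-weblog-item-2</guid>
    </item>
    <item>
      <title>Post without a guid</title>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return titles of items whose guid has not been seen before.

    Items without a guid are always returned, because there is nothing
    stable to remember them by.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        guid = item.findtext("guid")
        if guid is not None:
            guid = guid.strip()
            if guid in seen_guids:
                continue  # already shown to the reader on an earlier poll
            seen_guids.add(guid)
        fresh.append(title)
    return fresh


seen = set()
first_poll = new_items(SAMPLE_FEED, seen)   # everything is new
second_poll = new_items(SAMPLE_FEED, seen)  # only the guid-less item comes back
print(first_poll)
print(second_poll)
```

The item with no guid shows up on every poll, which is exactly the repeat-post problem in feeds that lack an identifier.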

More on RSS aggregators: Perhaps

More on RSS aggregators:

Perhaps someone is doing this in a personal aggregator already, but it would be cool to be able to group together items from disparate feeds that point to the same URL.
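
A rough sketch of what that grouping might look like, in Python. The tuple layout, the `normalize` and `group_by_link` helpers and the sample data are assumptions of mine, and the URL normalization is deliberately crude.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Lower-case the scheme and host and drop any fragment.

    This is a simplistic notion of "same URL"; a real aggregator
    would probably need more care (redirects, tracking parameters, ...).
    """
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def group_by_link(items):
    """Map each normalized link to the (feed, title) pairs that point at it."""
    groups = defaultdict(list)
    for feed, title, link in items:
        if link:  # the link is optional, so skip items without one
            groups[normalize(link)].append((feed, title))
    return dict(groups)


# Hypothetical items as (feed name, item title, item link).
items = [
    ("Feed A", "Interesting story", "http://example.com/story"),
    ("Feed B", "Worth a read", "HTTP://Example.com/story#comments"),
    ("Feed C", "Something else", "http://example.org/other"),
    ("Feed C", "No link here", None),
]

groups = group_by_link(items)
# Keep only the URLs that more than one item points to.
shared = {url: refs for url, refs in groups.items() if len(refs) > 1}
print(shared)
```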

I had started a post

I had started a post about my experience playing with RSS aggregators, but then due to a horrid twist of fate, my browser managed to eat it.

It’s just as well though, because after doing a little digging, I realize that the things that bother me most about RSS aggregators result from the shortcomings in the protocol.

I can only hope that:
a) The new version that is being hashed out addresses these shortcomings.
b) The new spec is quickly adopted by those making tools to produce and process RSS feeds.

The primary annoyances:

1) RSS 0.91 does not guarantee the presence of a unique identifier for each item in the feed, especially not one that remains constant across time. Some items may include the permalink for the item in question, but others may contain no link at all, or a link to a page referred to in the item. As a result, there is no way to filter feeds and only show the latest version of a given item. This is very annoying.

Part of the appeal of an RSS aggregator is that it provides a way to streamline one’s consumption of blogs. Seeing the same post over and over because an author has made repeated updates perturbs the flow.

2) RSS 0.91 apparently doesn’t include a timestamp with items, so all that can be reported to the user is the time the aggregator picked up the item from the XML feed. Not much help if you have returned to your computer after a weekend away.
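
To make the two complaints concrete, here is a sketch of the filtering I would like to be able to do, assuming a feed that does carry a stable identifier and a timestamp per item (as the `guid` and `pubDate` elements in RSS 2.0 are meant to). The dict keys, the `latest_versions` helper and the sample items are hypothetical; none of this works on a plain RSS 0.91 feed.

```python
from email.utils import parsedate_to_datetime


def latest_versions(items):
    """Keep only the newest copy of each item, newest items first.

    Each item is a dict with the hypothetical keys "guid", "pubDate"
    (an RFC 822 style date string) and "title".
    """
    newest = {}
    for item in items:
        stamp = parsedate_to_datetime(item["pubDate"])
        current = newest.get(item["guid"])
        if current is None or stamp > current[0]:
            newest[item["guid"]] = (stamp, item)
    ordered = sorted(newest.values(), key=lambda pair: pair[0], reverse=True)
    return [item for _, item in ordered]


# Hypothetical items: "post-1" appears twice because its author updated it.
items = [
    {"guid": "post-1", "pubDate": "Sat, 07 Sep 2002 09:00:00 GMT",
     "title": "Draft thoughts"},
    {"guid": "post-2", "pubDate": "Sun, 08 Sep 2002 12:00:00 GMT",
     "title": "Another post"},
    {"guid": "post-1", "pubDate": "Mon, 09 Sep 2002 08:30:00 GMT",
     "title": "Draft thoughts (updated)"},
]

titles = [item["title"] for item in latest_versions(items)]
print(titles)
```

With both pieces of information in the feed, the repeated post collapses to its latest version and the reader sees when things were actually published, not when the aggregator happened to fetch them.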

I am sure people with more experience have their own issues. In any case, I have pretty much concluded that aggregation of RSS 0.91 feeds is probably going to annoy me rather than make my life easier.

“Seventy-eight full-color reasons to

“Seventy-eight full-color reasons to be glad you’re not in high-school any more.”
[via memepool.com]

3 men detained for hours

3 men detained for hours in Florida terror scare – timesunion.com

“I appreciate someone like her with the courage to do it,” said neighbor Eric Finch. “For anyone to sit around and joke over a cup of coffee about a couple of thousand people being killed — they should be prosecuted just for that.”

So, should it be a crime to joke about something like that in general, or only if you look like you might be an Arab? I wish the reporter had pressed Mr. Finch further.