Tools For Tracking Breaking News on Social Media

It was exciting and exhausting to try to keep up with the flood of news coming out of Iran on Flickr, YouTube, and Twitter, among others, in the wake of its disputed election. My general approach is to search by keyword, sort the results by recency, and then start refining the search by adding or excluding keywords. For example, when I was searching Flickr, I ended up excluding a long list of non-Iranian city names to filter out sympathy protests held outside Iran.
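That search, sort, and refine loop can be sketched in a few lines of Python. This is only an illustration over an in-memory list, not Flickr's actual API; the `Post` record, its fields, the `refine` function, and the sample posts are all made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    # Hypothetical record; real sites expose different fields.
    title: str
    tags: set
    posted: datetime


def refine(posts, include, exclude=()):
    """Keep posts matching every include term and no exclude term, newest first."""
    include = {t.lower() for t in include}
    exclude = {t.lower() for t in exclude}

    def words(post):
        # Match against both the tags and the words in the title.
        return {t.lower() for t in post.tags} | set(post.title.lower().split())

    kept = [
        p for p in posts
        if include <= words(p) and not (exclude & words(p))
    ]
    return sorted(kept, key=lambda p: p.posted, reverse=True)


# Made-up sample data, for illustration only.
posts = [
    Post("Protest in Tehran", {"iran", "protest"}, datetime(2009, 6, 15, 14, 0)),
    Post("Solidarity protest", {"iran", "protest", "london"}, datetime(2009, 6, 16, 9, 0)),
    Post("Rooftop at night", {"iran", "protest", "tehran"}, datetime(2009, 6, 16, 22, 0)),
]

# Start broad, then add outside-Iran city names to the exclude list as they show up.
results = refine(posts, include=["iran", "protest"], exclude=["london", "paris"])
for post in results:
    print(post.posted, post.title)
```

The exclude list is the part that grows over time: each pass through the results turns up another city name worth filtering out.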

It ended up being overwhelming, though, and I’ve since started relying on other people, like Andrew Sullivan (who is back to blogging on a wider range of subjects) and Nico Pitney, to filter signal from the noise. Still, I thought I’d take a minute to blog about some of the tools I found helpful, along with some thoughts about the challenges I faced and how to deal with them better.

In the middle of using Flickr to find photos of the first days’ protests, Flickr switched on (for a subset of users, at least) a new search results interface. Flickr’s search was already pretty nice because it allowed you to filter out specific keywords, sort by date, and even filter by a range of dates. The new results interface goes further. First off, it allows you to choose from three different image sizes for the results, which makes it easier to screen photos. It also includes a sidebar that highlights groups and photographers that might have photos relevant to your search. Flickr’s search is pretty good. The only incremental improvement I’d like is for them to make it easier to narrow, expand, or pan the date range on an existing search result set. That, and giving me more visibility into the tags in the results, would make it easier to refine my search terms.

I started using Twitter’s search for the #iranelection hashtag, but quickly got overwhelmed by retweets.  I ended up turning to Tweetmeme.  Tweetmeme aggregates links posted to Twitter and then lets you sort by relevance, # of tweets or age.  Even better, it lets you slice the results by the same criteria.  So, you can sort by relevance, but then limit the results to only show links tweeted in the past day and retweeted at least 100 times.
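To make that kind of slicing concrete, here is a rough Python sketch of the idea. It is not Tweetmeme's API or how they implement it; the `slice_links` function, the dictionary keys, the `relevance` score, and the sample links are all assumptions for the sake of the example.

```python
from datetime import datetime, timedelta


def slice_links(links, now, max_age=timedelta(days=1), min_tweets=100):
    """Filter by age and tweet count, then sort by relevance, highest first."""
    recent_and_popular = [
        link for link in links
        if now - link["first_tweeted"] <= max_age and link["tweets"] >= min_tweets
    ]
    return sorted(recent_and_popular, key=lambda link: link["relevance"], reverse=True)


# Made-up sample data; "relevance" stands in for whatever score a site computes.
now = datetime(2009, 6, 20, 12, 0)
links = [
    {"url": "http://example.com/a", "tweets": 250, "relevance": 0.70,
     "first_tweeted": datetime(2009, 6, 20, 3, 0)},
    {"url": "http://example.com/b", "tweets": 40, "relevance": 0.90,
     "first_tweeted": datetime(2009, 6, 20, 8, 0)},   # too few tweets
    {"url": "http://example.com/c", "tweets": 900, "relevance": 0.80,
     "first_tweeted": datetime(2009, 6, 17, 8, 0)},   # older than a day
    {"url": "http://example.com/d", "tweets": 120, "relevance": 0.95,
     "first_tweeted": datetime(2009, 6, 19, 18, 0)},
]

top = slice_links(links, now)
for link in top:
    print(link["url"], link["tweets"], link["relevance"])
```

The point is that the sort order and the filters are independent: you can keep sorting by relevance while tightening or loosening the age and popularity thresholds.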

I should say that while I didn’t make use of it, Twitter’s advanced search lets you limit results by time and by various other criteria that could be useful. It would be nice if these were surfaced as suggested refinements on the search results page.

I also wanted an easy way to look for photos being posted to Twitter. There are a few options, but I wasn’t too happy with any of them. I ended up using Twicsy, which had the advantage of an interface optimized for scanning photos. The downside was that it only showed photos from the past hour or so, and didn’t seem to include any ranking based on retweets. Tweetmeme lets you filter your search to images, but it presents the results using its generic UI. It shows thumbnails for some of the results, but the thumbnails are tiny.

This posting is taking longer than I wanted, so I’m going to finish with a laundry list of the issues that I’m still having, accompanied in some cases by ideas of how to fix them.

  • Images, photos, text, and links get repeated.  Often I’ll want to find the early/original expressions.
  • The sites that deal with redundancy at all (like Tweetmeme) just seem to be counting links.  This helps, but really, I think more sophisticated content analysis is needed.  Analyzing links doesn’t help when different people repost the same image to Flickr, or the same video to YouTube, iReport, etc.
  • The first appearance of a given piece of content is important.  It helps establish credibility.
  • The reputation and history of the poster are important.  For example, Flickr photos from people who’d been posting photos from Iran for the past month or more tended to be more credible, and more likely to be original, than those from people who had been in San Francisco for the past year.
  • I don’t always want the most recent information.  I’m not watching this stuff minute by minute.  Sometimes I want to check in after a day or two, so just letting me sort by recency isn’t good enough.  I need to be able to filter by day, or even by hour.
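A couple of these ideas, finding the earliest copy of a repeated piece of content and filtering by a time window, can be sketched in Python. This is a minimal illustration, not a description of how any of these sites work. The function names, the dictionary keys, and the sample items are all made up, and hashing raw bytes is a stand-in: it only catches byte-for-byte identical copies, so a resized or re-encoded image would need the more sophisticated content analysis mentioned above.

```python
import hashlib
from datetime import datetime


def earliest_copies(items):
    """Map each content fingerprint to the earliest item carrying that content."""
    first_seen = {}
    for item in items:
        # Exact hash of the raw bytes; identical content gives an identical key.
        key = hashlib.sha256(item["content"]).hexdigest()
        if key not in first_seen or item["posted"] < first_seen[key]["posted"]:
            first_seen[key] = item
    return first_seen


def in_window(items, start, end):
    """Keep items posted in the half-open interval [start, end)."""
    return [i for i in items if start <= i["posted"] < end]


# Made-up sample data: the same video and the same photo reposted across sites.
items = [
    {"site": "youtube", "poster": "reposter1", "content": b"video-bytes-1",
     "posted": datetime(2009, 6, 14, 20, 0)},
    {"site": "flickr", "poster": "tehran_local", "content": b"photo-bytes-1",
     "posted": datetime(2009, 6, 13, 17, 30)},
    {"site": "ireport", "poster": "reposter2", "content": b"video-bytes-1",
     "posted": datetime(2009, 6, 15, 9, 0)},
    {"site": "flickr", "poster": "reposter3", "content": b"photo-bytes-1",
     "posted": datetime(2009, 6, 14, 8, 0)},
    {"site": "youtube", "poster": "uploader", "content": b"video-bytes-1",
     "posted": datetime(2009, 6, 13, 21, 0)},
]

# One entry per distinct piece of content, earliest appearance first.
originals = sorted(earliest_copies(items).values(), key=lambda i: i["posted"])

# Everything posted on a single day, regardless of how recent it is now.
june_14 = in_window(items, datetime(2009, 6, 14), datetime(2009, 6, 15))

for item in originals:
    print(item["posted"], item["site"], item["poster"])
```

Because the fingerprint is computed from the content rather than the link, the same video shows up as one entry even when it was uploaded to YouTube and iReport under different URLs.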
