
Google Semantic Search of Wikipedia

I was just googling and came across something I’ve never seen before. The first result for “bill gates children” was “Bill Gates — Children: 3” with an attribution to Wikipedia.

Google Semantic Search?

I couldn’t repeat the result searching for the children of other public figures, but then I tried “Bill Gates born”, and got another semantic result from Wikipedia:

google semantic search?

The born search works for lots of public figures, like Elvis and Nixon.

This really surprised me, especially since I haven’t seen anyone else talking about it. I wonder if this is a response to Microsoft’s recent acquisition of Powerset, and an answer to those wondering whether the Microsoft/Powerset combo would beat Google.

Searching Powerset for “Bill Gates Born” delivers a similar-looking result as the top item. (Update: I just noticed that Powerset’s summary result is actually from Freebase. Dumb of me not to notice, because my first thought was that Freebase was where Google was pulling its info from.)

Already Disappointed in Obama

I’m too realistic (or is it cynical?) to invest too much hope in any politician, including Barack Obama. Even so, Obama has managed to disappoint me hugely by indicating that he’ll support a version of the FISA intelligence bill that grants immunity from prosecution to the big telecom companies that cooperated with illegal government domestic wiretapping.

That the collusion of major corporations in the establishment of a police state is even an issue is the most disappointing thing of all, but 8 years of George Bush & Dick Cheney are about to come to an end. Obama is part of undoing this sick state of affairs.

Granting immunity is a step in the wrong direction. It doesn’t just perpetuate the current state of affairs, it reinforces it. It makes it harder to prosecute past illegal activity, and removes any incentive to avoid participating in it in the future.

Presumably Obama is willing to take this position because he thinks it will make him more electable, but it’s the sort of compromise that makes me think that he’ll be no champion of the Constitution if he is elected, and that he’ll be too happy to make unacceptable compromises in the service of some other goal.

I am all for compromise in politics, but I think compromising the Constitution is a line that should not be crossed. It makes me question other compromises he must make. Will he help expand the ethanol industry, and squander our topsoil and fresh water in the process? Will he support covert or military action in Iran, thereby ensuring more blood, treasure, and credibility will be squandered in the Middle East instead of applied directly to problems at home?

Good Times for Powerset, Hard Times for Hadoop?

Yahoo’s troubles and a recent Microsoft acquisition could be bad news for open source software that enables “internet-scale” computing.

Hadoop is a project to build an open source version of the infrastructure that Google uses to process data. It provides a huge filesystem that can be distributed over dozens or even thousands of computers (analogous to GFS), as well as support for processing all that data in parallel in the same way Google does when they build and update their index of the web (using MapReduce). It also provides HBase, a distributed database that is built on top of the filesystem in the manner of Google’s BigTable. Hadoop is a spin-off of the Nutch project to build an open source search engine that could index a significant portion of the web.
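For anyone who hasn’t run into MapReduce before, here is a toy, single-machine sketch of the idea in Python. It is not Hadoop’s API (Hadoop itself is Java), and the function names (map_phase, shuffle, reduce_phase, build_index) and the two sample pages are made up for illustration. It builds a tiny inverted index, which is roughly the kind of job a search engine would run at much larger scale.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Emit one (word, doc_id) pair per distinct word in a document."""
    for word in set(text.lower().split()):
        yield word, doc_id

def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(word, doc_ids):
    """Collapse all document ids seen for a word into one sorted posting list."""
    return word, sorted(doc_ids)

def build_index(docs):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = [pair for doc_id, text in docs.items()
             for pair in map_phase(doc_id, text)]
    return dict(reduce_phase(word, ids) for word, ids in shuffle(pairs).items())

# Hypothetical sample data, standing in for crawled web pages.
docs = {
    "page1": "hadoop stores big files",
    "page2": "hadoop runs big jobs",
}
index = build_index(docs)
```

In a real cluster the map calls would run in parallel on the machines that hold each chunk of the filesystem, and the grouping step would move data across the network. Here everything runs in one process, so only the shape of the computation carries over.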

Most of the work on Hadoop and HBase has been supported by Yahoo, and a lot of the recent work was supported by a semantic-search startup called Powerset. In fact, a quick look at the personnel on the project shows that it is dominated by people from those two companies.

Given that Yahoo is in turmoil and has been showing some signs of reconsidering its search business, and given that Powerset was just bought by Microsoft, which likely already has its own infrastructure for these sorts of applications, I have to wonder what will happen with Hadoop.