Google has launched a new open-source project called Refine (formerly Metaweb's Freebase Gridworks), which lets users easily clean up and transform large datasets. There is nothing more painful than cleaning data at the command line - I'd even go so far as to say it's impossible to do a good job. Sorry, R. Excel [...]