9/11/2011

Where are the bodies buried on the web? Big data for journalists

 

The following post is the introduction to the free online ebook 'Where are the bodies buried on the web? Big data for journalists', published by Pete Warden in January this year. A former Apple engineer, Pete Warden is the CTO of Jetpac, the founder of OpenHeatMap.com and the author of the O'Reilly Data Source Handbook, and he writes on large-scale data processing and visualization for ReadWriteWeb.

 

There’s been a revolution in data over the last few years, driven by an astonishing drop in the price of gathering and analyzing massive amounts of information. It only cost me $120 to gather, analyze and visualize 220 million public Facebook profiles. You can use 80legs to download a million web pages for just $2.20.

 

'How to split up the US': a visualisation of the connections between places that share friends, based on 210 million public Facebook profiles

 

The technology is not just getting cheaper, it’s also getting easier to use. Companies like Extractiv and Needlebase are creating point-and-click tools for gathering data from almost any site on the web, and every other stage of the analysis process is getting radically simpler too.

What does this mean for journalists? You no longer have to be a technical specialist to find exciting, convincing and surprising data for your stories. The rest of this short guide will cover my favorite resources, along with a few examples of how they’ve been used to create compelling journalism.

Download the full free online ebook.  
