Treating Radio 4 output as data

David Rogers | 15:38, Tuesday, 25 August 2009

Editor's note: BBC techies have been working with their counterparts at The Guardian and elsewhere to build new sources of data - in this case data about the media appearances of our MPs - SB.

[Image: mps_appearances_600.png]

At the end of July the Guardian held an internal hackday at their offices in King's Cross. They invited two engineers from BBC Radio's A&Mi department, Chris Lowis and David Rogers. We teamed up with Leigh Dodds and Ian Davis from the Semantic Web specialists Talis to produce an 'Interactive-MP-Media-Appearance-Timeline' by mashing up data from BBC Programmes and the Guardian's website.

Before the event Talis extracted data about MPs from the Guardian's Open Platform API and converted it into a linked datastore. This store contains data about every British MP: the Guardian articles in which they have appeared, a photo, related links and other data. Talis also provide a SPARQL endpoint for searching the store and extracting data from it.
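As a rough illustration of how a SPARQL endpoint like this can be queried, here is a minimal Python sketch. The endpoint URL is a placeholder, and the use of `foaf:name` and `foaf:depiction` is an assumption about how the store might describe an MP; the post does not document the actual schema. The only thing taken from the SPARQL Protocol itself is that a query can be sent as the `query` parameter of a GET request.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URL; the real address is not given in this post.
ENDPOINT = "https://example.org/guardian-mps/sparql"


def build_mp_query(mp_name: str) -> str:
    """Return a SPARQL query for resources whose foaf:name equals mp_name."""
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    safe = mp_name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT ?mp ?photo WHERE {\n"
        # Assumed vocabulary: the store may use different properties.
        f'  ?mp foaf:name "{safe}" .\n'
        "  OPTIONAL { ?mp foaf:depiction ?photo }\n"
        "}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Prints the URL only; no network request is made.
    print(build_request_url(ENDPOINT, build_mp_query("Example MP")))
```

Fetching that URL with an HTTP client, with an `Accept` header asking for SPARQL results in JSON or XML, would return the matching rows from a real endpoint.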

Coincidentally, the BBC Programmes data is also available as a linked datastore. By crawling this data using each MP's name as the search key, we were able to extract information about the TV and radio programmes in which a given MP had appeared. We then created a second datastore by combining these two datasets and pulling in some related data from DBpedia. Using this new datastore we created a web application containing an embedded visualisation of the data.
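The combining step can be pictured with a small Python sketch. This is not the code we wrote on the day: the record layout (name, ISO date, title), the source labels and the `normalise` helper are all invented for illustration. It shows only the general idea of joining two sets of records on an MP's name and sorting them into a timeline.

```python
from collections import defaultdict


def normalise(name: str) -> str:
    """Lower-case a name and collapse whitespace so it can serve as a join key."""
    return " ".join(name.lower().split())


def build_timelines(articles, programmes):
    """Merge (name, iso_date, title) records from both sources by MP name.

    Returns a dict mapping each normalised name to a list of
    (iso_date, source, title) tuples in date order.
    """
    timelines = defaultdict(list)
    for source, records in (("guardian", articles), ("bbc", programmes)):
        for name, date, title in records:
            timelines[normalise(name)].append((date, source, title))
    # ISO 8601 date strings sort chronologically when sorted as text.
    return {name: sorted(events) for name, events in timelines.items()}


if __name__ == "__main__":
    # Made-up sample records, not real data.
    articles = [("Example MP", "2009-07-02", "Sample article")]
    programmes = [("example  mp", "2009-06-30", "Sample programme")]
    for name, events in build_timelines(articles, programmes).items():
        print(name, events)
```

Joining on a name alone is fragile: two people can share a name, and one person's name can be written several ways, so a stable identifier is a safer key wherever both datasets carry one.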

Continue to read this post and leave comments on the BBC Internet blog, where it originally appeared.
