I remember when I was first told that the British Library kept a copy of every single book, newspaper and magazine published in the UK.
It seemed like a magical idea – how could one building have enough space to contain all of these copies? I’m sure I wasn’t the only awestruck kid who had mental images of teams of hard-pressed workers, building new bookshelves as fast as they could to keep up with the daily deliveries of thousands more books . . .
I doubt that those charged with maintaining our national cultural memory anticipated the challenges the internet would bring. In addition to keeping a copy of every book published, the British Library is tasked with archiving online material too. It is joined by the newly created UK Web Archive, which was set up to help prevent a digital black hole by storing copies of UK websites.
The problem is that the UK Web Archive, the British Library and other archives are faced with a quirk in the law: before adding a site to the archive, they must identify and then seek permission from each individual site’s webmaster. It’s as painful as it sounds.
What happens to the thousands of websites that are created and later deleted every day? Wouldn’t people be interested in, say, the first blog posts of future prime ministers or Nobel Prize winners? Or a record of events and festivals that would otherwise become lost in time and cyberspace?
In many respects, of course, the failure to properly archive our output isn’t new.
In the early days of TV, many programmes now considered to be of cultural significance were wiped. Tapes were expensive, so to save money the BBC and ITV would reuse them, recording over existing programmes. In this way, early episodes of Doctor Who, Hancock’s Half Hour and Dad’s Army have all disappeared.
At the time, TV shows like these were considered ephemeral, but now we appreciate their cultural significance. It looks like we’ve learnt nothing from our past mistakes – but anyone who’s been following the Digital Economy Bill debate will know that copyright law is a minefield, and it’s time the legislation was brought up to date.
Perhaps an opt-out, rather than opt-in, default is the answer? When websites are created, it wouldn’t be hard to include a metatag for webmasters who didn’t want their site to be archived – similar to the way the nofollow tag works on sites like Wikipedia to deter link spam.
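In fact, a convention along these lines already exists: the robots metatag supports a “noarchive” value, which search engines use to decide whether to keep a cached copy of a page. A minimal sketch of how a crawler might honour such an opt-out before archiving a page – assuming the archiver already has the raw HTML in hand (the function and class names here are illustrative, not any real archive’s API):

```python
from html.parser import HTMLParser

class ArchiveOptOutChecker(HTMLParser):
    """Scans a page's <meta> tags for a robots 'noarchive' directive.

    'noarchive' is an existing robots metatag value aimed at search-engine
    caches; an opt-out archiving default could reuse the same convention.
    """

    def __init__(self):
        super().__init__()
        self.opted_out = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        d = dict(attrs)
        name = (d.get("name") or "").lower()
        content = (d.get("content") or "").lower()
        if name == "robots" and "noarchive" in content:
            self.opted_out = True

def may_archive(html_text):
    """Return True unless the page carries a robots 'noarchive' metatag."""
    checker = ArchiveOptOutChecker()
    checker.feed(html_text)
    return not checker.opted_out
```

Under this scheme, the default is that pages may be archived, and a webmaster opts out with a single tag – e.g. `may_archive('<meta name="robots" content="noarchive">')` returns False, while an ordinary page returns True.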
Initiatives like Creative Commons – a licensing system that allows you to keep your copyright while letting people copy and distribute your work provided they give you credit – are great examples, and by default they encourage archiving. Sharing is in the spirit of the internet – surely that should apply to archiving too?