Open Knowledge Festival 2014

This is a guest blog post by Jonas Öberg, CEO/Founder of Commons Machinery and Shuttleworth Foundation Fellow. He will be leading the OKFestival session “Give credit, where credit is due”.

Take a moment to look at these two images. You may well recognize the style of Randall Munroe in the left image: it’s comic #1369 from the XKCD comic series (licensed under Creative Commons Attribution-NonCommercial 2.5, by the way). If you’re a longtime XKCD reader, you may well recognize the right image too. It is also from XKCD, comic #34, drawn by Randall Munroe with pencil on paper, then inverted and colored.

If we take either of these images out of its context and show it to a random selection of people, chances are that they’ll recognize the left image and correctly identify it as belonging to the XKCD series (or even as drawn by Randall Munroe), but they’ll stumble a bit more on identifying the flowers.

I’m showing these, or a variation of these, to audiences around the world to illustrate how, if we take images out of their context, they lose some of their meaning and value. The flowers mean something more to us when we know they were drawn by Randall Munroe and are part of the XKCD series. And the same is true for any digital work, be it an image, a music composition, a text or a computer program.

“Credit where credit is due!” – Unknown

I believe that getting credit for the work you do, and the work you share with others, is critical in today’s society: it helps build reputation and connections, and it creates value and meaning. For any creator publishing their works online, being able to convey information beyond the work itself is important for conveying its meaning. An image is worth less without knowing who created it. Any XKCD comic loses part of its meaning without the alt text.

This photograph, by itself, can mean a lot of things to different people, but by associating it with the right context, we can say that it depicts flight test barrels used to test aircraft under different centers of gravity and not proof of chemtrails. It was taken and uploaded to Wikipedia in 2011 by Olivier Cleynen and licensed under Creative Commons Attribution-ShareAlike 3.0.

Despite the prevalence of EXIF and IPTC metadata fields for images (and similarly for other media types), the sad state of affairs today is that most or all of such metadata fields are stripped away, on purpose or by accident, when images are passed around the Internet. The Embedded Metadata Manifesto did a test in March 2013 showing that Facebook, Flickr (free accounts), Instagram, Twitter, Twitpic and other services strip away both EXIF and IPTC metadata fields for images that pass through their systems.
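To make "stripped away" concrete, here is a minimal sketch of how one might check which metadata containers survive in a JPEG after it has passed through a service. It is not the Embedded Metadata Manifesto's test procedure, only an illustration under simplifying assumptions: it looks at the JPEG header segments, treats an APP1 segment starting with "Exif" as EXIF, and treats an APP13 "Photoshop 3.0" segment as the place where IPTC data would normally live. The function and helper names are made up for this example.

```python
import struct

def embedded_metadata_kinds(jpeg_bytes):
    """Return the set of metadata containers found in a JPEG's header segments."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    found = set()
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # malformed data; this sketch simply stops scanning
        marker = jpeg_bytes[pos + 1]
        if marker in (0xDA, 0xD9):
            break  # start of scan or end of image: no more header segments
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            found.add("EXIF")
        elif marker == 0xE1 and payload.startswith(b"http://ns.adobe.com/xap/1.0/\x00"):
            found.add("XMP")
        elif marker == 0xED and payload.startswith(b"Photoshop 3.0\x00"):
            found.add("IPTC")  # assumption: the Photoshop block carries IPTC
        pos += 2 + length
    return found

def make_segment(marker, payload):
    """Hypothetical helper that builds one JPEG segment for experiments."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload
```

Running the check on the same image before upload and after download shows what a service removed: any kind present in the first set but missing from the second was stripped along the way.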

Even services which retain such information, like Pinterest and Tumblr, never do anything with it. They never display it and never act on it. Not only does this lead to lost information, it also leads to practical problems, especially in the context of open licensing, where we need to convey information about license terms as well as the information necessary to correctly attribute the creator (a requirement of all Creative Commons licenses).
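As an illustration of what "the information necessary to correctly attribute the creator" could look like in practice, the following sketch turns a small metadata record into a one-line credit. The record fields and the wording of the credit are assumptions made for this example, not a format defined by Creative Commons or used by any particular tool.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WorkMetadata:
    """Hypothetical record of what we know about a work; any field may be missing."""
    title: Optional[str] = None
    creator: Optional[str] = None
    source_url: Optional[str] = None
    license_name: Optional[str] = None
    license_url: Optional[str] = None

def attribution_line(meta):
    """Build a human-readable credit from whatever metadata is available."""
    text = f'"{meta.title}"' if meta.title else "Untitled work"
    text += f" by {meta.creator}" if meta.creator else " by unknown creator"
    if meta.source_url:
        text += f" ({meta.source_url})"
    if meta.license_name:
        text += f", licensed under {meta.license_name}"
        if meta.license_url:
            text += f" ({meta.license_url})"
    return text

# The flight test photograph mentioned above, with only the details given in the text.
photo = WorkMetadata(
    creator="Olivier Cleynen",
    license_name="CC BY-SA 3.0",
    license_url="https://creativecommons.org/licenses/by-sa/3.0/",
)
print(attribution_line(photo))
```

The point of the sketch is the fallback behaviour: when a service has stripped the creator or the license, the credit degrades to "unknown creator" or drops the license entirely, which is exactly the loss described here.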

I firmly believe that the way we think about and value any digital work online will change if we’re given easy access to more information about each work we encounter while we’re browsing the Internet. With support from the Shuttleworth Foundation, I started Commons Machinery in 2013 with the intent of exploring just exactly how technology can help us make these connections.

One of the first approaches we looked at in Commons Machinery was to put metadata – license and author information – on the clipboard when someone copied and pasted an image from the Internet into LibreOffice. That took some work, but we developed two extensions, one for Firefox and one for LibreOffice, that enable someone to copy images from the Internet (in particular Flickr), paste them into Writer or Impress and have the attribution happen automatically. This is especially useful in Impress: just copy the images you need, reorder, remove, add, and when you’re done, do “Insert → Credits”. Like magic.

But it’s not without problems.

“A world of exhaustive, reliable metadata would be a utopia. It’s also a pipe-dream, founded on self-delusion, nerd hubris and hysterically inflated market opportunities.” – Cory Doctorow (Metacrap, 2001)

The quality of the metadata – even that which is entered by the photographers themselves! – sometimes leaves a lot to be desired. And our approach of putting this information on the clipboard only ever worked reliably on Linux-based systems. For anything else, we needed a different approach.

What we’ve been working on for the past half year is a distributed catalogue of contextual information about digital works – metadata. We’re doing three things with this catalogue right now. We’re using it:

as a sounding board for metadata passing between applications. Rather than going directly from the browser to LibreOffice (or any other application), we’re routing the metadata information through the catalogue, which can be made to work on any platform,
as a repository of metadata about digital works, that can be pulled out and displayed when you hover over an image or any other digital work online, inviting you to explore more or to just get relevant details of the creator and license,
as a way to refine metadata as we go along: by crowdsourcing metadata validation, collection and refinement, we can suppress information that turns out to be false or incorrect and highlight more accurate information.
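The third use, refining metadata through crowdsourcing, can be sketched as a small in-memory model. This is not the Commons Machinery catalogue or its API, only an illustration under assumptions: each work has an identifier, contributors make competing claims about a field, votes raise or lower a claim's score, claims below a threshold are suppressed, and the highest-scoring remaining claim is highlighted. All class and method names are hypothetical.

```python
from collections import defaultdict

class MetadataCatalogue:
    """Toy catalogue: work id -> field -> claimed value -> net vote score."""

    def __init__(self):
        self._claims = defaultdict(lambda: defaultdict(dict))

    def vote(self, work_id, field, value, delta):
        """Record a claim if it is new, then adjust its score by delta."""
        scores = self._claims[work_id][field]
        scores[value] = scores.get(value, 0) + delta

    def best(self, work_id, threshold=0):
        """Return the top-scoring claim per field, hiding claims below threshold."""
        result = {}
        for field, scores in self._claims.get(work_id, {}).items():
            trusted = {value: s for value, s in scores.items() if s >= threshold}
            if trusted:
                result[field] = max(trusted, key=trusted.get)
        return result

catalogue = MetadataCatalogue()
work = "urn:example:xkcd-flowers"  # made-up identifier for illustration
catalogue.vote(work, "creator", "Randall Munroe", +1)
catalogue.vote(work, "creator", "Randall Munroe", +1)
catalogue.vote(work, "creator", "Unknown artist", -1)
catalogue.vote(work, "license", "CC BY-NC 2.5", +1)
print(catalogue.best(work))
```

A simple net score is the crudest possible design choice; a real system would also need to weigh who is voting and where a claim came from, which is part of what makes this a work in progress.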

This is a work in progress, and you can follow along by joining us on Freenode in #commonsmachinery, looking us up on Twitter @commonsmachine or diving into our GitHub.

Or, you can come to the session “Giving credit where credit is due” at OKFestival on Thursday the 17th of July at 14.00 in room K4. We’ll meet to connect everyone who’s interested in metadata for digital works, to talk about what we’ve been doing in this space and what others are doing, and to figure out how we can help facilitate an open knowledge environment by developing tools that, among other things, help automate the process of attribution. I’m looking forward to welcoming you there!
