license claims in HTML

When you put a Creative Commons license in a web page, it usually applies to that page. For example, if you generated HTML for the Attribution-ShareAlike license using the license chooser at CreativeCommons.org and put that claim into a web page at http://example.com, it would mean that the page at http://example.com could be freely shared as long as there was attribution and the sharer applies the same license to their copy.

By using the “about” attribute specified in RDFa, you can modify that claim HTML so that it applies to a different URL and not the page in which the HTML is embedded.

Let’s say you have a media file “my.mp3” (which may or may not have embedded license info), it is online at http://example.com/my.mp3, and you have a web page at http://example.com. Let’s also say you have a chunk of HTML for saying that the current web page is under an Attribution-Sharealike license.

Your web page containing that chunk would normally have HTML along these lines:

      <html><head><title></title></head><body>
      [the HTML for the license claim]
      </body></html>
    

The modified HTML would look like this:

      <html><head><title></title></head><body>
      <div about="http://example.com/my.mp3">
      [the HTML for the license claim]
      </div>
      </body></html>
    

This is a new way to publish a license claim for a media file. The existing way is to embed the claims into the file using a tool like liblicense. The reason you would use the new method is that the benefits and drawbacks are a better match for your needs.

Pros of embedding within media files:

  1. A license claim inside a file travels with the file, so that the license claims on the copy are still identifiable. If you use the external HTML method, the only way to tell that a copy at a different URL is under the same license is to do a byte-for-byte comparison of the files.
  2. A license claim inside a media file is instantly accessible to any program which is already accessing the file and only slightly less accessible to a program which already has a copy of the file. A license claim in external HTML requires the HTML page to be found, fetched, and parsed.

Pros of using an external HTML file:

  1. A license claim embedded in a media file can only be recognized by fetching the file and parsing it. AJAX techniques usually can’t be used to parse a binary file. Bandwidth and latency limits may also prevent this. In contrast, an HTML file can be parsed by JavaScript, and is often small enough that bandwidth and latency are not a problem.
  2. A license claim inside a media file is hard for web spiders to see, and most search engines won’t index it. In contrast, a license claim in HTML is easy for a spider to see and all search engines will index it.
  3. A license claim inside a media file requires a dedicated program like liblicense on the client side to edit. A license claim in HTML can be generated using a simple web application like the license chooser at CreativeCommons.org, and any decent content management system (like Drupal or WordPress) could easily do it.

You don’t have to choose between these methods. There is no reason why these two methods can’t be used together, which would give you the good parts of both.

As with all implementation proposals, this method may not work. It may be that the RDFa “about” element isn’t widely available enough, given that it is specific to XHTML 2 as far as I know. It may be that the rel-license microformat can’t be extended like this.

There’s one improvement to this method that I don’t know how to do — making it work in existing search engines with no changes on their part. If it’s possible to tweak the HTML syntax so that existing search APIs or query arguments could be used to find Creative Commons works, the entire open media ecosystem would benefit.