making a case for portable identifiers

One followup to my post on portable identifiers for songs using XSPF’s content resolution abilities happened on J. Herskowitz’ blog. I asked whether the problem in developing interoperability between music services is technical or economic. J’s answer was:

I think it is both. Since there appears to be a need for ongoing resolver work to map to lots of catalogs, the opportunity cost of one company to do so becomes too high. Just look at Paul Lamere’s work on Spiffy (it was a great start, but he couldn’t rationalize the opportunity costs to keep it going).

As a consumer, I want it though…. I want to be able to find a playlist somewhere and then click “play” – which lets me determine which vendor fulfills it: Napster, Rhapsody, Yahoo, YouTube, free-range MP3s, etc.

Paraphrasing him, the value to users seems clear enough, but the work to enable it needs to be shared across vendors, since no one vendor benefits more than the others. It’s social value which has to be funded by everybody and nobody.

Back here at home I asked the question slightly differently: does this technology provide enough business benefit to be worth implementing? If not, what would have to be different?

Jay Fienberg came back with an answer a lot like J’s:

I think there’s a bit of a mismatch here: catalog resolution of the type described is especially beneficial and necessary in “open” multiple-catalog systems–where the goal is linking / sharing info between as many systems as possible. And, the question is being asked of people involved in furthering the goals of “closed,” single-catalog systems.

These single-catalog systems have the goal of, more or less, focusing only on incoming links, e.g., focusing on making their single catalog a more unique authority.

I think another way to look at this would be: how hard would it be for these services to expose their own unique, permanent identifiers to the public? (Not very, one would imagine.) Then, rather than these services building their own catalog resolution systems, they could make it possible for others to do so.

Similarly, Scott Kveton of MyStrands said: From the MyStrands perspective we’re simply not in the catalog resolution business. I would wager that Pandora isn’t either.

Jay’s trick of flipping the question around is insightful. Almost all online music businesses right now are in the distribution business, even if they see other functions like discovery or social connection as their main value, because they have no way to connect their discovery or social connection features with a reliable provisioning service from a third party. But provisioning is a commodity service which doesn’t give anybody an edge. They don’t want to import playlists from third parties because *that’s* where they are adding value.

Exporting playlists for others to provision, though, is a different story, and it makes much more sense from a business perspective. Let somebody else deal with provisioning. This is what it would mean for somebody like Launchcast or Pandora to publish XSPF with portable song identifiers that could be resolved by companies that specialize in provisioning.
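As a rough illustration of what such an export could look like, here is a minimal sketch that writes an XSPF playlist whose tracks carry an identifier and descriptive metadata but no location, leaving the location for a resolver to fill in. The `build_xspf` helper and the `urn:example:` identifier are made up for illustration; the element names (`playlist`, `trackList`, `track`, `identifier`, `title`, `creator`, `album`) and the namespace come from XSPF itself.

```python
import xml.etree.ElementTree as ET

XSPF_NS = "http://xspf.org/ns/0/"


def build_xspf(tracks):
    """Return an XSPF document (as a string) for a list of track dicts.

    Each dict may contain "identifier", "title", "creator" and "album".
    No <location> is written: that is left to whoever resolves the playlist.
    """
    ET.register_namespace("", XSPF_NS)
    playlist = ET.Element(f"{{{XSPF_NS}}}playlist", {"version": "1"})
    track_list = ET.SubElement(playlist, f"{{{XSPF_NS}}}trackList")
    for t in tracks:
        track = ET.SubElement(track_list, f"{{{XSPF_NS}}}track")
        for field in ("identifier", "title", "creator", "album"):
            if t.get(field):
                ET.SubElement(track, f"{{{XSPF_NS}}}{field}").text = t[field]
    return ET.tostring(playlist, encoding="unicode")


# Hypothetical identifier scheme, for illustration only.
print(build_xspf([
    {"identifier": "urn:example:track:1234",
     "title": "Some Song", "creator": "Some Artist"},
]))
```

The exporting service only has to publish what it already knows about each song; which identifier scheme to use is exactly the open question discussed in the rest of this post.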

Chris Anderson said:

The portability problem is a bit of a prisoner’s dilemma for music providers. If everyone addresses it, the benefit is great, but if only a few do, and in different ways, then the costs can outweigh the gains.

In the absence of a bottom-up revolution resulting in audio resources that can be resolved to, there has to be cooperation among audio brokers. Perhaps Imeem et al. could provide an API that takes XSPF <track/> fragments and provides a Flash widget with the appropriate content.

And Scott Kveton again:

What I would love to talk about is using something akin to Musicbrainz to be the public commons that companies like MyStrands, Pandora and others can use as a basis for playlist portability.

And that’s where internet music vendors are right now: stuck waiting for ways to cooperate without disarming unilaterally. The closest thing to cooperation is that companies are willing to export Flash widgets that can be embedded in any third party site, and the reason we’re using Flash is that it allows us to define and limit points of interoperability.

Ok, so let’s just say that the business and technical problems can be factored into separate projects. Yves has been working on the technical problem of mapping identifiers from different vendors into a unified framework:

I played a bit with such lookup algorithms (using metadata+acoustic fingerprints) when I experimented linking a Creative Commons label collection (Jamendo) and Musicbrainz – this is described here, and uses a technique close to the “similarity flooding” one in the record linkage community:

Yves’ work deals with interlinking experiences based on the Jamendo dataset, in particular equivalence mining – that is, stating that a resource in the Jamendo dataset is the same as a resource in the Musicbrainz dataset.

For example, we want to derive automatically that is the same as….

It’s a fascinating and productive investigation. I am aware of at least one private proprietary effort to do this kind of thing, but no open project, and an open project is exactly where the work has to happen (as Scott says above) for multiple vendors to become interoperable without unilateral disarmament. One immediately useful result of this work is a direct connection between the XSPF concept of content resolution and the semantic web concept of Equivalence Mining and Matching Frameworks. That connection lets music developers who know the application domain of catalog management benefit from academic research into techniques for auto-generating links between data items in different data sources.
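To give a feel for what equivalence mining means at its simplest, here is a toy sketch that pairs records from two catalogs by comparing normalized artist and title strings. It is a stand-in, not Yves’ method: it uses no acoustic fingerprints and nothing like similarity flooding, and the record layout (`id`, `artist`, `title`), the 0.85 threshold, and the sample identifiers are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """String similarity in [0, 1] after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_records(left, right, threshold=0.85):
    """Pair each record in `left` with its best-scoring record in `right`.

    Returns (left_id, right_id, score) tuples for pairs whose averaged
    artist/title similarity reaches `threshold`.
    """
    matches = []
    for l in left:
        best, best_score = None, 0.0
        for r in right:
            score = (similarity(l["artist"], r["artist"])
                     + similarity(l["title"], r["title"])) / 2
            if score > best_score:
                best, best_score = r, score
        if best is not None and best_score >= threshold:
            matches.append((l["id"], best["id"], best_score))
    return matches


# Made-up records standing in for two vendors' catalogs.
catalog_a = [{"id": "a:1", "artist": "The  Example Band", "title": "First Song"}]
catalog_b = [
    {"id": "b:9", "artist": "the example band", "title": "First song"},
    {"id": "b:10", "artist": "Someone Else", "title": "Other Tune"},
]
print(match_records(catalog_a, catalog_b))
```

Each accepted pair is in effect a statement that two identifiers name the same thing, which is the raw material a content resolver needs. Real catalogs are messy enough that string matching alone will produce both misses and false matches, which is why the research techniques matter.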

Phew. I’m done. This was a hard post to write because I had to digest all the different strands in this conversation. It took a long time to figure out what people were talking about. Still, now that I’ve done the legwork I feel like I understand the problem better than before, even if parts are still a complete mystery.

8 thoughts on “making a case for portable identifiers”

  1. J’s point captures the upshot well: it is users who stand to gain the most from interoperability. I’m excited because it seems like users are getting more and more power lately, especially with the browser data-store that Mozilla Weave will become.

    I wrote a short post outlining how Weave-like technologies could allow users to manage the playlists (as files) and delegate audio provisioning and recommendation to any services they choose.

    When users control the data, they’ll be able to interact with services on their own terms. This will change things everywhere, and the music web stands to be a shining example of what can happen to an industry when users start calling the shots.

  2. What about building on top of the provisioning work the users are already doing? As things currently stand, the most common playlist application involves a user pulling together media from all over and then putting it together into custom playlists for local listening, organization, and sharing using desktop music library software (read: iTunes). Right now these playlist applications are almost entirely offline, but what if there was an avenue into them for any enterprise that wanted to offer services on top of this existing media library?

    Picture this: iTunes resolves XSPF against a user’s local music library and provides a programmable interface for other vendors to offer additional information services keyed off the available metadata. A whole spectrum of services would bloom which are currently hobbled by the legal and economic obstacles an enterprise encounters in having to create b2b relationships with the existing rights holders: playlist sharing, universal metadata-based search, music discussion and annotation, etc.

    One unique advantage that the users have over any private business or public open source effort when provisioning is that they can travel both the lightnet of music bloggers, band sites, etc. as well as the darknet places forbidden to enterprises such as p2p networks and hand-to-hand exchange of CDs. Since the latter is where much of the most popular (and most protected) music resides (specifically, the RIAA’s releases), users always see a gaping hole in any catalog based entirely out of the lightnet. The application I’m imagining would allow them to fill that hole for themselves while still staying networked enough for enterprises to be able to provide them with additional valuable services.

    The Songbird project is the closest thing to what I’m picturing here, but its slow pace of development, confusing licensing situation, and choice of high-barrier-to-entry technologies for extension have prevented it from being usable and useful enough to really scratch this itch.

    But I do think that one of the big implications of your description of this territory is that it points out a major vulnerability in the iTunes empire: the wealth of services and value waiting to be built on top of an accessible, networked, content-resolving media player.

  3. I’m wondering if we’re close to being able to do something about this. I think the music industry is finally, ever-so-slowly, shifting course on these things and the iTunes juggernaut makes them really wary of all-of-their-eggs-in-any-one-basket kind of a solution.

    Couldn’t we just solve this ourselves? Get Pandora, MyStrands, Lucas, Chris, Greg, etc … come up with something and knock it out?

    I still think the wikipedia-ish solution is the best … make the data available via a creative commons license for commercial/non-commercial use, allow artists and users to update/add/tweak the entries as needed. You don’t start with everything, you do the basics first and go from there.

    From a sustainability perspective, I think you run it as a non-profit supported by donations, sponsorships and members. I can think of a bunch of companies that would love to have a set of catalog data they could sync to their own catalogs or better yet, just use outright.

    You wouldn’t be trying to create social content here. You’d be building a catalog that could be the basis for better/more experiences.

    I still feel like we’re cavemen banging two rocks together with respect to this stuff when there are lighters and matches laying all over the place.

    I’m in if anybody else is game.

  4. Chris, I agree that Weave is a powerful direction. I could see a whole new generation of web software building on another round of browser innovation like this. What excites me about it is that it might allow no-budget projects to scale up much further than they can right now. Google Gears is also going in that direction, though I think that its affiliation with a specific web company will make it less likely to get uptake.

  5. Greg, it strikes me that locker services like MP3Tunes are already compiling catalogs of local content. I wonder if there’s a way to bootstrap a local resolver off that metadata? If MP3Tunes has an API it would probably do the trick.

    …just looked it up and found that it is promising:

    About Songbird, they are now getting serious about implementing XSPF, but I think only for URLs and not for content resolution. To do content resolution I think that you’d need to write C++ against their API. That’s a pretty high barrier to entry, I agree.

  6. MP3Tunes search interface:

    Search results include playable URLs:

    Results are available in JSON:

    Combine that with JSPF:

    And you get a local content resolver in pure AJAX using odds and ends that are already available:
    1) Get an XSPF file with metadata fields filled in
    2) Parse it using JSPF
    3) For each track with metadata, call the MP3Tunes search API
    4) For each search result with a playable URL, fill in the corresponding track location in the parsed XSPF
    5) Activate an in-page player which can read an auto-generated playlist
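The steps above can be sketched in a few lines. This version is in Python rather than the browser-side JavaScript and JSPF the comment has in mind, and it covers steps 1 to 4 only (no player). The search call is deliberately left abstract: `search` is a hypothetical callable standing in for the locker’s search API, whose real request and response format is not shown here.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://xspf.org/ns/0/"}
XSPF = "{http://xspf.org/ns/0/}"


def resolve_playlist(xspf_text, search):
    """Fill in <location> for tracks that have metadata but no location.

    `search(creator, title)` is a hypothetical stand-in for a catalog
    search API; it should return a playable URL, or None if nothing matches.
    Returns the parsed playlist root element with locations added.
    """
    root = ET.fromstring(xspf_text)
    for track in root.iterfind(".//x:track", NS):
        if track.find("x:location", NS) is not None:
            continue  # already resolved
        creator = track.findtext("x:creator", default="", namespaces=NS)
        title = track.findtext("x:title", default="", namespaces=NS)
        if not (creator or title):
            continue  # nothing to search on
        url = search(creator, title)
        if url:
            location = ET.Element(XSPF + "location")
            location.text = url
            track.insert(0, location)
    return root


# Usage with a fake search function and a made-up URL.
doc = (
    '<playlist version="1" xmlns="http://xspf.org/ns/0/"><trackList>'
    "<track><creator>A</creator><title>T</title></track>"
    "<track><creator>B</creator><title>U</title></track>"
    "</trackList></playlist>"
)


def fake_search(creator, title):
    return "http://example.invalid/a.mp3" if (creator, title) == ("A", "T") else None


resolved = resolve_playlist(doc, fake_search)
print(ET.tostring(resolved, encoding="unicode"))
```

Tracks the search cannot find are simply left without a location, so a player can skip them or hand them to a different resolver.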
