jwheare’s web of music and the Media URI spec

Towards a web of music:

Playdar isn’t really a browser. It’s more of a search engine. But can you imagine using a web browser today that didn’t have Google built in? The idea of a browser goes hand in hand with that of a search engine; in my efforts to relate the two I may have blurred the waters.

I’d like to talk a bit more about what the web looks like today, and how we can make it friendlier to the idea of a music browser.

So far, this has led to a web of control, with content centralised in the hands of publishers. The distribution of creative works is stifled, not to ensure the protection of rights, but due to a careless muddling of formats. And this sloppy integration of multimedia into the browser ensures we’ll be mired in codec hell for years to come.

Can we do better? Can we create an ecosystem for browsing, subscribing, sharing, discussing, listing and rating the stuff of the web that’s separate from HTML? Music is an obvious area of opportunity, and we’ve already got music browsers that have escaped the gravitational pull of the browser. The problem is, they’re still locked into the publisher’s web of control. The iTunes Store and Spotify represent bold new ways to access a wealth of music, but they’re essentially blind to a world of sound outside their borders.

(via James Wheare’s Blog | jouire.com)

Audio itself isn’t hypertext. I wonder what it would be like to have music audio be hypertext. Would you click on a Led Zeppelin lyric to go to the blues tune that it came from? Maybe you could navigate from a chord progression in one song to the same chord progression in other songs.

When you share a set of songs by putting them all in the same MP3, you can’t address them individually. When you use a playlist, the playlist is a hypertext container for each song.
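To make the playlist-as-container point concrete, here is a minimal sketch that writes a playlist in which every song keeps its own URL. It assumes XSPF as the playlist format (the post doesn't name one), and the function name and example URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# XSPF version 1 namespace.
XSPF_NS = "http://xspf.org/ns/0/"


def build_playlist(tracks):
    """Return an XSPF document (as a string) with one <track> per (title, url) pair."""
    # Serialize XSPF as the default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", XSPF_NS)
    playlist = ET.Element(f"{{{XSPF_NS}}}playlist", {"version": "1"})
    track_list = ET.SubElement(playlist, f"{{{XSPF_NS}}}trackList")
    for title, url in tracks:
        track = ET.SubElement(track_list, f"{{{XSPF_NS}}}track")
        # Each song is a separate link, so it stays individually addressable.
        ET.SubElement(track, f"{{{XSPF_NS}}}location").text = url
        ET.SubElement(track, f"{{{XSPF_NS}}}title").text = title
    return ET.tostring(playlist, encoding="unicode")


if __name__ == "__main__":
    print(build_playlist([
        ("Song one", "http://example.com/one.mp3"),  # hypothetical URLs
        ("Song two", "http://example.com/two.mp3"),
    ]))
```

Unlike one big MP3, any single `location` here can be linked to, swapped out, or reused in another playlist.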

And then there’s Media Fragments URI 1.0, which is a specification for pointing inside of multimedia files. That’s full-fledged music hypertext, even though it addresses spans of audio data rather than musical units like songs.
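As a rough sketch of what pointing inside a file means in practice, the function below pulls the start and end of a temporal fragment (`#t=start,end`) out of a URL. It handles only plain seconds, with an optional `npt:` prefix, and tolerates a trailing `s` on each number; the clock formats the spec also defines (`mm:ss`, `hh:mm:ss`) are left out. The function names are mine, not the spec's.

```python
from urllib.parse import urlsplit


def _seconds(text):
    # Plain seconds such as "15" or "15.5"; a trailing "s" is tolerated.
    return float(text.rstrip("s"))


def parse_time_fragment(url):
    """Return (start, end) in seconds for a '#t=start,end' fragment, else None.

    A missing start defaults to 0.0; a missing end is returned as None,
    meaning "until the end of the media".
    """
    fragment = urlsplit(url).fragment
    for part in fragment.split("&"):
        name, _, value = part.partition("=")
        if name != "t":
            continue
        if value.startswith("npt:"):
            value = value[len("npt:"):]
        start_text, _, end_text = value.partition(",")
        start = _seconds(start_text) if start_text else 0.0
        end = _seconds(end_text) if end_text else None
        return start, end
    return None


if __name__ == "__main__":
    print(parse_time_fragment("example.mp3#t=16,45"))
```

A player that understood this could seek to `start` and stop at `end` instead of playing the whole file.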

Which makes me speculate about metafiles that attach semantics to audio files. The metafile would link to ranges within an audio file in a way that communicates high-level concepts. Like:

<div class="song-meta-map">
<a href="example.mp3#t=0s,15s" class="intro">intro</a>
<a href="example.mp3#t=16s,45s" class="verse">first verse</a>
<a href="example.mp3#t=46s,75s" class="verse">second verse</a>
<a href="example.mp3#t=76s,106s" class="chorus">chorus</a>
<a href="example.mp3#t=107s,137s" class="solo guitar">guitar solo</a>
<a href="example.mp3#t=138s,178s" class="chorus">chorus</a>
<a href="example.mp3#t=179s,210s" class="outro">outro</a>
</div>
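Here is a sketch of how a client might read a metafile like that one. It assumes the markup shown above (links whose fragment is `t=start,end` in seconds, with the section type in `class`); the class and function names are hypothetical, and nothing here is a standard.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class SongMapParser(HTMLParser):
    """Collect one record per <a> whose href carries a '#t=start,end' fragment."""

    def __init__(self):
        super().__init__()
        self.sections = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        url = urlsplit(attributes.get("href") or "")
        if not url.fragment.startswith("t="):
            return
        start, _, end = url.fragment[2:].partition(",")
        self._current = {
            "file": url.path,
            "start": float(start.rstrip("s")) if start else 0.0,
            "end": float(end.rstrip("s")) if end else None,
            # class="solo guitar" becomes ["solo", "guitar"]
            "kinds": (attributes.get("class") or "").split(),
            "label": "",
        }

    def handle_data(self, data):
        # Text inside the link is the human-readable section name.
        if self._current is not None:
            self._current["label"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.sections.append(self._current)
            self._current = None


def find_sections(html, kind):
    """Return every mapped section whose class list includes `kind`."""
    parser = SongMapParser()
    parser.feed(html)
    parser.close()
    return [s for s in parser.sections if kind in s["kinds"]]
```

With the map above, `find_sections(html, "chorus")` would return both chorus ranges, which is the kind of query ("play me just the choruses") that a flat audio file can't answer.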

My Web of Songs deck differs from jwheare’s concept in that it doesn’t conceive of audio-specific forms of hypertext, but thinks about general systemic issues preventing music from being a first-class citizen of the web.

5 thoughts on “jwheare’s web of music and the Media URI spec”

  1. When you think about it, a technological component in a media player can auto-magically beat-sync two tracks by comparing basic structure and determining BPM. Word documents used to be the bane of the structured data movement, because they trapped content in a non-structured format, but ODF and OOXML have changed that game completely, creating a new class of semi-structured data; so why not music or video?

    It’s fascinating to consider that if more artists released works under CC-NC by attribution, remix artists could provide additional value by micro-tagging individual samples within the deeper structure of their compositions – particularly if this functionality were baked into the software used to assemble the composition.

    In addition, isn’t the original theory behind Pandora based on linking chord progressions and such, or is it more general? I never really got a bead on what Pandora was actually doing.

    Also, hang the DJ and all that – I rather enjoyed beat-syncing vinyl, before the robots took over. To quote Johnny Guitar Watson, maybe we do need to strike on computers!

  2. MP3 files can contain text, of course, and I’ve occasionally found lyrics stored inside TEXT and USLT frames. But there’s no consistency at all, probably never will be – more likely to find spam inside a TEXT frame.

    Your idea for linking to time points is a cool notion, Lucas. Related to this, Real’s servers provide for a “start” parameter on a/v URIs, allowing one to jump to a time point, e.g.


    Some of the various SMIL specs provide begin and end params for the same purpose (http://is.gd/5I3jL). Aside from that and Real’s faded format, my hunch is that most a/v is not very content-addressable, partly due to the fact that a given song can be found in the wild with many encoding variations. If I make in/out time points for lyrics on my rip of a CD track, your rip might not sync with it. Also, radio vs. album versions of a song may vary in duration and content.

    Event-based synchronization, i.e. the beat-counting idea Piers brings up, might be worth looking into:

    <a href="example.mp3#t=1017b,1683b" class="chorus">chorus</a>

    This would need a filter to recognize beats and count them. Possible, just not as simple as time. Might be more consistent than seconds-based.

    Perhaps there’s another type of common event found in audio streams that could provide consistency, but I like drum beats because they’re less likely to get corrupted or folded than high frequencies, and less common than human voice-range freqs.

    The karaoke industry seems to have cracked this nut, but I’m gonna hazard a guess that it’s all proprietary.

    These guys sell player sw that syncs lyrics for 1 million songs, they claim: http://is.gd/5I48w . They appear to target music teachers in their marketing.
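
The beat-indexed fragment proposed in the second comment could be resolved to seconds roughly like this. The `b` unit is that comment's suggestion, not part of any spec, and the sketch assumes a constant tempo plus a known offset for the first beat in each rip, both of which a real beat detector would have to supply.

```python
def beats_to_seconds(beat_index, bpm, first_beat_offset=0.0):
    """Convert a beat count to seconds, assuming a constant tempo."""
    return first_beat_offset + beat_index * 60.0 / bpm


def resolve_beat_fragment(fragment, bpm, first_beat_offset=0.0):
    """Turn a hypothetical 't=1017b,1683b' fragment into (start, end) seconds."""
    name, _, value = fragment.partition("=")
    start, _, end = value.partition(",")
    if name != "t" or not (start.endswith("b") and end.endswith("b")):
        raise ValueError(f"not a beat-indexed fragment: {fragment!r}")
    return (
        beats_to_seconds(int(start[:-1]), bpm, first_beat_offset),
        beats_to_seconds(int(end[:-1]), bpm, first_beat_offset),
    )


if __name__ == "__main__":
    # Same fragment, two rips: the second has 0.25 s of extra leading silence.
    print(resolve_beat_fragment("t=1017b,1683b", bpm=120))
    print(resolve_beat_fragment("t=1017b,1683b", bpm=120, first_beat_offset=0.25))
```

The link itself stays the same across rips; only the per-file offset changes, which is where beats could be more consistent than seconds.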


  • Kev & Piers on hypermusic — Lucas Gonze’s blog
  • music web and client side remixes — Lucas Gonze’s blog
