jwheare’s web of music and the Media URI spec

MP3 files can contain text, of course, and I’ve occasionally found lyrics stored inside TEXT and USLT frames. But there’s no consistency at all, and there probably never will be; you’re more likely to find spam inside a TEXT frame.

Your idea for linking to time points is a cool notion, Lucas. Related to this, Real’s servers provide for a “start” parameter on a/v URIs, allowing one to jump to a time point, e.g.

http://play.rbn.com/?url=demnow/demnow/demand/2009/dec/audio/dn20091231.ra&proto=rtsp&start=00:28:56
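To make the time-point idea concrete, here is a minimal Python sketch that pulls the "start" value out of a URI like the one above and converts it to seconds. The function name is made up for illustration, and it assumes the value is always colon-separated (hh:mm:ss or mm:ss) as in the example; I haven't checked what other forms Real's servers accept.

```python
from typing import Optional
from urllib.parse import urlsplit, parse_qs


def start_offset_seconds(uri: str) -> Optional[int]:
    """Return the 'start' query parameter as whole seconds, or None if absent.

    Assumes a colon-separated value such as 00:28:56 (illustrative only).
    """
    query = parse_qs(urlsplit(uri).query)
    values = query.get("start")
    if not values:
        return None
    seconds = 0
    # Fold hh:mm:ss (or mm:ss) left to right, base 60.
    for part in values[0].split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


uri = ("http://play.rbn.com/?url=demnow/demnow/demand/2009/dec/audio/"
       "dn20091231.ra&proto=rtsp&start=00:28:56")
print(start_offset_seconds(uri))  # 1736
```

A player could hand that number straight to its seek call, which is all a time-point link really needs.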

Some of the various SMIL specs provide begin and end params for the same purpose (http://is.gd/5I3jL). Aside from that and Real’s faded format, my hunch is that most a/v is not very content-addressable, partly because a given song can be found in the wild with many encoding variations. If I make in/out time points for lyrics on my rip of a CD track, your rip might not sync with it. Also, radio vs. album versions of a song may vary in duration and content.

Event-based synchronization, i.e. the beat-counting idea Piers brings up, might be worth looking into:

<a href="example.mp3#t=1017b,1683b" class="chorus">chorus</a>

This would need a filter to recognize beats and count them. That’s possible, just not as simple as counting time, and it might be more consistent across encodings than seconds-based offsets.
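Here is a rough Python sketch of the link-handling half of that idea. It parses the hypothetical "#t=1017b,1683b" fragment and looks up each beat index in a list of beat timestamps. The beat detection itself is not shown: beat_times stands in for whatever a real beat-tracking filter would produce, the function names are invented, and zero-based beat numbering is my assumption.

```python
import re
from typing import List, Optional, Tuple
from urllib.parse import urlsplit

# Matches the hypothetical beat-based fragment syntax, e.g. "t=1017b,1683b".
BEAT_RANGE = re.compile(r"^t=(\d+)b,(\d+)b$")


def parse_beat_fragment(uri: str) -> Optional[Tuple[int, int]]:
    """Return (first_beat, last_beat) from the URI fragment, or None."""
    match = BEAT_RANGE.match(urlsplit(uri).fragment)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def beats_to_seconds(beat_range: Tuple[int, int],
                     beat_times: List[float]) -> Tuple[float, float]:
    """Map beat indices (assumed zero-based) to timestamps in seconds."""
    first, last = beat_range
    return beat_times[first], beat_times[last]


# Stand-in for a beat detector's output: a steady 120 bpm track,
# and a second rip of it with 0.25 s of extra leading silence.
my_rip = [i * 0.5 for i in range(2000)]
your_rip = [0.25 + i * 0.5 for i in range(2000)]

beats = parse_beat_fragment("example.mp3#t=1017b,1683b")
print(beats_to_seconds(beats, my_rip))    # (508.5, 841.5)
print(beats_to_seconds(beats, your_rip))  # (508.75, 841.75)
```

The same beat numbers resolve to different second offsets in the two rips, which is the consistency a seconds-based link can’t give you. It only holds if the detector finds the same beats in both files, and that is the hard part.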

Perhaps there’s another type of common event found in audio streams that could provide consistency, but I like drum beats because they’re less likely to get corrupted or folded than high frequencies, and less common than human voice-range freqs.

The karaoke industry seems to have cracked this nut, but I’m gonna hazard a guess that it’s all proprietary.

These guys sell player software that, they claim, syncs lyrics for 1 million songs (http://is.gd/5I48w). They appear to target music teachers in their marketing.