There’s video captioning with WebVTT, and there’s a closely related vision of hyperaudio.
This made me think about a music-only application of hyperaudio and captioning – synchronizing music notation with a recorded performance. How would it differ from synchronizing lyrics with music, or a transcript with a movie?
Notation sometimes contains a complete map of the performance: everything is written out in some way, including the intro, the solos, and all the little bits. This style of notation easily falls out of sync with an actual performance, so it would fit best with performances read from that score or with scores transcribed from a performance. Even then, automatically matching the written part to the recording would be tricky, because the performance's tempo determines where each written passage falls in the recording.
Notation is often deliberately incomplete. It describes certain highlights: here’s the main melody, here’s the order of the parts, here’s the guitar solo. This kind of notation would appear as fragments inserted at multiple points in the recording.
But then there’s the issue that notation is not an internet standard. As far as the Internet as a whole is concerned, relatively open approaches like MusicXML and the Lilypond format are just as opaque as a bitmap of a scan of a handwritten score.
And to the point of WebVTT, what it synchronizes with the media is text. It’s not for arbitrary binary objects, as far as I know.
Whatever the obstacles, it’s plainly useful to sync written music with recorded performances. You might be annotating the performance to make a point to musicians. You might be illustrating the notation to make it easier to sight-read. You might be enabling web search for recordings of a melody.
This idea seemed fairly bland when I sat down to write this. Not so much at this point.
You can always use WebVTT with @kind=metadata to provide music notation in chunks (“cues”) alongside the audio file and then use JavaScript to interpret the result. If you wanted to put binary blobs into WebVTT cues, your JavaScript would need to know how to present them and how to switch between them. If they’re just PNGs, for example, you could use data URLs.
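For example, a rough sketch – the file names are placeholders, and it assumes each cue’s text is simply a data: URL for a PNG of a notation fragment:

```html
<!-- A rough sketch. "performance.mp3" and "notation.vtt" are placeholder names,
     and each cue's text is assumed to be a data: URL for a PNG notation fragment. -->
<audio id="player" src="performance.mp3" controls>
  <track id="notation" kind="metadata" src="notation.vtt">
</audio>
<img id="score" alt="Current notation fragment">

<script>
  const track = document.getElementById('notation').track;
  track.mode = 'hidden'; // load the cues and fire events without rendering them as captions
  track.addEventListener('cuechange', () => {
    const cue = track.activeCues[0]; // the cue covering the current playback time
    if (cue) {
      document.getElementById('score').src = cue.text; // cue text is the PNG data URL
    }
  });
</script>
```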
Thanks for replying, Silvia. I was thinking of emailing to ask.
@kind=metadata has good semantics here, because it fits the use case perfectly.
How would you recommend linking to a png that’s too big to fit comfortably in a data url?
I recall being awed by the rendering in http://0xfe.blogspot.com/2010/05/music-notation-with-html5-canvas.html , but a little dismayed at the prospect of yet another markup for notation. I am somewhat sold on either the basic approach to markup taken by Lilypond or plain old JSON.
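To make the question concrete, here’s roughly what I’m imagining – made-up file names and a hypothetical JSON shape for the cue text, so the cue stays small and the browser only fetches the full-size image when its cue becomes active:

```js
// Hypothetical cue in notation.vtt: the text is a small JSON object that points
// at an externally hosted PNG instead of inlining it as a data URL.
//
//   00:00:12.000 --> 00:00:27.500
//   {"png": "https://example.org/scores/solo-fragment.png", "label": "guitar solo"}

const track = document.getElementById('notation').track; // same metadata track as above
track.mode = 'hidden';
track.addEventListener('cuechange', () => {
  const cue = track.activeCues[0];
  if (!cue) return;
  const payload = JSON.parse(cue.text); // cue text is plain text, so JSON travels fine
  const img = document.getElementById('score');
  img.src = payload.png;                // fetched on demand; no data URL size limit
  img.alt = payload.label;
});
```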
@kind=metadata ->
http://dev.w3.org/html5/webvtt/#webvtt-metadata-text
This might help, too: http://0xfe.blogspot.com/2010/05/music-notation-with-html5-canvas.html
This makes me muse, a bit off topic, about the beats per minute inherent in visual images, and how that is analogous to but unlike frames per second.
I don’t follow, gurdonark. Can you say more?