I don’t know if the world really needs higher fidelity audio. I doubt it.
But it would be useful to be able to choose a version of a file that was mastered to match your listening gear. One version might be mastered for the classic low-end white earbuds that come with an iPod. The other would be mastered for good speakers in a room with reasonable acoustics.
This would be different from EQ settings customized for the listening device. EQ is only one of many tools a mastering engineer uses. For example, you might want more compression for iPod earbuds, because quiet sounds are inaudible and loud ones distort. Or you might want less reverb on earbuds, because reverb makes it harder to distinguish a fast series of staccato notes.
Choosing these settings isn’t really a job for an automated system. A human needs to drive it. That’s why people find the money to pay mastering engineers. So you can’t just build in a standard switch in the player – there would need to be two files to choose between.
I’m imagining an audio file format that allows you to switch between different masters depending on context. They would both be in the same file. It would be basically a multitrack file where the different track types were well known in advance – like “hifi” and “lofi.”
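As a rough illustration of the idea, here is a minimal Python sketch of a container that holds several masters of one song under well-known labels. Everything in it is an assumption made for illustration: the `MultiMasterFile` name, the `select` method, the fallback to "hifi", and the placeholder byte strings standing in for audio data. It is not a real file format.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MultiMasterFile:
    """Hypothetical container: one song, several masters keyed by a well-known label."""
    title: str
    masters: Dict[str, bytes] = field(default_factory=dict)

    def select(self, context: str, fallback: str = "hifi") -> bytes:
        # Return the master made for this listening context if one exists,
        # otherwise fall back to a default master (assumed here to be "hifi").
        if context in self.masters:
            return self.masters[context]
        return self.masters[fallback]


# Placeholder bytes stand in for the encoded audio of each master.
song = MultiMasterFile(
    "Example Song",
    {"hifi": b"<speaker master>", "lofi": b"<earbud master>"},
)
earbud_audio = song.select("lofi")
unknown_context_audio = song.select("car")  # no "car" master, so "hifi" is used
```

The point of the sketch is that the player only picks between masters a human already made; it does no processing of its own.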
For Music Hack Day in San Francisco this past weekend I did a hack related to my blog post on “hyperaudio notation”. My idea was to caption a recorded song using music notation, as an instantiation of ideas like hyper video, hyper audio, popcorn.js, and WebVTT.
There is a recording and a score. The recording is an MP3, the score is a PNG. The purpose of the system is to move a highlight through the score in sync with the MP3, so that the listener can see which part of the notation in the image is currently being played. It’s like text captions for a person talking.
I could have designed it to show just a portion of the overall score, but showing the entire image with a moving highlight was easier.
To move the highlight in sync with the music, you train it. Pushing a button marked “start recording” initiates a training run. The music starts, and at the same moment the page begins recording clicks within the image. Each click is recorded with its time and location. The trainer clicks in the image in sync with the music: when the first bar is played, click on the first bar in the image. Continue until you have provided music captions for as much of the song as you want. Then press “stop recording.”
At this point, press the “play recording” button to rerun the training session.
The vision is that the training would be done by the person publishing the page, and visitors would just use the “play recording” button.
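The train-then-replay loop can be sketched in a few lines. This is not the code from the hack (which ran in the browser); it is a minimal Python sketch under my own assumptions. The `HighlightTrack` class, its method names, and the sample coordinates are all hypothetical. Training stores each click as a time and a position, and playback looks up the most recent click at or before the current playback time.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# One training click: (seconds into the song, x, y) in image coordinates.
Click = Tuple[float, int, int]


class HighlightTrack:
    """Hypothetical store for the clicks made during a training run."""

    def __init__(self) -> None:
        self.clicks: List[Click] = []

    def record_click(self, t: float, x: int, y: int) -> None:
        # Training: remember when and where the trainer clicked,
        # keeping the list ordered by time.
        self.clicks.append((t, x, y))
        self.clicks.sort(key=lambda c: c[0])

    def position_at(self, t: float) -> Optional[Tuple[int, int]]:
        # Playback: find the latest click at or before time t.
        times = [c[0] for c in self.clicks]
        i = bisect_right(times, t)
        if i == 0:
            return None  # nothing has been clicked yet at this point in the song
        _, x, y = self.clicks[i - 1]
        return (x, y)


track = HighlightTrack()
track.record_click(0.0, 40, 100)   # first bar
track.record_click(2.5, 180, 100)  # second bar
track.record_click(5.0, 320, 100)  # third bar
highlight = track.position_at(3.0)  # (180, 100): still on the second bar
```

In a browser version, `position_at` would be called with the audio element's current time on each time update, and the result used to position the highlight over the score image.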
To see it in action, go to the live demo code or view a screencast I made. (The live code was super quick and dirty and assumes that you have exactly the same everything I do, including browser, bandwidth, etc. Chances that it will actually work are slim.)
I emailed Silvia Pfeiffer about the implementation status of WebVTT. When can developers use it, if not immediately? She replied that there is support (probably incomplete) in the IE10 developer build, work is underway in WebKit, and Opera’s status is unknown. She recommended that web developers who need captions right away use a polyfill such as captionator.js.
I’m reposting our conversation to make the answer Google-able for other developers looking into WebVTT, at least in the short term before this information goes stale.
What’s WebVTT? Captioning for online video. See this talk by .
Bruce Warila wrote:
Reading comments around the Internet, a common theme from legal types is that “DMCA takedown” is an adequate (legal) mechanism. Up until this morning, this generated a “huh?” response in my head. After retreating from Starbucks, it occurred to me that DMCA takedown could be just fine if automated-monitoring-and-DMCA-takedown machinery existed, then yeah sure. Give me a dashboard and charge me $10 per song / per year to fling automated takedown notices at random services that I haven’t authorized; back it up with ‘class-action’ protection; and I think someone could make a serious business out of this?
I wonder if I could stand to work on such a system. Would I be forced into a position of doing things I consider wrong? Or would I be making the world more fair?
I doubt alarm monitoring services are kept up. I think most of those signs you see out front of houses – “Protected by [BRAND X] security system” – are about systems that have fallen into disrepair.
That’s a guess, but I’m pretty confident about it.
You need to keep changing the batteries in every single component. Every entryway with a sensor and every room with a motion detector now has a battery. A house can easily have 20 new batteries to change. Each of these components is manufactured cheaply and most will need to be replaced or upgraded at some point.
The codes need to be kept secret, or they need to be changed from time to time. The safe word needs to be taught to everybody and then remembered. The kids have to be kept from hitting the panic button or arming the system using the one-click button on your keychain. And of course you need to keep paying the alarm monitoring service, for life.
What’s the alternative? Security measures that are permanent. One-shot investments.
Good quality locks. Fix the strike plate to make it hard to kick the door in. Get rid of glass windows within arm’s reach of a doorknob.
Once you make a fix like that, it keeps on repelling thieves indefinitely.
Use cases for WebRTC:
4.2. Browser-to-browser use-cases
4.2.1. Simple Video Communication Service
4.2.2. Simple Video Communication Service, NAT/FW that blocks UDP
4.2.3. Simple Video Communication Service, global service provider
4.2.4. Simple Video Communication Service, enterprise aspects
4.2.5. Simple Video Communication Service, access change
4.2.6. Simple Video Communication Service, QoS
4.2.7. Simple Video Communication Service with sharing
4.2.8. Simple video communication service with inter-operator calling
4.2.9. Hockey Game Viewer
4.2.10. Multiparty video communication
4.2.11. Multiparty on-line game with voice communication
4.2.12. Distributed Music Band
4.3. Browser - GW/Server use cases
4.3.1. Telephony terminal
4.3.2. Fedex Call
4.3.3. Video conferencing system with central server
Digging deeper on one of those:
4.2.12. Distributed Music Band
In this use-case, a music band is playing music while the members are
at different physical locations. No central server is used, instead
all streams are set up in a mesh fashion.
Discussion: This use-case was briefly discussed at the Quebec webrtc
meeting and it got support. So far the only concrete requirement
(A17) derived is that the application must be able to ask the browser
to treat the audio signal as audio (in contrast to speech). However,
the use case should be further analysed to determine other
requirements (could be e.g. on delay mic->speaker, level control of
audio signals, etc.).
p.s. That’s right, I used the <pre> and <big> tags TOGETHER. MWAHAHAHAHAHA.
When it comes to infringement control, there aren’t many competing visions.
One vision is to kill the internet. Another vision is to kill the incumbent media business. Another is to reinvent the media industry around tax revenues. SOPA, Pirate Bay, deus ex machina.
Here is my vision: allow private law to flourish. Enforce contracts. Introduce the rule of law.
DMCA notice and takedown is basically an excellent system for administering a global-scale internet music infrastructure.
There are just two problems, and both can be handled as incremental improvements.
One is fraudulent takedown requests. There are no meaningful barriers to requesting that something non-infringing be taken down. This allows incumbents to mount a denial-of-service attack on anybody who actually wants their music to be up. I don’t think these requests are usually deliberately fraudulent; I think they are accidents.
Easy fix: penalties that are big enough to cause rights holders to care.
The other problem is scaling up the takedown request machinery. For the moment the process is manual and rights holders can’t go fast enough to make a dent. They need to be able to spider newly posted content, to accurately diagnose whether it is infringing, and to generate a takedown request, all at internet scale and speed. This is technologically possible, I believe. (But I won’t document a system to do that here, because it would be tedious and not very interesting.)
Ian Rogers’s proposal
I appreciate Ian’s proposal for a global scale rights registry, but I think it is far harder than just making adjustments to DMCA notice and takedown. Ian’s strategy requires rights holders to make their catalogs available to all comers at well known prices. This sacrifices negotiating power, so they will have to be dragged to the system kicking and screaming. Notice and takedown, on the other hand, is something they’re willing to do as long as it actually works.
What’s somewhat amazing about a system based on notice and takedown is that it administers itself in a completely decentralized way. Each rights holder would have its own registry of permitted URLs. Sony Music would have sony.com in this exception list, for example. Any domain not in the exception list would get a takedown request. The exception list can be highly detailed; for example it can remember which tracks a site is permitted to host and which tracks it may not host. This is mature technology.
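To make the exception list concrete, here is a minimal Python sketch of the lookup a rights holder might run against a URL where one of its tracks was found. It is only an illustration under stated assumptions: the `ExceptionList` class, its method names, the track identifiers, and the `partner.example` and `random.example` domains are all hypothetical, and matching on the exact hostname is a simplification.

```python
from typing import Dict, Iterable, Optional, Set
from urllib.parse import urlparse


class ExceptionList:
    """Hypothetical per-rights-holder registry of where tracks may be hosted."""

    def __init__(self) -> None:
        # domain -> set of permitted track ids, or None meaning "every track".
        self.permitted: Dict[str, Optional[Set[str]]] = {}

    def allow(self, domain: str, tracks: Optional[Iterable[str]] = None) -> None:
        # With no track list, the domain may host the whole catalog.
        self.permitted[domain] = set(tracks) if tracks is not None else None

    def needs_takedown(self, url: str, track_id: str) -> bool:
        # Simplification: compare the exact hostname, ignoring subdomains.
        domain = urlparse(url).hostname or ""
        if domain not in self.permitted:
            return True  # any domain not in the exception list gets a request
        tracks = self.permitted[domain]
        return tracks is not None and track_id not in tracks


registry = ExceptionList()
registry.allow("sony.com")                     # may host everything
registry.allow("partner.example", ["track-1"])  # may host only track-1

own_site = registry.needs_takedown("https://sony.com/a.mp3", "track-9")
partner_ok = registry.needs_takedown("https://partner.example/t1.mp3", "track-1")
partner_bad = registry.needs_takedown("https://partner.example/t2.mp3", "track-2")
stranger = registry.needs_takedown("https://random.example/x.mp3", "track-1")
```

The sketch covers only the permission check. Spidering new content and deciding which track a file actually contains are separate problems that it does not attempt.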
The system I propose here can be implemented without any leaps of imagination. The details would be pretty easy, or at least practical.