You are bringing together a lot of different goals, and an important thing to recognize is that you’re dealing with users, content and context in a domain (music) where there are different types of users (individual and collective artists, music collectors, fans, etc.), different types of content (pop songs, albums, classical tracks, new forms, etc.), and a context (the WWW) that’s totally multifaceted. So, there are a lot of ways to answer your question.

But, to be very abridged and software-oriented, you are basically talking about software that, generically, does content and asset management. And, while the devil is in the details, a generic way to handle this is with a template system: different types of content / assets have different templates, and different types of users may use different levels of templates (e.g., minimal-data or data-rich templates).

And, ideally, people can add their own templates to meet their own needs, i.e., “open ended” gets handled as “extensible.”
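
To make the template idea a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and just for illustration: the `Template` class, the `REGISTRY` dict, and the field names (`title`, `artist`, `composer`, and so on) are my own assumptions, not part of any existing system. It shows a minimal template, and a richer one built by extending it.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Template:
    """A named set of required and optional metadata fields (hypothetical)."""
    name: str
    required: Tuple[str, ...]
    optional: Tuple[str, ...] = ()

    def validate(self, record: dict) -> List[str]:
        """Return a list of problems; an empty list means the record fits."""
        allowed = set(self.required) | set(self.optional)
        problems = [f"missing: {f}" for f in self.required if f not in record]
        problems += [f"unknown: {f}" for f in sorted(record) if f not in allowed]
        return problems

    def extend(self, name: str, required=(), optional=()) -> "Template":
        """Build a new template that adds fields on top of this one."""
        return Template(
            name,
            self.required + tuple(required),
            self.optional + tuple(optional),
        )


# Anyone can register their own template here; that is the "extensible" part.
REGISTRY: Dict[str, Template] = {}


def register(template: Template) -> Template:
    REGISTRY[template.name] = template
    return template


# A minimal-data template, and a richer one derived from it.
minimal_song = register(Template("minimal_song", ("title", "artist")))
classical_track = register(
    minimal_song.extend(
        "classical_track",
        required=("composer",),
        optional=("movement", "opus"),
    )
)
```

With this, `classical_track.validate({"title": "x", "artist": "y"})` reports the missing `composer` field, while the same record passes `minimal_song`. A real system would also want field types, versioning, and so on, but the shape is the same.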

So, for example, my music template and your music template can be different, because we have different approaches to how we want to communicate about our musics, the music forms themselves are different, and we probably also have some different ideas about how we want to web our musics.

In terms of effort, the system would ideally allow data / info entered for related bits of content / assets to be shared, e.g., you enter the song title in one place and it ends up in each file and in the annotations. Asset-oriented data like song length and file size could be automatically extracted from files into the annotations. Content-oriented data / info like licensing could be automatically copied from text annotations into the files.
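
As a rough sketch of the "enter once, extract the rest" idea, assuming (only to keep the example dependency-free) that the audio files are WAV files readable by Python's standard `wave` module. The function names `extract_asset_data` and `build_annotations` and the annotation keys are hypothetical.

```python
import os
import wave


def extract_asset_data(path: str) -> dict:
    """Pull asset-oriented facts (size, length) out of a WAV file itself."""
    with wave.open(path, "rb") as w:
        seconds = w.getnframes() / w.getframerate()
    return {
        "file_size_bytes": os.path.getsize(path),
        "length_seconds": round(seconds, 2),
    }


def build_annotations(shared: dict, paths: list) -> dict:
    """Merge fields entered once (title, license, ...) with per-file data.

    Returns one annotation dict per file path. Extracted values are applied
    last, so what the file says about itself wins over anything typed by hand.
    """
    return {path: {**shared, **extract_asset_data(path)} for path in paths}
```

So you would call something like `build_annotations({"title": "Demo", "license": "CC BY 4.0"}, paths)` once, and every file's annotation gets the shared title and license plus its own length and size. Going the other direction (writing licensing info from the annotations into the files' embedded tags) would need a format-specific tagging library, which this sketch leaves out.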