Take, for example, a database that stores a user’s ZIP code, gender, age and model of car. On their own, these things sound anonymous. But if the ZIP code has 20,000 people, gender narrows that down to 10,000. Age could cut it down to a few hundred, and once you add model of car, you could be looking at a handful of people. Add other characteristics, like specific browser type and computer operating system, and you may be describing just one individual.
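The narrowing described above is just repeated division, and a minimal sketch makes that concrete. Only the 20,000 starting figure and the even gender split come from the text; the "one in 50" for age and "one in 40" for car model are illustrative assumptions chosen to land on "a few hundred" and "a handful," and the function name `narrow` is hypothetical.

```python
def narrow(population, attributes):
    """Return the expected pool size after each attribute is applied in turn.

    `attributes` is a list of (name, one_in) pairs, where one_in means
    "roughly one person in this many shares the value".
    """
    sizes = [population]
    for _name, one_in in attributes:
        sizes.append(sizes[-1] / one_in)
    return sizes

# 20,000 and the gender split are from the text; the other two are assumed.
attributes = [("gender", 2), ("age", 50), ("car model", 40)]
sizes = narrow(20_000, attributes)

for (name, _), size in zip(attributes, sizes[1:]):
    print(f"after {name}: about {size:.0f} people")
```

The sketch assumes the attributes are independent of one another. In practice they are correlated (age and car model, for instance), so real pools shrink less predictably than this.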
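How many pieces of information are needed to identify an individual? In the field of re-identification science, it’s 33 “bits,” specifically “33 bits of entropy.” (Information-science researchers use “entropy” as a measure of uncertainty: each bit of information halves the number of people a description could fit.) Thirty-three bits can distinguish about 8.6 billion possibilities, which is more than the number of people on Earth.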
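To see how individual attributes add up toward that 33-bit total, here is a rough sketch using base-2 logarithms. The world population figure is an assumed round number, not something stated in the text, and the helper names `bits_needed` and `bits_revealed` are hypothetical.

```python
import math

def bits_needed(population):
    """Bits required to single out one member of a group of this size."""
    return math.log2(population)

def bits_revealed(one_in):
    """Bits learned from a fact true of roughly one person in `one_in`."""
    return math.log2(one_in)

WORLD_POPULATION = 8_000_000_000  # assumption: rough estimate, not from the text
ZIP_POPULATION = 20_000           # from the example in the text

total = bits_needed(WORLD_POPULATION)
# Knowing the ZIP code cuts the whole world down to 20,000 people.
zip_bits = bits_revealed(WORLD_POPULATION / ZIP_POPULATION)
gender_bits = bits_revealed(2)

print(f"bits to single out one person: {total:.1f}")
print(f"bits revealed by ZIP code:     {zip_bits:.1f}")
print(f"bits revealed by gender:       {gender_bits:.1f}")
print(f"bits still missing:            {total - zip_bits - gender_bits:.1f}")
```

Under these assumptions the ZIP code alone supplies more than half of the bits, which is why a handful of further mundane facts can finish the job.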
Could you consider those bits an identifier? Could you call yourself “the 22-year-old male in the 02130 ZIP code who drives a Ford Bronco”?
For some purposes, yes.
You could pick up messages left for that identifier, as long as it didn’t matter whether somebody else read them. The identifier would be accurate enough for you to find the messages that were meant for you.
Needless to say, this ID scheme is decentralized. There isn’t a provider of these identities: individuals make their own by moving to a particular ZIP code or selecting a particular car.
But could anybody remember these identifiers? I think so. The hard part is the ZIP code, but I think that’s doable.