In today’s installment of Re-identification of “Anonymized” Data is Often Too Damned Easy, Naked Security reports on a re-identification study by Latanya Sweeney and Ji Su Yoo:
…. A pair of researchers at Harvard University wondered if similar problems might exist for South Koreans, whose RRNs [Resident Registration Number] are used as database indexes in all walks of life, such as applying for a credit card, looking for a job, paying taxes, and using medical services.
In particular, anonymized medical prescription data on millions of living South Koreans is shared with IMS Health, a large multinational corporation headquartered in the United States.
Could these supposedly-anonymous records be wrangled backwards to reveal that abcdefghjiklm, say, who’s been taking protease inhibitors and reverse transcriptase inhibitors for the past decade, is, in fact, Mr Hong Gildong of Incheon?
As the researchers note, this is an issue of real concern in South Korea:
In February 2013, approximately 1,200 physicians and 900 private individuals filed a civil lawsuit against IMS Health, a large multinational corporation, for collecting the medical data and RRNs of millions of living South Koreans. IMS Health claims that it does not violate the privacy of South Koreans, in part because the RRNs are encrypted and cannot be decrypted by any reasonable means. Is this correct? Is it possible to decrypt these RRNs and associate actual RRNs with the medical information of patients to whom they belong?
So the researchers looked at available anonymized prescription data on deceased South Koreans, which they claim resembles the data supplied to IMS Health.
The results were shocking.
The data included 23,163 allegedly-anonymized RRNs, and with very little effort, the researchers de-anonymized the lot.
Actually, it’s much, much worse than that.
Read more on Naked Security.