When you look at data, you can look at it from one end of the telescope or the other. Holding the telescope normally, you get detailed information about an individual. If you access health records from the early part of the 20th century, you could get information about people who are still alive. Such data may be detailed enough to be personal, but it will have little importance to anyone other than the individual. If you turn the telescope around and look at aggregated data, you get a much bigger picture, one that allows you to make decisions about large groups of people. However, since detail is a hindrance at that scale, aggregating the data means that the individual can no longer be seen.
The announcement that the government intends to hand our medical records over to private companies worries some people because they feel their privacy will be invaded. Can you say that the data the health visitors gathered a century ago was an invasion of the privacy of those children? In response to the government's plans to hand NHS data to private companies, the campaign group Patient Concern said:
'This is the death of patient confidentiality. There is no guarantee that information will be anonymised. In any case, anonymised data can just as easily be re-identified.'
This is nonsense. Anonymising data is one-way: once you have removed identifying data there is no way that someone else can put it back in. The suggestion from Patient Concern is as believable as homoeopaths claiming that water has "memory"; the campaign group seems to be suggesting that patient data has "memory".
However, as I explained in my last blog, it depends on how you interpret the data. For example, a friend was an administrator in the transplant unit of a large hospital. At that time George Best needed a new liver. My friend told me that a young person died in his hospital. This person had indicated that they wanted to donate their organs, and their family had agreed too. The person was an appropriate tissue match for Best, but the family were not told this because donations are anonymous. They were, however, told that the young person's liver had been transplanted. This tiny piece of information was significant because a few hours later the media reported that Best had received a new liver. The family could easily put two and two together, and (so my friend told me) were upset that their child's liver had been transplanted into someone who would abuse it (as proved to be the case).
A small piece of information - the organ that had been transplanted - allowed the family to find out who had received the organ. While it is possible to anonymise a process, or to anonymise data, it has to be done carefully, particularly when you are handling a single patient's anonymised data. This does not mean that anonymised data can "easily be re-identified", but it does mean that sometimes people can make intelligent guesses. When data is aggregated, even more detail is lost, and even the most intelligent person cannot make guesses about the individuals involved.
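The difference between row-level and aggregated data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, field names and the helpers `aggregate` and `rare_rows` are invented for the example and do not come from any real NHS dataset or tool.

```python
from collections import Counter

# Hypothetical row-level data: names already removed, one row per patient.
rows = [
    {"region": "North", "year": 2001, "procedure": "liver transplant"},
    {"region": "North", "year": 2001, "procedure": "hip replacement"},
    {"region": "North", "year": 2001, "procedure": "hip replacement"},
    {"region": "South", "year": 2001, "procedure": "hip replacement"},
    {"region": "South", "year": 2001, "procedure": "cataract surgery"},
]


def aggregate(rows, key):
    """Count rows per value of `key`, discarding every other field."""
    return Counter(row[key] for row in rows)


def rare_rows(rows, key):
    """Return the rows whose `key` value occurs exactly once.

    These are the rows where a single detail stands out, so they are
    the ones most open to an intelligent guess.
    """
    counts = aggregate(rows, key)
    return [row for row in rows if counts[row[key]] == 1]


# Aggregated view: only totals per region survive.
by_region = aggregate(rows, "region")

# Row-level view: some rows carry a detail nobody else shares.
unique = rare_rows(rows, "procedure")

print(dict(by_region))
print(unique)
```

In the row-level view, the single liver transplant stands out in the same way the transplanted organ did for the family. The regional totals keep none of the fields that made that row distinctive.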
@NO2ID tweeted this article by the security guru Bruce Schneier (if you have the chance, read Schneier's "Beyond Fear"). In the article he reports studies that have essentially used external data to make "intelligent guesses" about the individuals in online databases.
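As a rough idea of what using external data to make an "intelligent guess" might look like, here is a minimal Python sketch. It is not taken from Schneier's article or from the studies he reports. The two datasets, the field names and the `link` function are all hypothetical, and real studies use far larger data and more careful matching.

```python
# Hypothetical anonymised records: no names, but some descriptive fields remain.
anonymised = [
    {"postcode": "AB1", "birth_year": 1950, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "AB1", "birth_year": 1972, "sex": "M", "diagnosis": "diabetes"},
]

# Hypothetical external data that includes names, e.g. a public register.
external = [
    {"name": "Person A", "postcode": "AB1", "birth_year": 1950, "sex": "F"},
    {"name": "Person B", "postcode": "AB1", "birth_year": 1972, "sex": "M"},
    {"name": "Person C", "postcode": "AB1", "birth_year": 1972, "sex": "M"},
]


def link(anonymised, external, keys):
    """Guess a name for each anonymised row that matches exactly one person.

    A row is matched to an external person when they agree on every field
    in `keys`. Rows with zero or several matches are left without a guess.
    """
    guesses = {}
    for index, row in enumerate(anonymised):
        matches = [
            person["name"]
            for person in external
            if all(person[k] == row[k] for k in keys)
        ]
        if len(matches) == 1:
            guesses[index] = matches[0]
    return guesses


guesses = link(anonymised, external, ["postcode", "birth_year", "sex"])
print(guesses)
```

Only the first record ends up with a guess, because only one person in the external data fits it. The second record fits two people, so the guess fails.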