26 September 2014
Today I am attending the first Internet Privacy Engineering Network (IPEN) workshop, where the issue of translating data protection principles into engineering requirements has been raised a number of times. While this exercise needs to be repeated for each given service or application, it reminded me that I had drafted a number of generic Technical Requirements for Processing PII. These still need to be reviewed and validated, but I hope they offer at least proof that the problem can be made tractable.
30 March 2011
My team at Microsoft Research has spent the past six months grappling with the problem of privacy in next-generation energy systems. In parallel with the good honest scientific work, I also participated in the UK government consultation on smart metering, in writing and in person, specifically on the issue of privacy. Its conclusions have finally been made public (see DECC’s site and Ofgem’s detailed responses).
First, what is the problem? Smart meters are to be fitted in most homes, and they provide facilities for recording fine-grained readings of energy consumption. These are to be used for time-of-use billing, energy advice, the back-end settlement process, suppliers’ financial projections, fraud detection, customer service, and network management. The problem is that these readings are also personal data, and leak information about the occupancy of households, the devices used, habits, and so on. So here we have a classic privacy dilemma: where to strike the balance between the social value of sharing data (even mandating such sharing) and the intrusion into home life?
Or do we? As is often the case when privacy is framed as a balance, what is ignored is that we can use technology both to protect privacy and to extract value from the data. In fact, we show that no balancing act is necessary. We designed a host of privacy technologies to fulfil the needs of the energy industry (even the rather exotic ones) while preserving extremely high levels of privacy and user control. Let’s look at them in detail:
- We developed a set of protocols to perform computations on private data while maintaining a high degree of integrity and availability. These allow customers to calculate their bills, provide indicators of consumed energy value for use in settlement, route demand-response requests, and perform profiling to support network operation or even marketing. Our framework guarantees that the computations leak only their results to third parties, and also that those results are in fact derived from the real meter readings. The raw meter readings are not necessarily shared, but can be used locally on any user client to offer a rich experience, e.g. pretty graphs of consumption and comparisons with the neighbours. A non-technical overview is available as a white paper, a technical introduction for meter manufacturers is provided, and a preliminary technical report with all the crypto is also online.
- Sometimes it is important to aggregate information from multiple meters without revealing anything about the individual readings. The traditional approach has been to give all readings to a trusted third party that performs the aggregation and publishes only the sum. We show that a set of meters can in fact perform the aggregation without the need for a trusted party. This is simple, efficient and compact: the computations can be done inside the trusted meter, or outside it along with cryptographic verification. All details are available in our technical report on aggregation.
- Some smart meters may be deployed in extremely high-security settings, where even the final bill, or statistics aggregated over time, may leak information, and a positive guarantee that the leakage is limited might be necessary. We developed techniques inspired by differential privacy to inject noise into aggregate readings, guaranteeing that consumption in any specific time period is masked. Furthermore, we allow customers to recover the bulk of the costs through an oblivious cryptographic rebate system. Our technical report on differential privacy and rebates in metering is available.
- Finally, proving that protocols are correct is not sufficient, so we explored options for proving that actual implementations of the protocols do provide the necessary security and privacy properties. A report on the certified implementation of variants of the proposed protocols using refinement types is also available.
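To make the first bullet concrete, here is a toy sketch of commitment-based bill certification in the spirit of those protocols. Everything here is illustrative: the group is far too small to be secure, the parameter names are mine, and the actual protocols in the technical reports differ in detail.

```python
# Toy sketch of commitment-based bill certification (illustrative only:
# the group is tiny and insecure, and the real protocols differ in detail).
import random

p, q = 2039, 1019   # p = 2q + 1; a deployment would use a ~2048-bit group
g, h = 4, 9         # generators of the order-q subgroup; in practice h must
                    # be chosen so that nobody knows log_g(h)

def commit(m, r):
    """Pedersen commitment to m with randomness r: hiding and binding."""
    return (pow(g, m, p) * pow(h, r, p)) % p

readings = [3, 5, 2, 7]       # half-hourly consumption (kWh) -- made up
tariffs  = [10, 10, 25, 25]   # time-of-use prices (pence/kWh) -- made up

# The meter outputs one signed commitment per reading; the commitment
# randomness is shared only with the customer.
rands = [random.randrange(q) for _ in readings]
comms = [commit(m, r) for m, r in zip(readings, rands)]

# The customer computes the bill locally from the raw readings...
bill = sum(m * t for m, t in zip(readings, tariffs))
# ...and reveals only (bill, rho); the supplier never sees a reading.
rho = sum(r * t for r, t in zip(rands, tariffs)) % q

# Supplier-side check: the homomorphism C(m1)^t1 * C(m2)^t2 * ... yields a
# commitment to the tariff-weighted sum, i.e. to the bill itself.
combined = 1
for c, t in zip(comms, tariffs):
    combined = (combined * pow(c, t, p)) % p
assert combined == commit(bill, rho)

print(bill)   # 305 (pence)
```

The point of the exercise: the supplier can verify that the claimed bill really is derived from the committed meter readings, while learning nothing but the bill itself.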
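The trusted-party-free aggregation in the second bullet can be sketched with one simple instantiation of the idea: pairwise random masks that cancel in the sum. The protocol details (key agreement, verification) are in the technical report; this is only the core arithmetic, with made-up numbers.

```python
# Minimal sketch of aggregation without a trusted party: pairwise masks
# cancel in the sum, so the aggregator learns the total and nothing else.
import random

q = 2**61 - 1                 # modulus larger than any possible sum

readings = [12, 7, 31, 5, 19]   # one private reading per meter (made up)
n = len(readings)

# Each pair of meters (i, j) agrees on a random value; i adds it and j
# subtracts it, so all masks cancel in the grand total.
masks = [[0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        s = random.randrange(q)
        masks[i][j], masks[j][i] = s, -s

# Each blinded value individually looks uniformly random mod q.
blinded = [(readings[i] + sum(masks[i])) % q for i in range(n)]

# The aggregator sees only blinded values, yet recovers the exact total.
total = sum(blinded) % q
assert total == sum(readings)
print(total)   # 74
```

In practice the pairwise masks would be derived from shared keys rather than exchanged per round, which is what keeps the scheme compact enough to run inside the meter.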
The project web page on privacy in metering links to all these and more.
So much for the science; what about the engagement with government? On the positive side, our rather limited goal has been achieved: we wanted to put privacy technologies, which provide solutions beyond the dilemmas and the balancing of privacy against value, on the map. The government response to the consultation takes note, in a limited way, of the potential use of privacy technologies. On page 10 it shyly mentions that:
“2.18. Work is in process to understand the options for aggregating or anonymising smart metering data and whether it is necessary for the data to be accessed by a party that carries out the data minimisation. Privacy enhancing technology can potentially enable anonymised or aggregated data to be provided without any party having access to the personal data itself. The programme will work with industry and academics in order to explore the applicability of privacy enhancing technologies within the smart metering system.”
This is actually a rather fair representation of the capabilities of the technology, even if it is presented as a far away goal, rather than the concrete protocols we have proved correct and the implementations we have built.
Paragraph 2.18, mentioning privacy technology, is a ray of light amidst an otherwise ambivalent government response. On the up side, it recognizes energy consumption as private data from the outset, it mandates that meters hold 13 months of consumption data and provide local access to it, and it narrowly defines the scope of data that can be gathered without explicit consent and puts it under the data protection regime. On the down side, there is confused language about what constitutes personal data (2.17), and the final technical solution involves collecting data in the clear through a centralised system (the glorious DCC) and protecting it using access control: a far cry from what we know is possible in terms of technical privacy protection.
The metering privacy geeks (legal & technical) might also find other interesting nuggets in this report:
- It mentions privacy-by-design, but without support for privacy technologies (except a mention of aggregation in 2.14). This is a damaging trend set by the Ontario report on privacy in the smart grid, which takes a purely management approach to privacy in the local smart-grid deployment. A response to this trend is provided by Prof. Claudia Diaz and her colleagues, highlighting the technical protections necessary to engineer privacy-by-design. This is only the start of this tussle.
- The report seems to suggest that personal data is not personal if it is not readily identifiable by the data controller (sect. 2.17 and 3.7). This is the classic argument about “de-identified” personal data: does de-identified mean that the data controller cannot identify it, or that no one in the world can? It seems the government is as confused as everyone else on this matter.
- The key outcome of the consultation is that the energy industry needs some data to perform “regulated duties”. This concept was present in the initial consultation, but funnily enough there was no description of what those duties were. It transpired in meetings that Ofgem was not in fact clear about what they were, and a large part of the consultation centred around fleshing them out. A list of those duties is available in Appendix 3 of the report, and will probably be welcomed by all (a similar list is available in the NIST privacy reports).
- So (in 3.15) the government concedes that industry must have access by default to the data necessary to perform its regulated duties, yet this data should be subject to the DPA requirements (3.16, for example, specifically invokes principle 5: that data should not be kept longer than necessary). Well, that is a minefield: it is clear that the data is collected for a specified purpose (principle 2). If the other principles are also applied, it means the data should not be used without explicit consent for other purposes (*cough*added value services*cough*), and furthermore it should not be excessive for the stated purpose. Well, here we are: our technical reports offer ways in which most of the stated purposes in Appendix 3 could be fulfilled without collecting the data at all. Is this a contradiction? Not automatically. The government’s view is clearly that our proposed protocols are not yet ready for prime time; of course, as these technologies become better known and deployed, this objection will evaporate. Will the data minimisation requirement then mandate the use of privacy technologies? This is a rhetorical question at the moment.
- It is interesting to note that the restrictions limiting the automatic collection of data by suppliers were possibly put in place on the grounds of market competition rather than privacy per se (section 3.32). Automatic collection by suppliers would put them in an advantageous position vis-a-vis third-party providers of value-added services. This is an open issue (3.36).
- The government is keen for a local repository of consumption data in the meter (4.6) and the use of geeky toys to visualize it (4.12). This is the setting in which our solutions enable strong privacy guarantees. That is positive, if only half-way.
In conclusion, the debate around privacy in metering has been informed by consumer concerns, privacy concerns, industry needs and technology alternatives, and all of these are represented in the government response. Yet the final solution is rather conservative: it relies on a centralised conduit for personal information, protected by layers of access control and management, far from what we know is possible with privacy technologies. The argument today is that those technologies are too new, which is questionable given how quickly IT innovations are brought to market. This argument will lose its potency in the long term if we keep developing and deploying privacy-friendly solutions.
18 August 2008
Last month I attended the Privacy Enhancing Technologies Symposium (PET 2008) in Leuven, Belgium. The program was fantastic, with a strong focus on anonymous communications and many papers on traffic analysis. The associated HotPETs event was also very fun, with plenty of time for discussion, and the added advantage that all the papers are on-line.
A paper that was bound to catch my attention was entitled “Breaking and Provably Fixing Minx”, by Erik Shimshock, Matthew Staats, and Nicholas Hopper; it shows an attack against the Minx scheme that Ben Laurie and I had proposed back in 2004. Minx is a cryptographic packet format to be used by anonymous remailers (or mixes) for high-latency, email-like communication. It was designed to be space-efficient, meaning that we radically cut down on the padding and redundancy within the packet, and used raw RSA.
That last use of raw RSA proved to be a bridge too far: recent results show that all bits of RSA are hardcore, meaning that without the private key you cannot guess any bit of the plaintext better than at random. Sadly, the converse is also true: if you can learn even a single bit of the plaintext with non-negligible advantage, there is a polynomial-time algorithm to recover the entire plaintext.
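One ingredient of the trouble raw RSA invites is easy to show: unpadded RSA ciphertexts are malleable, so an active attacker can transform them predictably without knowing the key. A toy illustration with textbook-sized (and utterly insecure) parameters:

```python
# Toy illustration of raw (unpadded) RSA malleability. The parameters are
# the classic textbook-sized example, purely for demonstration.

p, q = 61, 53
n = p * q                 # modulus: 3233
e, d = 17, 2753           # public / private exponents; e*d = 1 mod (p-1)(q-1)

m = 42                    # the victim's plaintext
c = pow(m, e, n)          # raw RSA encryption, with no padding

# Raw RSA is multiplicatively homomorphic: E(a) * E(b) = E(a * b) mod n.
# Multiplying the ciphertext by 2^e turns E(m) into E(2m).
c_forged = (c * pow(2, e, n)) % n
assert pow(c_forged, d, n) == 2 * m

print(pow(c_forged, d, n))   # 84: the forgery decrypts predictably
```

Non-malleable padding such as RSA-OAEP blocks exactly this kind of manipulation; Minx’s space savings came precisely from stripping out such padding and redundancy.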