I just sat through the first session of PETS 2009, which was about privacy policies and where two really interesting pieces of research were presented.

Ram presented work on “Capturing Social Networking Privacy Preferences” [pdf], in which he proposes to automatically infer privacy policies for social networks, and to present them as templates or starting points for users to define their own policies. The methodology used is really neat: they record the location of a number of users, and every night they ask the users whether they would be happy to share their locations with their different social circles. Then they try to extract a set of standard policies, based on time, location, and the type of contact that can see your location.

The second study, presented by Aleecia, is on how easy and pleasing it is to read privacy policies (“A Comparative Study of Online Privacy Policies and Formats”). They find that privacy policies in different formats are more or less easy to read and understand, but across the board privacy policies are difficult to understand, easy to misunderstand, and totally unpleasant to read.

I have just come back from a visit to COSIC at K.U. Leuven, where I taught a course on Computer Security. Claudia Diaz and I discussed over lunch the idea of putting together a syllabus for Privacy Technologies. Many in this field have been teaching courses and giving guest lectures, but there does not yet seem to be a canonical curriculum describing what an advanced course in Privacy Technology should teach.

Here is my attempt at proposing such a syllabus — which I will probably revise after discussions at PETS 2009 next week.

  1. An introduction to Privacy Technology
    An overview of the basic concepts, different fields like technology and law, motivation, threat models, Soft versus Hard privacy technology.
    Slides from the 2007 COSIC course: Introduction to Privacy Technology [pdf]
    (Claudia Diaz has vastly improved these slides to present a lecture on the same topic in this year's COSIC course.)
  2. Privacy in authentication
    Modern authentication protocols, initiator privacy and responder privacy, JFKi and JFKr examples, secure password authentication, PAK.
    Slides from Estonia computer security course in 2007: Secure authentication [pdf, start at slide 3]
  3. Selective Disclosure Credentials
    Zero knowledge proofs, selective disclosure for discrete logs, Brands credentials, CL signatures and CL credentials, e-cash, abuse prevention.
    Slides from Estonia computer security course in 2007: Anonymous credentials [pdf, start at slide 45]
  4. Anonymous communications
    Proxies, Crowds, DC networks, mix networks and onion routing.
    Slides from ITE talk in 2006: Introducing Anonymous Communications [pdf]
  5. An introduction to traffic analysis
    History, identification, information extraction, military applications, internet security, traffic data availability
    Slides from SantaCrypt 2005 in Prague: An introduction to traffic analysis [pdf]
  6. The traffic analysis of anonymous communications
    Cryptographic attacks, long term intersection and disclosure, short term disclosure, bridging, network discovery.
    Slides from Umass Amherst talk: Introduction to traffic analysis [pdf]
  7. Privacy in Databases
    Inference control, k-anonymity, differential privacy, perturbation, trackers.
    Slides from CFP 2007: Privacy in Databases [pdf, start at slide 58]
  8. Privacy in Storage
    Encrypted storage, steganographic storage, remote storage, traffic analysis of storage protocols.
  9. Secure Elections
    Electronic voting technologies, secure crypto elections, manual zero-knowledge proofs, receipt freeness, robust mixing.
  10. Censorship resistance and availability
    Blocking technologies, counter-blocking technologies, RF technologies, peer-to-peer file sharing, decentralisation and reputation technologies, sybil attacks.
  11. Location privacy
    Location based services, Mix zones, ad-hoc network privacy, location privacy friendly location services (PriPAYD), charging schemes.
  12. Identity management protocols
    Federated identity management, Liberty, InfoCards, OpenID, PRIME project concepts, privacy policies, P3P, SecPAL.
  13. Economic, legal and policy issues of Privacy Technology
    Privacy economics & attitudes, data protection, data retention, interception by design, lawful access, coercion, privacy as a right, health information.

Note that the order of the topics is arbitrary, and mostly related to which slides I already have available. One could start with less technical subjects and then go on to the more cryptographic and statistical topics. If anyone has any nice pointers to slide decks for the topics that have none for the moment, I would appreciate them.

Lords recommend PETs

6 February 2009

The House of Lords Constitution Committee has just published a report on Surveillance: Citizens and the State, as well as the evidence they heard. As part of their recommendations they push for Privacy Enhancing Technologies to be part of the procurement process of government projects. In particular they say:

485. We recommend that the Government review their procurement processes so as to incorporate design solutions that include privacy-enhancing technologies in new or planned data gathering and processing systems. (paragraph 349)

They also push, albeit in an indirect way, for privacy enhanced identification schemes and ID cards, citing the example of Austria. This is basically a recommendation to implement selective disclosure credential technologies:

478. We recommend that the Government’s development of identification systems should give priority to citizen-oriented considerations. (paragraph 268)

Which refers to:

268. The Information Commissioner’s Office (ICO) drew attention to the use in Austria of a system of identification numbers that allows access to information in different databases “without the need for a single widely known personal identification number that may be misused.” (p 5) The Royal Academy of Engineering (RAE) explained that it is possible for individuals to fulfil their legitimate need or desire to maintain multiple roles or identities in transactions with state or other organisations and to avoid the possibility of those organisations needlessly correlating them. The technology involved in identification can be developed to suit an individual’s preference to keep domestic status and work life separate, where the protection of identity is necessary to avoid abusive relationships or stalking, or where witnesses and children need protection. We recommend that the Government’s development of identification systems should give priority to citizen-oriented considerations.

This is all good news! It is indeed at the procurement phase that such requirements for PETs should be specified and entrenched in the delivery contracts. Negotiating PETs for complex surveillance technologies will also make the cost of recording data just-in-case visible.
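To make the idea of sector-specific identifiers quoted above a little more concrete, here is a minimal sketch. It is not the algorithm used in Austria, only an illustration of the general principle under my own assumptions: the function name `sector_identifier`, the use of HMAC-SHA256, and the sector labels are all hypothetical. Each organisation sees a stable identifier for the citizen, but identifiers from different sectors cannot be correlated without the secret source identifier.

```python
import hashlib
import hmac


def sector_identifier(source_secret: bytes, sector: str) -> str:
    """Derive a per-sector pseudonym from a secret source identifier.

    Hypothetical construction for illustration: HMAC-SHA256 keyed with the
    citizen's secret source identifier, applied to the sector label.
    """
    return hmac.new(source_secret, sector.encode("utf-8"), hashlib.sha256).hexdigest()


# Assumed example values; a real source identifier would be kept on a
# card or token under the citizen's control.
source = b"example-secret-source-identifier"
tax_id = sector_identifier(source, "tax")
health_id = sector_identifier(source, "health")

# The same sector always yields the same identifier, while different
# sectors yield identifiers that look unrelated to each other.
print(tax_id == sector_identifier(source, "tax"))
print(tax_id != health_id)
```

Selective disclosure credentials go further than this sketch, since they also let the citizen prove statements about attributes without revealing the identifier at all.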

Carmela Troncoso points to an article in “Telematics Update” about Norwich Union (NU) discontinuing their pilot pay-as-you-drive scheme. Despite them being rather coy about the exact reasons, some experts guess:

“Strategy Analytics analyst, Clare Hughes, said that prohibitive launch costs, privacy violations, patent fees, back-office data integration and difficulties in measuring costs versus benefits would inhibit the immediate widespread launch of PAYD schemes.”

There are two items of interest in this list, one obvious and one less so. The obvious one concerns privacy violations, and the perception from drivers that the insurance company and anyone who can get hold of the data can spy on their every movement. This fear is to some extent justified, which led us to propose PriPAYD, to alleviate exactly those privacy concerns. It is good to see it proved once more that privacy technology is an enabler, although sadly NU learned this the hard way.

More subtly, the “back-office data integration” also relates to privacy. Requiring every data point to be transmitted to a central server, and building a gigantic silo of location data, comes at a price. Processing this information to extract billing data, or even worse securing and managing access to it, is an expensive business. If PAYD providers stuck to their core business model, i.e. providing insurance by the mile, type of road and time of day, they could get rid of data as soon as it is processed, reducing the costs of storage, further processing and management.
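As a rough illustration of billing "by the mile, type of road and time of day" without keeping a location trail, here is a minimal sketch. The tariff values, the `Segment` fields and the `LocalMeter` class are all assumptions made up for this example, not part of any real PAYD product or of the PriPAYD design itself. The point is only that the running total is the sole thing retained after each segment is priced.

```python
from dataclasses import dataclass

# Hypothetical tariff in pence per km, keyed by (road type, time of day).
TARIFF = {
    ("motorway", "day"): 2,
    ("motorway", "night"): 3,
    ("urban", "day"): 4,
    ("urban", "night"): 6,
}


@dataclass
class Segment:
    km: float        # distance driven in this segment
    road_type: str   # "motorway" or "urban" in this sketch
    period: str      # "day" or "night" in this sketch


class LocalMeter:
    """Prices each segment as it arrives and keeps only the running total."""

    def __init__(self) -> None:
        self.total_pence = 0.0

    def record(self, segment: Segment) -> None:
        rate = TARIFF[(segment.road_type, segment.period)]
        self.total_pence += segment.km * rate
        # The segment itself is not stored anywhere: once priced, it is dropped.


meter = LocalMeter()
meter.record(Segment(km=10.0, road_type="motorway", period="day"))
meter.record(Segment(km=5.0, road_type="urban", period="night"))
print(meter.total_pence)
```

Only `total_pence` would need to reach the insurer at the end of the billing period, so there is no silo of location data to store, secure or integrate.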

Interestingly, the article points out that the future of PAYD rests in the services area: roadside assistance, emergency help, etc. This points towards an integration of the PAYD box with other components of a car, making it in fact part of a more general computing platform. Again, let's hope that this platform will be user centric, and will not be emitting a location trail to third parties.

For a long time I have been sceptical about Private Information Retrieval (PIR) schemes and security schemes based on them. My first experience of PIR was in the single server setting, where communication and computation complexity make them impractical. Re-reading The Pynchon Gate, I realized that multi-server PIR systems are computationally cheap, bandwidth efficient and relatively simple to implement.

The ‘only’ downside of multi-server PIR schemes is that they are subject to compulsion attacks. A powerful adversary can force servers, after a query, to reveal the client queries, and can infer which document was retrieved. This is an inherent limitation of using a collection of trusted parties, so it is difficult to eliminate. On the other hand, a system can make the task of the attacker much more expensive and difficult, through the use of forward security mechanisms.

Here is a proposal for achieving forward-secure compulsion-resistant multi-server PIR: the user contacts the servers one by one, using an encrypted channel providing forward secrecy (OTR would work; so would SSL using signed ephemeral DH). After the result of the query is returned, the server securely deletes all information about the query, and forgets the session keys associated with the channel. At this point an adversary will never be able to retrieve any information about the query or the result, even if they get access to all the secrets on the server.

The user can then proceed to perform the same protocol sequentially with all the other servers participating in the PIR scheme. After sessions with each server close, the user is guaranteed that the query information will never be retrieved in the future. A single honest server, willing to provide strong guarantees against compulsion, is sufficient to guarantee this property, even if all the others log requests and are ready to hand them over to the adversary.

Furthermore, the sequential nature of the requests allows a client to terminate the query early, if there is any suspicion that one or more servers act under compulsion. This could be detected through a covert channel, a change of key, or unavailability. This technique is a further argument for operators to terminate their services instead of giving in to compulsion.
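To see why a single honest, forgetful server is enough, it helps to look at how a simple multi-server PIR query is built. The sketch below uses the classic XOR-based construction, which I am assuming here for illustration; the function names and the toy database are made up, and the forward-secret channels and secure deletion are not modelled. Each server receives a bit vector that, on its own, is uniformly random, and only the XOR of all the vectors points at the wanted block.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_queries(n_blocks: int, index: int, n_servers: int) -> list:
    """Build one bit vector per server; their XOR is the unit vector at `index`."""
    queries = [
        [secrets.randbelow(2) for _ in range(n_blocks)]
        for _ in range(n_servers - 1)
    ]
    last = [0] * n_blocks
    last[index] = 1
    for q in queries:
        last = [a ^ b for a, b in zip(last, q)]
    queries.append(last)
    return queries


def server_answer(database: list, query: list) -> bytes:
    """A server XORs together the blocks selected by the query bits."""
    answer = bytes(len(database[0]))
    for block, bit in zip(database, query):
        if bit:
            answer = xor_bytes(answer, block)
    return answer


def retrieve(database: list, index: int, n_servers: int = 3) -> bytes:
    """Contact the servers one by one and combine their answers."""
    result = bytes(len(database[0]))
    for query in make_queries(len(database), index, n_servers):
        # In the proposal above, each of these calls would run over its own
        # forward-secret channel, and the client could stop here on suspicion.
        result = xor_bytes(result, server_answer(database, query))
    return result


# Toy replicated database of equal-sized blocks (assumed for the example).
database = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
print(retrieve(database, 2))
```

An adversary who later compels all servers but one learns only random-looking vectors; without the missing vector, which the honest server has deleted, the index of the retrieved block stays hidden.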

The previous post, pointing fingers at papers making unreasonable assumptions about Distributed Hash Tables, created a bit of controversy in the Cambridge Security group mailing lists. One of the most valid comments (made by Steven Murdoch over an Indian meal) was that no one knows how to secure systems against Sybil attacks. A few years ago Chris Lesniewski-Laas, M. Frans Kaashoek, Ross Anderson and I worked on using social links between peers to alleviate the Sybil attack in DHTs. The proposed solution was cumbersome, but the idea was clearly worth pursuing.

I am most glad to see that Chris Lesniewski-Laas has worked further on these ideas, and will be presenting a paper on the topic very soon:

A Sybil-proof one-hop DHT.
Chris Lesniewski-Laas. To appear. In Proceedings of the Workshop on Social Network Systems, Glasgow, Scotland, April 2008.

Abstract

“Decentralized systems, such as structured overlays, are subject to the Sybil attack, in which an adversary creates many false identities to increase its influence. This paper describes a one-hop distributed hash table which uses the social links between users to strongly resist the Sybil attack. The social network is assumed to be fast mixing, meaning that a random walk in the honest part of the network quickly approaches the uniform distribution. As in the related SybilLimit system, with a social network of n honest nodes and m honest edges, the protocol can tolerate up to o(n/log n) attack edges (social links from honest nodes to compromised nodes). The routing tables contain O(√m log m) entries per node and are constructed efficiently by a distributed protocol. This is the first sublinear solution to this problem. Preliminary simulation results are presented to demonstrate the approach’s effectiveness.”

The recent leak of the existence of malware that the German police might be using to intercept encrypted Skype and SSL calls has already made quite a bit of noise. It clearly suggests that the surveillance battlefield is shifting from the network to the end-host, where information and keys can be found in the clear. Yet an interesting issue that is discussed in the letters relates to getting hold of the intercept material: in the old days it was gathered from the network, whereas in the proposed architecture it is sitting on the victims’ machines.

Interestingly the proposed way of getting hold of the plaintext is through the use of an anonymous proxy! This ensures that even if a data-flow is detected by the victim, the agent doing the wiretap or the agency is not detected. Further good advice includes using a relay in a different jurisdiction to make tracing even harder, and add further uncertainty about the originator of the attack.

It is fascinating to see how, once again, law enforcement finds traffic analysis resistant communications key to their operational success. Already established standards for interception interfaces (from ETSI) stipulate that the delivery of intercept material has to be unobservable to anyone not authorised, even within the telco provider. The federal Trojan architecture pushes this even further, by requiring the malware to leak information in an unobservable manner.

The arms race is only starting…

Claudia Diaz just forwarded an email by Eric Rescorla pointing to an article in Wired describing how the FBI has been gaining access to telephone traffic data without a warrant. A saucy excerpt:

The revelation is the second this year showing that FBI employees bypassed court order requirements for phone records. In July, the FBI and the Justice Department Inspector General revealed the existence of a joint investigation into an FBI counter-terrorism office, after an audit found that the Communications Analysis Unit sent more than 700 fake emergency letters to phone companies seeking call records. An Inspector General spokeswoman declined to provide the status of that investigation, citing agency policy.

[...]

The message was sent to an employee in the FBI’s Operational Technology Division by a technical surveillance specialist at the FBI’s Minneapolis field office — both names were redacted from the documents. The e-mail describes widespread attempts to bypass court order requirements for cellphone data in the Minneapolis office.

Remarkably, when the technical agent began refusing to cooperate, other agents began calling telephone carriers directly, posing as the technical agent to get customer cellphone records.

The interesting point here is how agents seem to have been abusing the lawful access process, by pretending to be a colleague with legal authority, in order to get phone companies either to hand over records of calls and locations of phones, or to turn on surveillance equipment. A similar scandal had broken out in Chicago back in 2006, when it became known that insiders in phone companies had been selling phone records to the FBI as well as private entities: the police were then concerned that such information might be used by the mob to out informants.

The people from Wikileaks have uncovered the inventory of equipment the US uses in Afghanistan and Iraq. Part of the surprise was the widespread use, and dominant cost, of Warlock Green and Warlock Red Jammers (a nice presentation on how they work). The presentation gives a strong hint that the use of jammers against IEDs is limited, particularly since a device could be programmed to detect the jamming signal and use it as a range finder to control detonation.

There also seem to be about 44 TACLANE KG-175 E100AC routers deployed in Afghanistan alone, which overlay cryptographic protection, and presumably traffic analysis prevention, on public IP and ATM networks. Each of those boxes comes with a $10K price tag, proving once more that selling security pays well.