For a long time I have been sceptical about Private Information Retrieval (PIR) schemes and security schemes based on them. My first experience of PIR was in the single-server setting, where the communication and computation costs make it impractical. Re-reading The Pynchon Gate I realized that multi-server PIR systems are computationally cheap, bandwidth efficient and relatively simple to implement.
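
To make the "relatively simple to implement" claim concrete, here is a minimal sketch of XOR-based multi-server PIR over a replicated database of equal-sized blocks. It is only an illustration under my own simplifications, not the exact Pynchon Gate construction, and all function names are made up.

```python
import secrets

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_queries(num_blocks, wanted_index, num_servers=2):
    """Split the selection vector for `wanted_index` into random bit-vector
    shares, one per server; each share on its own is uniformly random."""
    shares = [[secrets.randbits(1) for _ in range(num_blocks)]
              for _ in range(num_servers - 1)]
    last = []
    for i in range(num_blocks):
        bit = 1 if i == wanted_index else 0
        for share in shares:
            bit ^= share[i]
        last.append(bit)
    return shares + [last]

def answer_query(database, query_bits):
    """Server side: XOR together the blocks selected by the query bits."""
    result = bytes(len(database[0]))
    for block, bit in zip(database, query_bits):
        if bit:
            result = xor_bytes(result, block)
    return result

def recombine(answers):
    """Client side: XOR the per-server answers to recover the wanted block."""
    result = answers[0]
    for a in answers[1:]:
        result = xor_bytes(result, a)
    return result

# Toy run: a database of 4 blocks of 8 bytes, replicated on 2 servers.
db = [bytes([i]) * 8 for i in range(4)]
queries = make_queries(num_blocks=4, wanted_index=2)
answers = [answer_query(db, q) for q in queries]
assert recombine(answers) == db[2]
```

Each server only performs XORs over the database, and the traffic is one bit per block upstream and one block downstream per server, which is what makes the multi-server setting so much cheaper than single-server PIR.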

The ‘only’ downside of multi-server PIR is that it is subject to compulsion attacks. A powerful adversary can force the servers, after a query, to reveal the client queries, and can then infer which document was retrieved. This is an inherent limitation of using a collection of trusted parties, so it is difficult to eliminate. On the other hand, a system can make the task of the attacker much more expensive and difficult through the use of forward-security mechanisms.

Here is a proposal for achieving forward-secure compulsion-resistant multi-server PIR: the user contacts the servers one by one, using an encrypted channel providing forward secrecy (OTR would work; so would SSL using signed ephemeral DH.) After the result of the query is returned, the server securely deletes all information about the query, and forgets the session keys associated with the channel. At this point an adversary will never be able to retrieve any information about the query or the result, even if they get access to all the secrets on the server.

The user can then proceed to perform the same protocol sequentially with all the other servers participating in the PIR scheme. Once the session with each server has closed, the user is guaranteed that the query information can never be retrieved in the future. A single honest server, willing to provide strong guarantees against compulsion, is sufficient to ensure this property, even if all the others log requests and are ready to hand them over to the adversary.

Furthermore, the sequential nature of the requests allows a client to terminate the query early if there is any suspicion that one or more servers are acting under compulsion. This could be detected through a covert channel, a change of key, or unavailability. This technique is a further argument for operators to terminate their services instead of giving in to compulsion.
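
For concreteness, here is a rough sketch of the client-side flow of this proposal. The channel primitives (`open_channel`, `peer_key`, and so on) are hypothetical stand-ins for an OTR or ephemeral-DH TLS session, not a real API; only the sequential query-and-abort logic is the point.

```python
class CompulsionSuspected(Exception):
    """Raised when a server is unreachable or its long-term key has changed."""

def query_servers_sequentially(servers, queries, open_channel, expected_keys):
    """Send one PIR query share to each server in turn over a forward-secret
    channel, aborting early if any server looks like it is under compulsion."""
    answers = []
    for server, query in zip(servers, queries):
        try:
            channel = open_channel(server)       # hypothetical ephemeral-DH handshake
        except ConnectionError:
            raise CompulsionSuspected(f"{server} unavailable")
        if channel.peer_key() != expected_keys[server]:
            channel.close()
            raise CompulsionSuspected(f"key change at {server}")
        channel.send(query)
        answers.append(channel.recv())
        channel.close()                          # the client forgets the session keys;
        # an honest server deletes the query and its own session keys at this point,
        # so neither side can later be compelled to reveal the exchange.
    return answers
```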

The list of accepted papers for the Privacy Enhancing Technologies Symposium for 2008 (PET) has just been published. The program this year is very heavy on anonymous communication and traffic analysis, cementing the symposium as the de-facto venue in these disciplines.

A selection of papers to appear that make for very interesting reading (thanks to the authors for giving me a copy!) includes:

  • Studying Timing Analysis on the Internet with SubRosa
    Hatim Daginawala and Matthew Wright (University of Texas at Arlington, USA)
  • Perfect Matching Disclosure Attacks
    Carmela Troncoso, Benedikt Gierlichs, Bart Preneel, and Ingrid Verbauwhede (COSIC, K.U. Leuven, Belgium)
  • On the Impact of Social Network Profiling on Anonymity
    Claudia Diaz, Carmela Troncoso (K.U.Leuven, Belgium), and Andrei Serjantov (The Free Haven Project, UK)
  • Chattering Laptops
    Tuomas Aura (Microsoft Research, UK), Janne Lindqvist (Helsinki University of Technology, Finland), Michael Roe (Microsoft Research, UK), and Anish Mohammed (Royal Holloway, University of London, UK)
  • Metrics for Security and Performance in Low-Latency Anonymity Systems
    Steven J. Murdoch and Robert N. M. Watson (Computer Laboratory, University of Cambridge, UK)

I should blog in detail about our own contributions, when the final drafts are in, so that I can provide a link to the full text:

  • Bridging and Fingerprinting: Epistemic Attacks on Route Selection
    George Danezis (Microsoft Research, Cambridge, UK) and Paul Syverson (Naval Research Laboratory, USA)
  • How to Bypass Two Anonymity Revocation Schemes
    George Danezis (Microsoft Research, Cambridge, UK) and Len Sassaman (K.U. Leuven, Belgium)

Let's not forget that the deadline for the 2008 PET Award is still ahead of us, and good papers from the last two years deserve to be nominated for it!

I was kindly invited by Peeter Laud and Jan Villemson to teach part of the Tartu University course on cryptographic protocols. The theme chosen for the 6 hours of lectures was “Identity and Anonymity”.

It started with an introduction to authentication protocols, illustrated by the state-of-the-art Just Fast Keying, an authenticated Diffie-Hellman protocol with DoS protection and privacy, as well as the PAK protocol for authenticated key exchange using a short password (like a PIN). The second 1.5 hours of the lecture series went into the technical details of Brands' anonymous credentials, building from simple zero-knowledge protocols, such as Schnorr identification, all the way to proving that a set of linear relations holds on some attributes.

  • Lecture notes on authentication protocols and anonymous credentials (PDF, PPTX)
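
As a reminder of the simplest of those building blocks, here is a toy sketch of the Schnorr identification protocol; the group parameters are deliberately tiny and purely illustrative.

```python
import secrets

# Toy parameters: p = 2q + 1 with q prime, and g generates the subgroup of order q.
# (Real deployments use groups of at least 2048/256 bits.)
p, q, g = 983, 491, 4

# Prover's long-term secret x and public key y = g^x mod p.
x = secrets.randbelow(q)
y = pow(g, x, p)

# 1. Commitment: the prover picks a random r and sends t = g^r.
r = secrets.randbelow(q)
t = pow(g, r, p)

# 2. Challenge: the verifier sends a random c.
c = secrets.randbelow(q)

# 3. Response: the prover sends s = r + c*x mod q.
s = (r + c * x) % q

# 4. Verification: g^s must equal t * y^c mod p.
assert pow(g, s, p) == (t * pow(y, c, p)) % p
```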

The second half of the lecture series (3 hours) was devoted to anonymous communications — my pet subject. It contained a review of the Dining Cryptographers protocol for unconditional anonymity, as well as the basic mix constructions used to build practical systems. It also described how onion routing works, how to measure anonymity, and the basics of disclosure attacks on repeated communications.

  • Lecture notes on anonymous communications and traffic analysis (PDF, PPTX)
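
For reference, a minimal sketch of a single Dining Cryptographers round with pairwise shared pads; the group size and message framing are my own illustrative choices.

```python
import os

def dc_round(num_participants, sender_index, message, length=16):
    """One DC-net round: each pair shares a random pad; every participant
    broadcasts the XOR of its pads, and the sender additionally XORs in the
    message.  The XOR of all broadcasts reveals the message, but not who sent it."""
    # Pairwise shared pads: pads[i][j] == pads[j][i].
    pads = [[None] * num_participants for _ in range(num_participants)]
    for i in range(num_participants):
        for j in range(i + 1, num_participants):
            pads[i][j] = pads[j][i] = os.urandom(length)

    broadcasts = []
    for i in range(num_participants):
        out = message if i == sender_index else bytes(length)
        for j in range(num_participants):
            if j != i:
                out = bytes(a ^ b for a, b in zip(out, pads[i][j]))
        broadcasts.append(out)

    # Combining all broadcasts cancels every pad, leaving only the message.
    result = bytes(length)
    for b in broadcasts:
        result = bytes(a ^ x for a, x in zip(result, b))
    return result

assert dc_round(3, sender_index=1, message=b"meeting at noon!") == b"meeting at noon!"
```

A real system also needs slot reservation so that two senders do not collide in the same round.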

I just finished reading:

Anonymous Networking amidst Eavesdroppers (PDF)
by Parvathinathan Venkitasubramaniam, Ting He, and Lang Tong.
Pre-print available as arXiv:0710.4903v1 at arxiv.org, October 2007.

This piece of work has the potential to become a milestone in anonymity research. It puts forward, for the first time, an analytical model of anonymous communication capacity for a given quality of service, in terms of latency and throughput.

The key idea behind the anonymity mechanism is that relays transmit according to a schedule that is statistically independent of the messages they receive. This ensures that no traffic characteristics propagate through the network to allow tracing. When it is time to send a message out, a node checks whether an input message is in the queue, and otherwise sends a dummy. Input messages in the internal queues expire after a certain time to ensure that messages are delivered in a timely fashion (introducing unreliability, and thus a reduction in capacity, instead.)
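
Here is a rough simulation sketch of that relaying discipline as I read it: transmission times are drawn independently of arrivals, queued messages expire after a deadline, and dummies fill the empty slots. The Poisson schedule and the parameters are my own simplification, not the exact model of the paper.

```python
import random
from collections import deque

def simulate_relay(arrival_times, rate=1.0, deadline=2.0, horizon=50.0):
    """Transmit at Poisson times chosen independently of the input stream.
    At each transmission slot, forward the oldest unexpired queued message
    if there is one, otherwise send a dummy."""
    random.seed(0)
    arrivals = deque(sorted(arrival_times))
    queue = deque()                               # arrival times of queued messages
    forwarded, dropped, dummies = 0, 0, 0

    t = 0.0
    while t < horizon:
        t += random.expovariate(rate)             # next transmission, input-independent
        while arrivals and arrivals[0] <= t:      # enqueue everything that has arrived
            queue.append(arrivals.popleft())
        while queue and t - queue[0] > deadline:  # expired: unreliability instead of delay
            queue.popleft()
            dropped += 1
        if queue:
            queue.popleft()
            forwarded += 1
        else:
            dummies += 1                          # cover traffic keeps the output schedule fixed
    return forwarded, dropped, dummies

print(simulate_relay(arrival_times=[0.5, 0.6, 5.0, 20.0]))
```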

The measure of anonymity used is entropic (but different from previous ones), in that it measures the fraction of information gained. Say the adversary has H(S) bits of uncertainty before observing the system, and H(S|Y) afterwards; the metric of security is defined as a = H(S|Y) / H(S). It is cute that Fano’s inequality allows one to calculate, from this metric, the probability that the adversary identifies the correct set of communicating partners.
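
A small sketch of how the metric and the Fano bound could be computed; the toy joint distribution below is made up, and only the formula a = H(S|Y)/H(S) and the textbook form of Fano's inequality come from the paper.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def conditional_entropy(joint):
    """H(S|Y) from a joint distribution given as {(s, y): probability}."""
    p_y = {}
    for (s, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    return -sum(p * math.log2(p / p_y[y]) for (s, y), p in joint.items() if p > 0)

def anonymity_metric(joint):
    """a = H(S|Y) / H(S): the fraction of the adversary's uncertainty remaining."""
    p_s = {}
    for (s, y), p in joint.items():
        p_s[s] = p_s.get(s, 0.0) + p
    return conditional_entropy(joint) / entropy(p_s.values())

def fano_error_lower_bound(h_s_given_y, num_sources, steps=10000):
    """Smallest error probability Pe consistent with Fano's inequality:
    H(S|Y) <= H_b(Pe) + Pe * log2(M - 1), found by a simple numeric scan."""
    def rhs(pe):
        h_b = 0.0 if pe in (0.0, 1.0) else (-pe * math.log2(pe)
                                            - (1 - pe) * math.log2(1 - pe))
        return h_b + pe * math.log2(num_sources - 1)
    return next((i / steps for i in range(steps + 1)
                 if rhs(i / steps) >= h_s_given_y), 1.0)

# Toy joint distribution: 4 equiprobable sources; the observation Y leaks a
# little about the first two of them.
joint = {(s, y): 0.25 * ((0.8 if y == s else 0.2) if s < 2 else 0.5)
         for s in range(4) for y in range(2)}
print("a =", anonymity_metric(joint))
print("Fano lower bound on error:", fano_error_lower_bound(conditional_entropy(joint), 4))
```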

Excellent, highly recommended, work!

I just read a nice paper entitled “P2P: Is Big Brother Watching You?” by Banerjee, Faloutsos and Bhuyan at UC Riverside. They present experiments to determine the probability that a P2P file sharer stumbles upon an IP address thought to be used by anti-P2P entities, potentially to launch lawsuits. Interestingly, without the use of blacklists the probability is very close to 100%, while even simple filtering brings it down to 1%.

What is of interest to the traffic analyst are the techniques used for both surveillance and counter-surveillance. Infiltration of the P2P network can be thought of as an active, Sybil attack, and could potentially be facilitated through the identification of nodes with higher degree or other structural properties. This seems not to be the case, and P2P super-nodes turn out to have the same probability of being visited as normal peers.

The second point of interest is the use of anonymity and identification by all parties. Peers use blacklists of IP ranges in order to detect organizations thought to run surveillance operations. Clearly it would have been of some benefit for those performing surveillance to have access to communication systems that hide their network attachment points. On the other hand, they do try to make it harder to link IP ranges to real-world identities by locating machines in, and routing from and to, parts of the unassigned IP space. As a result WHOIS lookups do not yield any information about the real-world entity behind the surveillance operations.

I have been reading some papers suggested by Janne Lindqvist on the subject of device identification, an often neglected aspect of traffic analysis.

Remote Physical Device Fingerprinting by Kohno et al. introduced the clock skew technique for uniquely identifying networked devices that support the TCP timestamp extensions. The key idea is that the skew of each device's clock is unique, and observing the time drift over a long period allows an adversary to identify it. The technique has been extended by Steven Murdoch in his paper Hot or Not: Revealing Hidden Services by their Clock Skew, where he uses it to identify Tor hidden services.
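
The measurement boils down to fitting the drift of the remote clock against local time. Kohno et al. use a linear-programming fit; the least-squares sketch below (with an assumed 100 Hz TCP timestamp clock) only conveys the rough idea.

```python
import numpy as np

def estimate_clock_skew(local_times, remote_ticks, tick_hz=100):
    """Estimate a remote device's clock skew (in parts per million) from
    TCP timestamp options observed over a long period.

    local_times  -- receive times at the measurer, in seconds
    remote_ticks -- TCP timestamp values from the device (tick_hz ticks/second)
    """
    local = np.asarray(local_times, dtype=float)
    remote = np.asarray(remote_ticks, dtype=float) / tick_hz
    # Offset of the remote clock relative to ours, as seen at each packet.
    offset = (remote - remote[0]) - (local - local[0])
    # The skew is the drift of that offset per unit of local time (the slope).
    slope, _ = np.polyfit(local - local[0], offset, 1)
    return slope * 1e6   # parts per million

# Toy example: a device whose clock runs 50 ppm fast, sampled over an hour.
t = np.linspace(0, 3600, 200)
ticks = (t * (1 + 50e-6) * 100).astype(int)
print(round(estimate_clock_skew(t, ticks), 1), "ppm")
```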

For simpler devices that may not have a TCP/IP stack, identification can be done at an even lower level. In their work entitled Implications of Radio Fingerprinting on the Security of Sensor Networks, Rasmussen and Capkun discuss how the shape of the radio transmission signal can be used to uniquely identify many Chipcon 1000-based sensor nodes. For their identification they use features of the radio transient, which is the window of signal between silence and steady transmission at the start of a packet. By measuring its length, number of peaks, variance of amplitude, and the first wavelet coefficients they manage to identify the majority (70%) of nodes.
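
A hedged sketch of the kind of feature vector this suggests, computed over a captured transient window; the exact features and the wavelet family used in the paper differ, and the single Haar step below is merely a stand-in for their wavelet coefficients.

```python
import numpy as np

def transient_features(signal, num_wavelet_coeffs=4):
    """Crude feature vector for a radio transient: length, peak count,
    amplitude variance, and the first few Haar-style coefficients."""
    sig = np.asarray(signal, dtype=float)
    envelope = np.abs(sig)

    length = len(sig)
    # Peaks: samples strictly larger than both neighbours.
    peaks = int(np.sum((envelope[1:-1] > envelope[:-2]) &
                       (envelope[1:-1] > envelope[2:])))
    variance = float(np.var(envelope))

    # One level of a Haar transform as a stand-in for the wavelet features.
    even = sig[0:len(sig) // 2 * 2:2]
    odd = sig[1:len(sig) // 2 * 2:2]
    approx = (even + odd) / np.sqrt(2)
    coeffs = approx[:num_wavelet_coeffs].tolist()

    return [length, peaks, variance] + coeffs

# Toy usage: a noisy rising transient.
rng = np.random.default_rng(0)
transient = np.linspace(0, 1, 64) ** 2 + 0.05 * rng.standard_normal(64)
print(transient_features(transient))
```

Identification then amounts to matching such vectors against previously recorded ones.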

A weakness of the work is the suggestion to use these features to strengthen identification. While this might work in the short term (a similar technique was used to detect first-generation cloned mobile phones), in the long run a strategic adversary should have no trouble faking the transient signatures.

Finally, Corbett et al. present a method to detect the manufacturer of an IEEE 802.11 network device in A Passive Approach to Wireless NIC Identification. The key observation is that the standard supports multiple distinct rates of transmission (1, 2, 5.5 and 11 Mbps) and the switching algorithm between the different rates is not standardized. By inferring it, a good guess can be made as to the vendor of the NIC. Furthermore, no interception is required at the physical layer, since the information about transfer rates is transmitted in the clear. A similar approach is advocated in Passive Data Link Layer 802.11 Wireless Device Driver Fingerprinting by Franklin et al.

Just finished reading a new paper on Low Cost Traffic Analysis: 

Wiangsripanawan, R., Susilo, W., and Safavi-Naini, R. 2007. Design principles for low latency anonymous network systems secure against timing attacks. In Proceedings of the Fifth Australasian Symposium on ACSW Frontiers – Volume 68 (Ballarat, Australia, January 30 – February 02, 2007).

The authors look afresh at the Low Cost Traffic Analysis attack and how it applies to the Tarzan and MorphMix peer-to-peer anonymity systems. The key observation is that for the attack to apply three preconditions need to hold:

  1. A node’s load affects the latency of relayed traffic.
  2. The adversary knows the nodes participating in the protocols.
  3. The adversary must be able to establish a direct connection with all other nodes.

The paper argues that Tarzan’s mimic-based routing structure may invalidate precondition (3). MorphMix, on the other hand, makes it difficult for the adversary to know all nodes in the network (2). As a general rule they advise designers to make comprehensive node discovery difficult, a property that is also in line with the needs of censorship-resistance proposals.

Before getting overly excited about the security of those systems it is worth remembering the inherent unscalability of Tarzan, as well as the recent attacks against MorphMix.

For many years the canonical reference for research in anonymity and anonymous communications has been the Freehaven anonymity bibliography. Claudia Diaz and George Danezis have now written a short introduction to this research field called “A survey of anonymous communication channels”.

The survey includes a discussion of definitions and metrics for anonymity, and then describes systems in order of strength. The focus is on mixing mechanisms and onion routing, but remailers, as well as provable shuffles, are also explained.
