WPES10 in real time: Balancing the Shadows
4 October 2010
I am just sitting in the first WPES10 talk:
Balancing the Shadows by Max Schuchard, Alex Dean, Victor Heorhiadi, Yongdae Kim, and Nicholas Hopper (University of Minnesota)
ShadowWalker is a peer-to-peer anonymity system designed by Prateek Mittal (who was our intern in 2008) and Nikita Borisov to prevent corrupt peers jeopardising the network. The authors of this new paper “Balancing the shadows” present an attack on the system, where a malicious coalition of nodes can compromise routing security and can bias the probability of choosing a malicious node as a relay. It turns out that a naïve fix opens the system instead to selective denial-of-service attack.
How does the eclipse attack on ShadowWalker work? The adversary controls a full neighbourhood of the network, i.e. a sequence of peers in the distributed hash table (DHT). This allows an adversary to corrupt the “shadow” mechanism in shadow walker. When Alice asks a malicious node in this neighbourhood about another node in the network, they can provide a false ID, along with a set of false shadows. This attack is not too bad on its own, except that the same mechanism is used during the construction of the routing tables of the DHT. As a result an adversary that controls about 10% of the nodes can corrupt about 90% of the circuits, after a few rounds of the protocol (this was backed by simulations).
How to fix the attack? Can we increase the number of shadows of each node that can testify of the correctness of its ID? It turns out this is not a good idea: the more shadows the higher the probability one of them is malicious. In that case they can maliciously refuse to attest honest nodes, effectively taking them out of the protocol. The authors propose to change the protocol to only require a fraction of shadows providing signatures to attest an ID-node relationship — time will show if this withstands attacks.
What do we learn from this: first the level of security in peer-to-peer anonymity systems is still questionable, as designs keep being proposed and broken on a yearly basis. Second, it highlights that DHT based designs inherit the characteristic that routing tables are designed as part of the protocol. This offers the adversary an opportunity of amplify their attacks. Designs should therefore not consider that the DHT is in an honest steady-state, but instead consider attacks at the time of network formation. Finally, it is worth keeping in mind that these systems try to prevent adversaries using a small fraction of malicious nodes (5%-20%) to compromise the security of a large fraction of the network. This is still far from our hope that peer-to-peer anonymity could withstand large Sybil attacks where the adversary controls a multiple of honest nodes.
Police fishing expeditions in Greece
12 May 2010
An article in the Greek newspaper “Eleftherotypia”on 11 May 2011, covers a worrying trend in surveillance practices of the Greek anti-terrorist squad since 2007. In multiple occasions the police has “lifted” the confidentiality of communications legal provisions, and has requested information about communications taking place within a whole region, for a window of time up to 12 hours. For example:
- On 31-12-2008 they requested data for 3 hours (4am-7am) for a region of Athens, covering the polytechnic school (that is also covered by special privileges / asylum when it comes to freedom of speech) following an attack on a riot police van.
- On 9-3-2009 a large region in Koukaki was targeted for 12 hours. (Map of two first regions: http://s.enet.gr/resources/2010-05/21-thumb-large.jpg)
- Traffic data were collected for whole regions on 12-5-2009, 9-3-2009 (for 90 minutes), 25-11-2009, 18-2-2009 and 7-5-2007.
The article mentions telecommunications (voice and sms) in the first two cases (that might include content), while only mentioning traffic data for the last cases. Furthermore it points out that the selection of time, regions and targets and processing of the information collected happens in an unaccountable manner by officers. The blanket lifting of confidentiality is done under provisions for “state security”, but the article further points out, has now become routine. These practices are also linked with the Data Retention Directive (2006), that has not yet been translated into Greek law, making the legal context for surveillance requests and providers uncertain.
(Original in Greek: “Είμαστε όλοι ύποπτοι…” by ΧΡΗΣΤΟΣ ΖΕΡΒΑΣ)
http://www.enet.gr/?i=issue.el.home&date=11/05/2010&id=160930
This comes as a surprise to me, since I always through that the criteria applied for conducting surveillance have to be tied to a network endpoint, or at least a person’s identity.
Micah Sherr presented at PETS a few days ago his work on ”Scalable Link-Based Relay Selection for Anonymous Routing“. The key idea is that paths are generated by taking into account the network performance of each link to be used. The overhead of distributing performance information can be reduced by associating with each server a network coordinate, that allows to estimate the latency between pairs of nodes.
This is a pure path selection proposal, as quite a few have appeared in the past year to reduce latency, or increase node utilization in Tor. The question with all those proposals is: how much anonymity would these path selection strategies provide?
The methodology we present in “The Bayesian Traffic Analysis of Mix Networks” provides a way of answering such questions, by carefully modelling the path selection strategy. Applying the same methodology to these path selection proposals would be of clear benefit, and an excellent project for anyone interested in understanding better how to apply inference based techniques to traffic analysis.
Today morning at PETS 2009, the paper on “Physical Layer Attacks on Unlinkability in Wireless LANs” was presented. The idea is that despite all anonymization techniques at the logical layers, though IP address modulation, the physical location of the IEEE802.11 transmitters can be localised, and thus unlinkable packets emanating from it linked together.
The approach used to do this, uses signal strength and triangulation techniques, with a machine learning twist, to cluster together emissions and link them to the same transmitter. A set of countermeasures is also presented, where transmitters modulate their signal strength to foil this clustering.
The attacker model was restricted to using commodity hardware, so physical device fingerprinting attacks were not considered.
Bayesian traffic analysis
5 August 2009
In the last year, we have been developping a set of systematic techniques to analyse anonymity systems, to perform traffic analysis. These cast the problem of traffic analysis as a Bayesian inference problem, where the adversay observes some traces, according to a threat model, and then has to infer the hidden state of the system, that is equivalent to tracing who is talking to whom.
So far we have looked at the analysis of mix networks, the analysis of Crowds, and a Bayesian approach to long term intersection attacks. The papers describing each of these are available online:
- Carmela Troncoso and George Danezis.
The Bayesian Traffic Analysis of Mix Networks. (Draft)
ACM CCS 2009, Chicago, USA. - George Danezis, Claudia Diaz, Emilia Kasper, and Carmela Troncoso.
The wisdom of Crowds: attacks and optimal constructions.
ESORICS 2009, St Malo, France. - George Danezis and Carmela Troncoso.
Vida: How to use Bayesian inference to de-anonymize persistent communications.
PETS 2009, Seattle, USA.
A traffic analysis exercise
3 August 2009
In the 1950′s Lambros D. Callimahos built the Zendian Problem [zip] as an integrated exercise in traffic analysis, cryptanalysis, and communications intelligence operations. The state of cryptology today has moved on, beyond the point where an analyst can rely on plaintext to drive operations. The state of traffic analysis techniques, and the availability of more computing power, requires a new generation of exercises to sharpen the tools and minds of those in the field.
Steven Murdoch and myself have been developing, on (and mostly) off, over the past year an exercise in traffic analysis, and in particular the long term disclosure attacks. The exercise was first presented and used at the Brno FIDIS summer school, and we are now using it as part of an industrial training curriculum.
The exercise consists of an anonymized trace of communications, that were mediated by an anonymity system, that a group of people used to message each other. The message traces are synthetic, but generated based on a real-world social network. Users have favorite communication partners, talk more or less according to the type of relationship and time of day, and may reply to each others messages.
The goal is to apply any disclosure attacks, and de-anonymize and trace as many messages as possible. An oracle is provided that outputs the success rate, and the instructor’s pack includes the original messages as well as the scripts used to simulate the messaging behaviours and the anonymization layer. We tried to keep the exercise and success rates realistic, so do not expect to ever get 100% — significantly better than random is already quite good.
The richness of the messaging behaviour is designed to stress the most advanced statistical disclosure techniques, that make use of social network, replies, and perfect matchings. The literature on statistical disclosure can be found on the exercise page, and an example implementing the simple SDA is provided in the bundle. The family of Disclosure Attacks (devised by Kesdoganet al.) might also be modified and applied to the exercise. Our new attack soon to be presented at PET, using Bayesian Inference, could also be applied:
- George Danezis and Carmela Troncoso. Vida: How to use Bayesian inference to de-anonymize persistent communications. Privacy Enhancing Technologies Symposium (PETS 2009), Seattle, USA.
A couple of caveats: this is an exercise, to help people learn about long term traffic analysis attacks, and allow them to implement the attacks on a rich, but safe, dataset. The objective is to learn.
- It is not a benchmarking tool between attacks. We are not sure that the traffic patterns are typical enough to make sure that when an attack performs better in the setting of our exercise it would perform better on real data.
- It is also not a competition or test. We publish all of the hidden state for instructors, and the random number generator used was not cryptographically strong. The point is not who can get the highest score, but the quality of understanding of the attacks.
Caveats aside, I do hope that the exercise opens a discussion about how we can exchange training or live datasets, formats for evaluating traffic analysis attacks, and a level of standardisation to the interfaces of attack scripts. These will probably be topics for debate over the Privacy Enhancing Technologies Symposium 2009, next week.
Both myself and Steven would be very interested to hear your experiences with the exercise, either if you take it yourself, or give it to a class as an instructor. If you extend the exercise, or generate particular bundles of anonymized datasets, we would also be happy to host them.
When traffic analysis leads to torture …
17 April 2009
The ACLU and the BBC have today posted the first memo, dated 1 August 2002, authorising the use of torture by the CIA against Abu Zubaydah, described as “one of the highest ranking members ofAl Qaeda”. Interestingly one of the enablers for passing into an “increased pressure phase” (you have to love these euphemisms) comes down to traffic analysis, as this passage suggests:

According to the document “intelligence indicates that there is currently a level of `chatter’ equal to that which preceded the September 11 attacks”. It is not comforting at all to know that such automatic processing, as well as subjective interpretation, can be used to start torturing people, in the absence of any other concrete evidence.
Update: Steven Murdoch points to the Washington Post article clarifying the role of the Abu Zubaida as being nowhere near as important as initially assumed. The article states that “Abu Zubaida was not even an official member of al-Qaeda”. Worth reading in its entirety.
The “Traffic Protection” session at NDSS 2009
10 February 2009
It is quite interesting that this year’s NDSS, has a special session on “Traffic Protection”. It contains two papers, one about attack (or stepping stone detection) and one on defense (or traffic analysis resistance).
The first paper from Anmir Houmansadr, Negar Kiyavash and Nikita Borisov proposes an active watermarking scheme for network flows, based on spread spectrum techniques, called RAINBOW.It seems like solid work, particularly when it comes to detectability. The authors use a statistical test to determine the covertness of the scheme, that might actually not be optimal for detection. I foresee that covertness would be the property to look at in order to break the scheme or improve on it. The full reference is:
The second paper (presented as a write) is about Traffic Morphing, i.e. how to make encrypted traffic meta-data look like traffic of another class. Unlike anonymity solutions the aim is not to make all traffic look the same, but instead to fool a classifier. This is an interesting approach, but may open up an arms a race between traffic analysis resistance solutions, and those who build better and better classifiers. The full reference is:
- Traffic Morphing: An Efficient Defense Against Statistical Traffic Analysis. Charles Wright, MIT Lincoln Laboratory; Scott Coull, Johns HopkinsUniversity; Fabian Monrose, University of North Carolina, NDSS, February 2009.
(No pdf is yet available for the second work.)
In real time: Coordinated Scan Detection
10 February 2009
I am currently at NDSS 2009, to present our recent work with Prateek Mittal on SybilInfer [pdf], an inference engine to detect sybil attacks in social networks. Interestingly Carrie Gates is also presenting (right now) a traffic analysis paper on detecting coordinated scans. It would be greatly improved if cast in an inference framework but the techniques and assumptions are still quite interesting.
Coordinated Scan Detection
Carrie Gates, CA Labs
Coordinated attacks distribute the tasks involved in an attack amongst multiple sources. We present a detection algorithm that is based on an adversary model of desired information gain and employs heuristics similar to those for solving the set covering problem. A detector is developed and tested against coordinated horizontal and strobe scanning activity. Experimental results demonstrate an acceptably low false positive rate, and we discuss the conditions required to maximize the detection rate.
Strangely I cannot find a copy of it on-line…
I am currently at ACM CCS 2008 listening to the talk on “Dependent link padding algorithms for low latency anonymity systems” by Wei Wang, Mehul Motani and Vikram Srinivasan (the pdf does not seem to be on-line yet). They propose a scheme to provably defeat all packet matching attacks against low-latency anonymity systems, by introducing the minimal amount of cover traffic. The results are theoretically well founded, and of great practical importance since they show how one could provide strong anonymity without “constant rate” padding (as it is often assumed necessary.)