When traffic analysis leads to torture …
17 April 2009
The ACLU and the BBC have today posted the first memo, dated 1 August 2002, authorising the use of torture by the CIA against Abu Zubaydah, described as “one of the highest ranking members of Al Qaeda”. Interestingly, one of the enablers for passing into an “increased pressure phase” (you have to love these euphemisms) comes down to traffic analysis, as this passage suggests:

According to the document “intelligence indicates that there is currently a level of ‘chatter’ equal to that which preceded the September 11 attacks”. It is not comforting at all to know that such automatic processing, as well as subjective interpretation, can be used to start torturing people, in the absence of any other concrete evidence.
Update: Steven Murdoch points to the Washington Post article clarifying that the role of Abu Zubaida was nowhere near as important as initially assumed. The article states that “Abu Zubaida was not even an official member of al-Qaeda”. Worth reading in its entirety.
Mass political surveillance in the UK is alive and well!
19 March 2009
There is a tendency amongst privacy advocates in the UK to focus on mistakes, or false positives, of ubiquitous surveillance, as well as small scale “disproportionate” uses of surveillance. These two are the key arguments used to fend off plans to increase the level of data collection.
In the first case the argument is that perfectly honest people might be mistaken for crooks because of the imperfect view that any data collection system provides the authorities. Any automated decisions, the argument goes, will inevitably flag up innocent people while missing the sought targets, since the latter will be using an array of evasion tactics to foil them. In its essence, this first criticism is true, but it can easily be countered by proposing a good oversight mechanism, including human judgement in the loop, as well as by pointing out that the bad guys will never have perfect discipline in implementing counter-surveillance measures, and if they do it will be at a great cost. Needless to say the false positive / false negative argument has not been very successful, even though it is a good one.
The second argument is based on proportionality: once surveillance powers are in place for one purpose, such as the prevention of serious crime or terrorism, they will inevitably be used for other unforeseen and disproportionate aims. The key recent example is how local UK authorities are using directed surveillance powers to prevent littering and dog fouling. Similar fears have been expressed about traffic data retention, which could be used as part of civil cases, or simply seized for any crime whatsoever using established evidence collection laws. Again, this argument is valid, but a good oversight mechanism can take care of those cases, at least in theory.
The reason these arguments are the first to be used, as well as ineffective, is that they start from the premise that institutionally those performing the surveillance are “the good guys”, and their aim is to catch “the bad guys” to protect the public. Sure, in the process mistakes happen, but they are made in good faith and are rectified, since all the good people are on the same side after all. “Bad apples” misusing their surveillance powers will be weeded out, since institutionally the context in which they use these powers is benevolent, and devoid of malice. One can easily see why privacy advocates in the UK have found it easy to use this assumption, since they mostly lobby politicians and have a close relationship with law enforcement as well as industry, who, while admitting isolated mistakes, will never admit a systematic privacy problem, let alone systematic malicious use of surveillance powers.
The tide is turning on this argument. In recent months we have witnessed direct interference with the elected political process by the police, namely the raid on the Parliament office of MP Damian Green. As The Register reports “Green’s homes and offices were searched on 27 November following his arrest, on suspicion of leaking embarrassing information from the Home Office.” The information was simply politically embarrassing, not sensitive or national security related. It seems this incident has challenged, in the mainstream, the assumption that those in charge of surveillance will simply act in the public interest, and other cases of mass political surveillance have since seen the light:
- First, a company named The Consulting Association was found to keep an extensive database about construction workers, listing their trade union activity, past disputes with employers, and other sensitive personal information. It was providing a vetting service to the building industry to ensure that those active in the labour movement basically do not get jobs.
- Secondly, a Guardian investigation uncovered that the Metropolitan Police keeps a database of people attending protests, despite their never having been in trouble with the law, and specifically targets journalists covering protests. (The video is highly recommended.)
These are no longer isolated abuses, but systematic operations running for many years, and supported at the highest level of management of both organizations. In its editorial the Guardian put its finger on the key argument against surveillance powers by finally saying out loud: “today’s revelations underline the perils surveillance represent for democracy [...]”. These worries are now being echoed at the highest echelons of the political system, as The Register reports regarding the policing complaints at the recent Climate Camp:
“The problem with incidents of this kind, according to Norman Baker MP, who addressed the meeting on the Climate Camp protest yesterday is that they look suspiciously like police-made law and go hand in hand with the politicisation of the police. He said: “The IPCC exist to investigate allegations of individual misconduct by Police Officers. They are not there to investigate systemic abuses of power, which is what seem to be going on in cases such as the Climate Camp.”
“I am a strong supporter of the Police. But there looks increasingly to be a need for additional oversight into the ways in which they interpret the law.”
The “Traffic Protection” session at NDSS 2009
10 February 2009
It is quite interesting that this year’s NDSS has a special session on “Traffic Protection”. It contains two papers, one about attack (or stepping stone detection) and one on defense (or traffic analysis resistance).
The first paper from Amir Houmansadr, Negar Kiyavash and Nikita Borisov proposes an active watermarking scheme for network flows, based on spread spectrum techniques, called RAINBOW. It seems like solid work, particularly when it comes to detectability. The authors use a statistical test to determine the covertness of the scheme, which might not actually be optimal for detection. I foresee that covertness would be the property to look at in order to break the scheme or improve on it. The full reference is:
The second paper (being presented as I write) is about Traffic Morphing, i.e. how to make encrypted traffic meta-data look like traffic of another class. Unlike anonymity solutions, the aim is not to make all traffic look the same, but instead to fool a classifier. This is an interesting approach, but it may open up an arms race between traffic analysis resistance solutions and those who build better and better classifiers. The full reference is:
- Traffic Morphing: An Efficient Defense Against Statistical Traffic Analysis. Charles Wright, MIT Lincoln Laboratory; Scott Coull, Johns Hopkins University; Fabian Monrose, University of North Carolina, NDSS, February 2009.
(No pdf is yet available for the second work.)
In real time: Coordinated Scan Detection
10 February 2009
I am currently at NDSS 2009, to present our recent work with Prateek Mittal on SybilInfer [pdf], an inference engine to detect sybil attacks in social networks. Interestingly Carrie Gates is also presenting (right now) a traffic analysis paper on detecting coordinated scans. It would be greatly improved if cast in an inference framework but the techniques and assumptions are still quite interesting.
Coordinated Scan Detection
Carrie Gates, CA Labs
Coordinated attacks distribute the tasks involved in an attack amongst multiple sources. We present a detection algorithm that is based on an adversary model of desired information gain and employs heuristics similar to those for solving the set covering problem. A detector is developed and tested against coordinated horizontal and strobe scanning activity. Experimental results demonstrate an acceptably low false positive rate, and we discuss the conditions required to maximize the detection rate.
Strangely I cannot find a copy of it on-line…
Lords recommend PETs
6 February 2009
The House of Lords Constitution Committee has just published a report on Surveillance: Citizens and the State as well as the evidence they heard. As part of their recommendations they push for Privacy Enhancing Technologies to be part of the procurement process of government projects. In particular they say:
“485. We recommend that the Government review their procurement processes so as to incorporate design solutions that include privacy-enhancing technologies in new or planned data gathering and processing systems. (paragraph 349)”
They also push, albeit in an indirect way, for privacy enhanced identification schemes and ID cards, citing the example of Austria. This is basically a recommendation to implement selective disclosure credential technologies:
“478. We recommend that the Government’s development of identification systems should give priority to citizen-oriented considerations. (paragraph 268)”
Which refers to:
“268. The Information Commissioner’s Office (ICO) drew attention to the use in Austria of a system of identification numbers that allows access to information in different databases “without the need for a single widely known personal identification number that may be misused.” (p 5) The Royal Academy of Engineering (RAE) explained that it is possible for individuals to fulfil their legitimate need or desire to maintain multiple roles or identities in transactions with state or other organisations and to avoid the possibility of those organisations needlessly correlating them. The technology involved in identification can be developed to suit an individual’s preference to keep domestic status and work life separate, where the protection of identity is necessary to avoid abusive relationships or stalking, or where witnesses and children need protection. We recommend that the Government’s development of identification systems should give priority to citizen-oriented considerations.”
This is all good news! It is indeed at the procurement phase that such requirements for PETs should be specified and entrenched in the delivery contracts. Negotiating PETs for complex surveillance technologies will also make the cost of recording data just-in-case visible.
I am currently at ACM CCS 2008 listening to the talk on “Dependent link padding algorithms for low latency anonymity systems” by Wei Wang, Mehul Motani and Vikram Srinivasan (the pdf does not seem to be on-line yet). They propose a scheme to provably defeat all packet matching attacks against low-latency anonymity systems, by introducing the minimal amount of cover traffic. The results are theoretically well founded, and of great practical importance since they show how one could provide strong anonymity without “constant rate” padding (as is often assumed necessary).
Shishir Nagaraja has posted on his research web-site his latest work on “The Economics of Covert Community Detection and Hiding”. This extends the line of research Bettina Wittneben and I started with our paper “The Economics of Mass Surveillance and the Questionable Value of Anonymous Communications”, where we showed that anonymous communications by themselves do not prevent target selection. Shishir’s work shows that simple covertness strategies can instead make the job of the surveillance analyst much harder.
The full abstract reads:
“We present a model of surveillance based on the detection of community structure in social networks. We examine the extent of network topology information an adversary is required to gather in order to obtain high quality intelligence about community membership. We show that selective surveillance strategies can improve the adversary’s resource efficiency. However, the use of counter-surveillance defence strategies can significantly reduce the adversary’s capability. We analyze two adversary models drawn from contemporary computer security literature, and explore the dynamics of community detection and hiding in these settings. Our results show that in the absence of counter-surveillance moves, placing a mere 8% of the network under surveillance can uncover the community membership of as much as 50% of the network. Uncovering all community information with targeted selection requires half the surveillance budget where parties use anonymous channels to communicate. Finally, the most determined covert community can escape detection by adopting decentralized counter-surveillance techniques even while facing an adversary with full topology knowledge – by investing in a small counter-surveillance budget, a rebel group can induce a steep increase in the false negative ratio.”
Sneak preview: “Entropy Bounds for Traffic Confirmation”
13 October 2008
Luke O’Connor has uploaded to the cryptology eprint archive a manuscript providing analytical bounds for the Hitting Set Attack. The paper, entitled “Entropy Bounds for Traffic Confirmation” [PDF], demonstrates that after O(m log N) messages from Alice (where N is the number of all receivers and m the number of friends of Alice) the failure probability of the hitting set attack becomes negligible.
Highly recommended reading!
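For readers unfamiliar with the attack, here is a rough toy sketch of a hitting set attack in Python. It is not taken from the paper: the batch model (uniformly random cover traffic), the parameter values and all function names are assumptions made purely for illustration. The adversary records the set of receivers seen in each round in which Alice sends, and looks for the sets of m receivers that intersect every recorded round; the attack succeeds once only one such set remains.

```python
import itertools
import random

def observe_round(rng, friends, n_receivers, batch_size):
    """Receivers seen in one round in which Alice sends (toy model)."""
    receivers = {rng.choice(friends)}           # Alice writes to one of her friends
    for _ in range(batch_size - 1):             # assumed uniform cover traffic
        receivers.add(rng.randrange(n_receivers))
    return receivers

def hitting_sets(observations, n_receivers, m):
    """All size-m receiver sets that intersect every observed round."""
    return [set(c) for c in itertools.combinations(range(n_receivers), m)
            if all(obs.intersection(c) for obs in observations)]

def rounds_until_unique(n_receivers=20, m=2, batch_size=4, seed=1, max_rounds=1000):
    """Observe rounds until exactly one candidate hitting set remains."""
    rng = random.Random(seed)
    friends = rng.sample(range(n_receivers), m)
    observations = []
    for t in range(1, max_rounds + 1):
        observations.append(observe_round(rng, friends, n_receivers, batch_size))
        candidates = hitting_sets(observations, n_receivers, m)
        if len(candidates) == 1:
            return t, candidates[0], set(friends)
    return None, None, set(friends)

if __name__ == "__main__":
    rounds, found, friends = rounds_until_unique()
    print(rounds, sorted(found), sorted(friends))
```

The brute-force enumeration is only workable for tiny N and m; it is here to show what "failure" means (more than one candidate set surviving), not to reproduce the bounds in the manuscript.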
Where is European computer security research?
12 October 2008
There is something very broken with computer security research in Europe. While EU funding is pouring in for many years through successive FPs, it seems that European research groups and institutions are systematically underrepresented in terms of Program Committee participation to the top-tier conferences. (Individual researchers of European origin, based abroad, are actually doing quite fine.)
The following graph illustrates the fraction of European researchers in some top-tier computer security conferences over the past decade. Core security conferences are chosen, namely IEEE S&P, ACM CCS, ISOC NDSS and USENIX SEC, as compared with more crypto conferences like CRYPTO, or EUROCRYPT (where European research seems to be quite competitive.) As we can see on average this fraction is less than 20%, with some venues like USENIX SEC and NDSS often figuring next to no European researcher on their PC.
Even this graph says only half the story. Within Europe there is a tremendous variability in PC membership of these conferences, with few individual researchers from specific groups being invited repeatedly. One example is illustrative: the IEEE S&P committee for 2009 is composed of 48 members; 8 of them from Europe; 4 of them from Cambridge; 3 of them from Microsoft Research, Cambridge (a US company, by the way.)
What is going on? Systematic bias in the chair’s selection (unlikely), or a structural problem in the European security research field (much more likely)?
The base rate fallacy and the traffic analysis of Tor
30 September 2008
An anonymous contributor, The23rd Raccoon, sent a few days ago a very insightful post to the or-dev list entitled “How I Learned to Stop Ph34ring NSA and Love the Base Rate Fallacy”. The key point is that tracing anonymous communications is an identification exercise: the adversary has to detect the single correct target amongst the noise of incorrect identities. Therefore simply reporting false positive and false negative rates is misleading, since even a moderate false positive rate will lead to the vast majority of positives being misclassifications.
This is a very cool observation and leads, amongst other things, to the conclusion that low-latency anonymity is not dead:
“[...] Second, it gives us a small glimmer of hope that maybe all is not lost against IX, National, or ISP level adversaries. Especially those with only sampled or connection-level resolution, where fine-grained details on inter-packet timings is not available (as will likely be the case with EU data retention).”
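The base rate argument can be made concrete with a few lines of Python applying Bayes’ rule. The numbers below (one correct stream amongst 10,000 candidates, a detector with a 99% true positive rate and a 1% false positive rate) are hypothetical and chosen only for illustration; they do not come from the post.

```python
def posterior_true_match(base_rate, true_positive_rate, false_positive_rate):
    """P(pair really matches | detector reports a match), by Bayes' rule."""
    hit = true_positive_rate * base_rate
    false_alarm = false_positive_rate * (1.0 - base_rate)
    return hit / (hit + false_alarm)

# Hypothetical setting: exactly one correct stream amongst 10,000 candidates.
n_candidates = 10_000
p = posterior_true_match(1.0 / n_candidates, 0.99, 0.01)
print(f"Fraction of reported matches that are correct: {p:.4f}")
```

Under these assumed figures fewer than one in a hundred reported matches is a real one, even though the detector looks excellent when judged by its error rates alone.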
This post is of great interest because it re-opens the problem of high-precision traffic analysis, with a clearly understood and precisely known error rate. Current techniques, mostly based on heuristics, are not capable of delivering such detectors, with high reliability guarantees attached to them.
At the same time The23rd Raccoon’s analysis overlooks one issue: many detectors do not simply perform matching of streams on a pair by pair basis, but as a whole. This means that the “best match” is selected according to some ranking metric. The mathematical analysis that then has to be presented to support the technique is the probability that any non-match is selected over the correct match. Many papers in the literature implicitly provide such results (some based on the rank of the result, others based on selecting the best match and providing the total probability of error based on the selection). Approaches that provide overall performance metrics should avoid the pitfalls of presenting “intermediate” false positive / false negative probabilities, and give an intuitive understanding of how well traffic analysis techniques work on a full body of streams or message traces.
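To illustrate what such an overall metric might look like, the following Monte Carlo sketch estimates the probability that some non-matching stream outranks the correct one. The score model is entirely an assumption for the sake of the example: the correct stream’s score is drawn from a unit-variance Gaussian shifted by `separation`, and every decoy’s score from a standard Gaussian. No real detector is being modelled.

```python
import random

def best_match_error_rate(n_candidates, separation, trials=2000, seed=0):
    """Estimate P(some non-match scores at least as high as the true match)."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        true_score = rng.gauss(separation, 1.0)       # assumed score of the true match
        best_decoy = max(rng.gauss(0.0, 1.0)          # assumed scores of the decoys
                         for _ in range(n_candidates - 1))
        if best_decoy >= true_score:
            errors += 1
    return errors / trials

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, best_match_error_rate(n, separation=3.0))
```

The single number returned already accounts for the size of the candidate set, which is exactly what pairwise false positive and false negative rates leave out.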
