Back in 2009 we had a close look at the surveillance commissioners' reports and the implementation of RIPA Part III, which makes failure to decrypt material an offence. Today the BBC is reporting that Oliver Drage, 19, of Liverpool has been convicted for refusing to give police the password to his computer. He is looking at spending 16 weeks in jail, merely for not handing over an encryption key.

BBC journalists, in their usual “impartial” style, are quick to report the offence under which Mr Drage was arrested, but of course never convicted. I will not repeat it here as it might constitute libel, since the accusation was never in fact shown to be true, and it is not even clear whether the basis of the original suspicion played any role in the conviction.

The BBC also relays verbatim Det Sgt Neil Fowler, of Lancashire police, as saying: “Drage was previously of good character so the immediate custodial sentence handed down by the judge in this case shows just how seriously the courts take this kind of offence. […] It sends a robust message out to those intent on trying to mask their online criminal activities that they will be taken before the courts with the ultimate sanction, as in this case, being a custodial sentence.”

Of course, what the BBC’s impartial style fails to comment on is that Mr Drage was in fact never shown to be participating in any online criminal activity, aside from the activity of not revealing his key to the police. At best this sends a robust message that innocent people mindful of their privacy in relation to the state will end up in jail, and at worst it signals to every serious criminal that if they do not reveal their keys they will get off with a light sentence. The police have powers to obtain warrants to enter premises covertly and install surveillance equipment to retrieve keys, but instead they chose simply to ask the suspect to incriminate himself. This is poor policing, and will inevitably lead to travesties of justice.

This is just the beginning of RIPA Part III being used, and I will of course be monitoring its use against people with legitimate needs for privacy, such as political activists, journalists, lawyers, and whistleblowers. Watch this space.

Americans’ Attitudes About Internet Behavioral Advertising Practices
Aleecia M. McDonald and Lorrie Faith Cranor (Carnegie Mellon University)

This is a very interesting paper on people’s attitudes to behavioural advertising. The researchers used a mix of a small-scale study (14 people) and a larger statistical study (hundreds of people). A few findings are remarkable:

  • First, users apply their intuitions about off-line ads to the experience of on-line ads: many see on-line ads as a pure push mechanism and do not realise that data about them are collected. They do not seem to object in general to the idea of advertising, consider it a fact of life, and even see it as ‘ok’ as a way to support services.
  • The landscape of attitudes to behavioural advertising is fascinating. When faced with a hypothetical description of what behavioural advertising collects and how it functions, a large percentage of users said it was not possible, and some even claimed it would be illegal. When it comes to attitudes towards receiving ‘better’ ads, only 18% liked the idea for web-based services, and 4% for email-based services (like Hotmail and Gmail). In general the authors found that many extremely common practices cause “surprise”.
  • The researchers also looked at the wording of the NAI site, which offers an opt-out from behavioural advertising. They find that what the system does is unclear, even after reading the page where the operation is described.

In general people prefer random ads to personalised ads, with the exception of contextual ads (like books on on-line book stores). There is still a lot of ignorance about how technical systems work, and relying on education and self-help for people to protect their own privacy is clearly not working.

This research suggests that the presumed tolerance of users to privacy invasion is due to ignorance of common practices. Once those practices are revealed, the result is surprise and even a feeling of betrayal, which benefits neither companies nor customer confidence.

The potential for abuse is a key challenge when it comes to deploying anonymity systems, and the privacy technology community has been researching solutions to this problem for a long time. Nymble systems allow administrators to blacklist anonymous accounts, without revealing or even knowing their identity.

What is the model? A user registers an account with a service, such as Wikipedia. Then the user can use an anonymous channel, like Tor, to perform operations, such as editing encyclopedia articles. This prevents identification of the author, and also bypasses a number of national firewalls that prevent users from accessing the service (China, for example, blocks Wikipedia for some reason). If abuse is detected, the account can be blacklisted, but without revealing which one it was! The transcript of the abusive edit is sufficient to prevent any further edits, without tracing back the original account or network address of the user.
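To make this concrete, here is a minimal sketch of the linking-token idea (my own illustration: a SHA-256 hash chain stands in for the real Nymble constructions, and the trusted registration and nymble-manager steps are omitted entirely):

```python
import hashlib

def h(tag: bytes, data: bytes) -> bytes:
    return hashlib.sha256(tag + data).digest()

class User:
    """Holds a per-service seed; the service never sees it directly."""
    def __init__(self, seed: bytes):
        self.seed = seed
    def state_at(self, period: int) -> bytes:
        s = self.seed
        for _ in range(period):
            s = h(b"evolve", s)   # one-way seed evolution per time period
        return s
    def nymble(self, period: int) -> bytes:
        return h(b"present", self.state_at(period))  # token shown to the service

class Service:
    def __init__(self):
        self.blacklist = set()
    def blacklist_from(self, state: bytes, horizon: int):
        # Given the seed state from the abusive period (e.g. released by a
        # nymble manager from the transcript), precompute and block all
        # *future* tokens; past tokens stay unlinkable (one-way evolution).
        s = state
        for _ in range(horizon):
            self.blacklist.add(h(b"present", s))
            s = h(b"evolve", s)
    def accept(self, token: bytes) -> bool:
        return token not in self.blacklist

alice = User(seed=b"alice-secret-seed")
wiki = Service()
assert wiki.accept(alice.nymble(3))          # normal anonymous edit
wiki.blacklist_from(alice.state_at(5), 100)  # abuse detected in period 5
assert wiki.accept(alice.nymble(4))          # earlier periods stay unlinkable
assert not wiki.accept(alice.nymble(7))      # future edits are blocked
```

The one-way seed evolution is what gives the scheme its asymmetry: the service can recognise every future token of the abuser, but can neither link past sessions nor recover an identity.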

Nymble systems have had some limitations: they either required trusted third parties for registration, or they were slow. A new generation of Nymble systems, including Jack, is now addressing these limitations. They use modern cryptographic accumulator constructions, with proofs of non-membership verifiable in O(1) time, to prove that a hidden identity is not blacklisted. Jack can perform authentication in 200ms, and open a Nymble address in case of abuse in less than 30ms. This is getting really practical, and it is time Wikipedia started using such a system instead of blacklisting Tor nodes out of fear of abuse.
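To give a flavour of how accumulator non-membership proofs work, here is a toy RSA-accumulator sketch (my own parameters and simplifications, not Jack's actual construction; the point is that witness and verification stay constant-size however long the blacklist grows):

```python
import math

# Toy RSA-accumulator non-membership proof (illustrative only: a real
# system uses a large modulus of unknown factorisation and maps
# identities to primes; negative exponents with a modulus need Python 3.8+).

def extended_gcd(x, y):
    """Return (g, a, b) with a*x + b*y = g = gcd(x, y)."""
    if y == 0:
        return x, 1, 0
    g, a, b = extended_gcd(y, x % y)
    return g, b, a - (x // y) * b

N = 104729 * 1299709           # toy modulus from two known primes; NOT secure
base = 3                       # accumulator base, invertible mod N

blacklist = [101, 103, 107]    # primes standing in for blacklisted nyms
u = math.prod(blacklist)
acc = pow(base, u, N)          # single value accumulating the whole blacklist

def non_membership_witness(x):
    g, a, b = extended_gcd(x, u)      # a*x + b*u = 1 iff x is not accumulated
    assert g == 1, "x is blacklisted: no witness exists"
    return pow(base, a, N), b         # constant-size witness (w, b)

def verify(x, w, b):
    # w^x * acc^b = base^(a*x) * base^(u*b) = base^(a*x + u*b) = base  (mod N)
    return (pow(w, x, N) * pow(acc, b, N)) % N == base

w, b = non_membership_witness(109)    # 109 is not on the blacklist
assert verify(109, w, b)              # verification cost independent of list size
```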

Other Nymble systems: The original nymble | Newer Nymble | BLAC | Nymbler with VERBS | PEREA. Each of them offers a different trade-off of efficiency and security.

  • Using Social Networks to Harvest Email Addresses by Iasonas Polakis, Georgios Kontaxis, Eleni Gessiou, Thanasis Petsas, Evangelos P. Markatos and Spiros Antonatos (Institute of Computer Science, Foundation for Research and Technology Hellas)
  • Turning Privacy Leaks into Floods: Surreptitious Discovery of Social Network Friendships and Other Sensitive Binary Attribute Vectors by Arthur Asuncion and Michael Goodrich (University of California, Irvine) (not on-line yet).

The first work, by Polakis et al., looks at how easy it is to massively harvest email addresses using social networks and search engines, in order to use them as targets for spam. Furthermore, they attach contextual social and personal information to each email address to produce more convincing spam and phishing emails. For this they used different techniques on three target platforms.

On Facebook, they use the facility that allows users to find others by email to harvest personal information. This acts as an oracle mapping harvested email addresses to real-world names and demographic information. Once a Facebook profile is linked, a very convincing phishing email can be crafted, including an invitation to befriend the target. (About 30% of users would befriend a stranger in that manner, a result not reported in the paper.)

A second vector of information is the use of nicknames that are constant across different sites. They use Twitter to harvest (nickname, email) pairs and then use the Facebook email-to-name oracle to link them to real-world identities. Finally, the authors use a Google Buzz feature to extract emails: every Buzz user ID is also the user's Gmail address, which means that by searching Buzz for particular words you can harvest Gmail addresses along with personal information about their owners.
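The linking step is essentially a join through the oracle. A toy sketch follows (entirely my own illustration: the data sources are modelled as in-memory tables, since the real endpoints the paper used, such as Facebook's find-by-email feature and Buzz profiles, are not assumed to exist or behave this way today):

```python
# Toy illustration of the linking pipeline described above.
TWITTER_PAIRS = [                      # harvested (nickname, email) pairs
    ("jdoe84", "jdoe84@example.com"),
    ("kat_v", "kat.v@example.org"),
]
FACEBOOK_BY_EMAIL = {                  # the email-to-name "oracle"
    "jdoe84@example.com": ("John Doe", "Liverpool"),
}

def link_profiles(pairs, oracle):
    """Attach a real-world identity to each harvested address."""
    linked = []
    for nickname, email in pairs:
        identity = oracle.get(email)
        if identity:                   # oracle hit: the address is linked
            name, location = identity
            linked.append((email, nickname, name, location))
    return linked

# Each linked tuple carries enough context to craft a personalised
# phishing email addressed to the victim by name.
print(link_profiles(TWITTER_PAIRS, FACEBOOK_BY_EMAIL))
```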

But how effective are these email harvesting techniques? How do you even assess this? The authors check the harvested name and address for an exact match against the name extracted from Facebook. The first technique yields about 0.3% correct addresses, the second 7%, and the final one 40%, showing that the techniques are practical for linking emails to real names.

The second paper, by Asuncion et al., looks at how to aggregate information leaked by social networks to construct complete profiles of their users. The aim is to reconstruct the friendship network, as well as to recover attributes, even when privacy settings are in use.

The techniques assume you can use an oracle to issue group queries against the social network site, checking whether anyone in a chosen group has a particular attribute. The objective is then to find a querying scheme that minimises the number of queries. It turns out there is a body of work on combinatorial group testing, including adaptive variants, that is readily applicable to this problem. This is not unlike our work on prying data out of a social network. The structure of social network queries narrows the search further, allowing even fewer queries to extract attributes (logarithmic in the number of possible profiles, and linear in the number of profiles carrying the attribute).
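Here is a minimal sketch of the core adaptive group-testing idea (binary splitting), with the social network's bulk query modelled as a simple set-membership oracle; this illustrates the generic technique, not the paper's optimised scheme, and all the names are mine:

```python
# Minimal adaptive group testing (binary splitting). `oracle(group)`
# answers whether the group contains at least one profile with the
# target attribute.

def find_positives(items, oracle):
    """Recover every positive item using O(k log n) group queries."""
    positives, stack = [], [list(items)]
    while stack:
        group = stack.pop()
        if not oracle(group):
            continue                    # one query clears the whole group
        if len(group) == 1:
            positives.append(group[0])  # isolated a positive item
        else:
            mid = len(group) // 2
            stack.append(group[:mid])   # otherwise split and recurse
            stack.append(group[mid:])
    return positives

class CountingOracle:
    def __init__(self, secret_positives):
        self.secret, self.queries = set(secret_positives), 0
    def __call__(self, group):
        self.queries += 1
        return any(item in self.secret for item in group)

profiles = [f"profile{i:02d}" for i in range(64)]
oracle = CountingOracle({"profile05", "profile44"})  # hidden friendship bits
found = find_positives(profiles, oracle)
print(sorted(found), oracle.queries)  # both recovered in far fewer than 64 queries
```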

The attack is validated by applying it to extract friends on Facebook, DNA sequences in mitochondrial databases, and movie preferences in the Netflix database. These techniques are interesting because they are very general. At the same time, it is likely that faster ways exist to extract specific attributes of users in real-world social networks, since there are strong correlations between attributes and the social structure of users.

I am just sitting in the first WPES10 talk:

Balancing the Shadows by Max Schuchard, Alex Dean, Victor Heorhiadi, Yongdae Kim, and Nicholas Hopper (University of Minnesota)

ShadowWalker is a peer-to-peer anonymity system designed by Prateek Mittal (who was our intern in 2008) and Nikita Borisov to prevent corrupt peers from jeopardising the network. The authors of this new paper, “Balancing the shadows”, present an attack on the system in which a malicious coalition of nodes can compromise routing security and bias the probability of choosing a malicious node as a relay. It turns out that a naïve fix instead opens the system to a selective denial-of-service attack.

How does the eclipse attack on ShadowWalker work? The adversary controls a full neighbourhood of the network, i.e. a sequence of peers in the distributed hash table (DHT). This allows the adversary to corrupt the “shadow” mechanism in ShadowWalker: when Alice asks a malicious node in this neighbourhood about another node in the network, it can provide a false ID, along with a set of false shadows. This attack is not too bad on its own, except that the same mechanism is used during the construction of the routing tables of the DHT. As a result, an adversary that controls about 10% of the nodes can corrupt about 90% of the circuits after a few rounds of the protocol (this was backed by simulations).
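To see why controlling a full neighbourhood is a realistic starting point, here is a rough back-of-envelope simulation (entirely my own assumptions, not the paper's simulations: random ID placement, 1000 nodes, 10% malicious, neighbourhood size 3):

```python
import random

# How often does an adversary owning a fraction f of randomly placed
# node IDs fully control at least one neighbourhood of k consecutive
# peers on the DHT ring?

def controls_a_neighbourhood(n, f, k):
    ring = [random.random() < f for _ in range(n)]       # True = malicious
    return any(all(ring[(i + j) % n] for j in range(k))  # k in a row?
               for i in range(n))

n, f, k, trials = 1000, 0.10, 3, 2000
hits = sum(controls_a_neighbourhood(n, f, k) for _ in range(trials))
print(f"P(some full neighbourhood controlled) ≈ {hits / trials:.2f}")
# Non-negligible even with random placement; an adversary that can
# choose its IDs controls a neighbourhood with certainty.
```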

How to fix the attack? Can we increase the number of shadows of each node that can attest to the correctness of its ID? It turns out this is not a good idea: the more shadows, the higher the probability that one of them is malicious. Malicious shadows can then refuse to attest honest nodes, effectively taking them out of the protocol. The authors propose to change the protocol to require only a fraction of shadows to provide signatures attesting an ID-node relationship; time will tell if this withstands attack.
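A quick calculation (my own, with 10% malicious nodes for illustration) shows why simply adding shadows backfires when all of them must sign:

```python
# With a fraction f of malicious nodes, the chance that at least one of
# m shadows is malicious (and can censor an honest node by refusing to
# attest it) grows quickly with m.
f = 0.10
for m in (2, 4, 8, 16):
    p_censor = 1 - (1 - f) ** m
    print(f"m={m:2d}  P(at least one malicious shadow) = {p_censor:.2f}")
# m= 2: 0.19, m= 4: 0.34, m= 8: 0.57, m=16: 0.81
```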

What do we learn from this? First, the level of security in peer-to-peer anonymity systems is still questionable, as designs keep being proposed and broken on a yearly basis. Second, it highlights that in DHT-based designs the routing tables are built as part of the protocol itself, which offers the adversary an opportunity to amplify their attacks. Designs should therefore not assume that the DHT is in an honest steady state, but instead consider attacks at the time of network formation. Finally, it is worth keeping in mind that these systems try to prevent adversaries using a small fraction of malicious nodes (5%-20%) from compromising the security of a large fraction of the network. This is still far from our hope that peer-to-peer anonymity could withstand large Sybil attacks, where the adversary controls a multiple of the number of honest nodes.